[ ] Telos Testnet - re-enable backend script once permissions are updated. It is now stopped so it doesn't spam errors.
Error handling on deferred transactions
Our scheduler replaces the old EOSIO deferred transactions.
When a deferred action fails, it needs to be removed from the scheduler's list. Otherwise the action never executes and is never removed, because the entire executenext action fails, and the scheduler ends up in an infinite loop trying to execute an action that has an error.
We accommodate timeout errors by retrying a few times.
We also catch "nothing to execute" errors.
For all other errors, we remove the action from the scheduler table, since it cannot be executed.
Ideally there won't be any invalid actions in the scheduler's list, but we hit this error when the contract tried to call close prop on staged proposals.
We never saw these errors in the past because they occurred in deferred transactions, which failed silently. The fix is to remove the scheduled close doc prop action for staged proposals, but the scheduler needs to deal with failing actions anyway.
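The error policy described above could be sketched as a small classifier. This is only an illustration, not the actual scheduler code: the function name, the attempt limit of 3, and the timeout substrings are assumptions, and only the "nothing to execute" text comes from this description. It also assumes that a timeout which keeps failing after the retries is treated like any other error.

```typescript
// What the scheduler does after a failed executenext call.
type Outcome = "retry" | "ignore" | "remove";

// Assumed substrings; the real node error text may differ.
const TIMEOUT_HINTS = ["deadline exceeded", "timeout", "timed out"];
const NOTHING_TO_EXECUTE = "nothing to execute";

// attempts: how many times this action has been tried so far, including this one.
function classifyError(message: string, attempts: number, maxAttempts = 3): Outcome {
  const text = message.toLowerCase();
  // An empty scheduler table is not a failure of any action.
  if (text.includes(NOTHING_TO_EXECUTE)) return "ignore";
  // Timeouts may be transient, so try again a few times.
  const isTimeout = TIMEOUT_HINTS.some((hint) => text.includes(hint));
  if (isTimeout && attempts < maxAttempts) return "retry";
  // Anything else cannot be executed and should leave the table.
  return "remove";
}
```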
This required changes in the contract, the permissions, and the scheduler script.
Contract changes
Added removedtx action
Permissions changes
Added scheduler permission to dao.hypha
Linked scheduler permission with removedtx action
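For reference, the two permission changes map onto the standard eosio updateauth and linkauth system actions. The sketch below only builds the action payloads. The parent permission ("active"), the signing permission ("owner"), the placeholder key, and the assumption that removedtx lives on the dao.hypha contract account are all illustrative and should be checked against the real setup.

```typescript
interface Action {
  account: string;
  name: string;
  authorization: { actor: string; permission: string }[];
  data: Record<string, unknown>;
}

// Builds the two system actions; nothing is signed or sent here.
function buildPermissionActions(
  contract: string,          // e.g. "dao.hypha"
  schedulerKey: string,      // public key for the new permission (placeholder)
  signerPermission = "owner" // assumption: changes are signed with the owner key
): Action[] {
  const authorization = [{ actor: contract, permission: signerPermission }];
  return [
    {
      // Create (or update) the "scheduler" permission on the account.
      account: "eosio",
      name: "updateauth",
      authorization,
      data: {
        account: contract,
        permission: "scheduler",
        parent: "active", // assumption
        auth: {
          threshold: 1,
          keys: [{ key: schedulerKey, weight: 1 }],
          accounts: [],
          waits: [],
        },
      },
    },
    {
      // Allow the "scheduler" permission to authorize removedtx.
      account: "eosio",
      name: "linkauth",
      authorization,
      data: {
        account: contract,
        code: contract, // assumption: removedtx is an action of this contract
        type: "removedtx",
        requirement: "scheduler",
      },
    },
  ];
}

const actions = buildPermissionActions("dao.hypha", "PUBLIC_KEY_PLACEHOLDER");
```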
Script changes
Catch errors in actions, and call removedtx in case a scheduled action fails.
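A rough sketch of this control flow follows. It is not the actual script: the dependency names (executeNext, removeFailed, isTimeout, isNothingToExecute) and the limit of 3 attempts are hypothetical, and the chain calls are injected as callbacks so that no removedtx parameters have to be guessed here.

```typescript
interface SchedulerDeps {
  executeNext: () => Promise<void>;  // pushes the executenext action
  removeFailed: () => Promise<void>; // pushes removedtx for the failing action
  isTimeout: (err: unknown) => boolean;
  isNothingToExecute: (err: unknown) => boolean;
}

type RunResult = "executed" | "idle" | "removed";

// Tries the next scheduled action once, retrying only on timeouts.
async function runOnce(deps: SchedulerDeps, maxAttempts = 3): Promise<RunResult> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await deps.executeNext();
      return "executed";
    } catch (err) {
      // Empty table: nothing failed, just wait for the next tick.
      if (deps.isNothingToExecute(err)) return "idle";
      // Timeouts get another attempt until the limit is reached.
      if (deps.isTimeout(err) && attempt < maxAttempts) continue;
      // Any other error (or a persistent timeout) ends the attempts.
      break;
    }
  }
  // The action cannot be executed, so drop it from the scheduler table.
  await deps.removeFailed();
  return "removed";
}
```

Keeping the removal in one place after the loop means a failing action is removed exactly once per run, however many attempts preceded it.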
Other changes in this PR: Some cleanups.
See also
https://github.com/hypha-dao/hypha-smart-contracts/pull/26
https://github.com/JoinSEEDS/hypha-accept-payments/pull/28
Deploy Status
Note: Backend deployed
Telos Testnet - waiting for keys from Gery - need owner key to add permission
Telos Mainnet - deferred transactions still work, we need to switch to new contract by msig: Deploy new contract and change permissions.