pendulum-chain / spacewalk


Rewrite Task of listening Stellar Messages #545

Closed b-yap closed 1 month ago

b-yap commented 3 months ago

Summary

We need to stop the vaults from silently getting stuck, which is potentially caused by unhandled zombie tasks that keep running.

One idea is to include the task that polls the Stellar messages (https://github.com/pendulum-chain/spacewalk/blob/7c7989875e95e1cfe3b0aeba9bb6af01d9e33e58/clients/vault/src/oracle/agent.rs#L100-L104) in the monitoring (https://github.com/pendulum-chain/spacewalk/blob/7c7989875e95e1cfe3b0aeba9bb6af01d9e33e58/clients/vault/src/system.rs#L806-L817).

This means that setting the OracleAgent's message sender is delayed, so the OracleAgent passed to the tasks must be mutable; hence using **Arc<RwLock<>>** instead of Arc<> alone.
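A minimal sketch of that idea, using only the standard library. The `OracleAgent` struct below is a hypothetical stand-in, and its `message_sender` field, `send` method, and the `demo` function are assumptions made for illustration; the actual vault code is async and its types may differ.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for the vault's OracleAgent; field and method
// names are assumptions, not the real API.
struct OracleAgent {
    // Stays None until the polling task has connected to the overlay.
    message_sender: Option<Sender<String>>,
}

impl OracleAgent {
    // Forwards a message if the sender has been set, otherwise errors.
    fn send(&self, msg: &str) -> Result<(), String> {
        match &self.message_sender {
            Some(tx) => tx.send(msg.to_string()).map_err(|e| e.to_string()),
            None => Err("message sender not set yet".to_string()),
        }
    }
}

// Shares the agent with a task first and sets the sender afterwards.
fn demo() -> (bool, Option<String>) {
    let agent = Arc::new(RwLock::new(OracleAgent { message_sender: None }));
    // Handle given to a task before the sender exists.
    let for_task = Arc::clone(&agent);
    let failed_before = for_task.read().unwrap().send("early").is_err();

    // Later, the polling task takes the write lock and installs the sender.
    let (tx, rx) = channel();
    agent.write().unwrap().message_sender = Some(tx);

    // The task's handle now sees the updated agent.
    for_task.read().unwrap().send("hello").unwrap();
    (failed_before, rx.try_recv().ok())
}
```

With a plain Arc<> the agent could not be updated after it was shared; the RwLock lets the polling task take a short write lock while the other tasks only need read locks.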

But the current tasks cannot START TOGETHER WITH the polling task. The Stellar overlay must already be running, and all open requests MUST finish first. https://github.com/pendulum-chain/spacewalk/blob/7c7989875e95e1cfe3b0aeba9bb6af01d9e33e58/clients/vault/src/system.rs#L787-L790

An idea is to introduce another variant of the ServiceTask that waits for a prerequisite check to pass before its task starts.

enum ServiceTask {
    ...
    // Runs a task after a prerequisite check has passed.
    PrecheckRequired(Task),
}
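How such a variant might gate a task can be sketched as follows. This is only an illustration under stated assumptions: it uses OS threads and an atomic flag instead of the vault's async runtime, and the `Immediate` variant, the `Task` alias, and the `spawn` and `demo_order` helpers are hypothetical names, not part of spacewalk.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

// Hypothetical task type; the real Task in the vault is async.
type Task = Box<dyn FnOnce() + Send + 'static>;

enum ServiceTask {
    // Stand-in for the existing variants: starts right away.
    Immediate(Task),
    // Runs a task after a prerequisite check has passed.
    PrecheckRequired(Task),
}

// Spawns the task; a PrecheckRequired task waits until the flag is set.
fn spawn(task: ServiceTask, precheck_passed: Arc<AtomicBool>) -> thread::JoinHandle<()> {
    thread::spawn(move || match task {
        ServiceTask::Immediate(t) => t(),
        ServiceTask::PrecheckRequired(t) => {
            while !precheck_passed.load(Ordering::Acquire) {
                thread::sleep(Duration::from_millis(5));
            }
            t()
        }
    })
}

// Records the order in which the two tasks run.
fn demo_order() -> Vec<&'static str> {
    let log = Arc::new(Mutex::new(Vec::new()));
    let ready = Arc::new(AtomicBool::new(false));

    // Gated task is spawned first but must wait for the precheck.
    let l = Arc::clone(&log);
    let gated = spawn(
        ServiceTask::PrecheckRequired(Box::new(move || {
            l.lock().unwrap().push("gated task");
        })),
        Arc::clone(&ready),
    );

    // Polling stand-in: logs that the overlay is up, then sets the flag.
    let l = Arc::clone(&log);
    let r = Arc::clone(&ready);
    let poller = spawn(
        ServiceTask::Immediate(Box::new(move || {
            l.lock().unwrap().push("overlay ready");
            r.store(true, Ordering::Release);
        })),
        Arc::clone(&ready),
    );

    poller.join().unwrap();
    gated.join().unwrap();
    let result = log.lock().unwrap().clone();
    result
}
```

In this sketch the gated task always runs after the polling stand-in has signalled readiness, even though it was spawned first. In an async setting a notification primitive would likely replace the sleep loop.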

And to make sure the stellar-overlay and the client are communicating well, stellar-overlay will also send to client the:


How to start reviewing:

stellar-relay-lib

vault

I will add relevant comments when necessary.

b-yap commented 1 month ago

@ebma @gianfra-t CI passed. Merging this.