decred / dcrpool

decred mining pool
ISC License

pool: Don't give chainstate entire process cancel. #378

Closed davecgh closed 1 year ago

davecgh commented 1 year ago

The pool is currently designed to shut down the entire process from deep down in the chainstate when errors it deems fatal occur. It accomplishes this by passing the cancel func for the entire process all the way down to the chainstate. Unfortunately, this is not a good design because goroutines deep in the guts of subsystems should not have the ability to directly pull the rug out from under the process without the upper layers having a chance to do anything about it.

Moreover, some of those errors could actually be temporary, such as a temporary loss of connection to the database or dcrd, yet they still result in the entire process shutting down.

This is a first pass at improving the situation slightly. It no longer passes the entire process's cancel func down to the chainstate handler. Instead, the chainstate handler returns an error back to the hub's Run method, which creates its own child context that is then canceled to cause the hub to shut down.

It also corrects a few other issues along the way such as:

For now, the same process shutdown behavior is maintained by having the main process shut itself down upon observing the hub shutdown, which ensures all other subsystems shut down properly too.

In other words, the overall behavior is the same, aside from the aforementioned fixes, but this change makes it feasible to further improve the behavior in the future so the pool won't just die due to what are likely temporary failures.