Open krynju opened 2 years ago
It seems like various failures when using Distributed will cause a cascade of failures in Dagger. We should try to wrap every call into Distributed APIs with a "rollback" function, which will attempt to recover from failure, or at least gracefully kill the scheduler and all hanging fetch/wait calls.
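A rough sketch of what such a wrapper could look like, just to make the idea concrete. `with_rollback`, `pending`, and `fail_pending` are hypothetical names for illustration and are not existing Dagger or Distributed APIs. The thrown `ProcessExitedException` stands in for a failing Distributed call, and closing a `Channel` with an exception stands in for waking up hanging waiters; the real scheduler state would look different.

```julia
using Distributed

# Hypothetical helper (not part of Dagger or Distributed):
# run `f`; if it throws, call `rollback(err)` and then rethrow the original error.
function with_rollback(f, rollback)
    try
        return f()
    catch err
        rollback(err)
        rethrow()
    end
end

# Assumption: each outstanding result is represented by a Channel
# that some task is blocked on.
pending = [Channel{Any}(1) for _ in 1:2]

# Rollback: close every pending channel with the error, so blocked
# `take!` calls throw instead of hanging forever.
fail_pending(err) = foreach(ch -> close(ch, err), pending)

# A task waiting on a result that will never arrive.
waiter = @async take!(pending[1])

caught = try
    with_rollback(fail_pending) do
        # Stand-in for a Distributed call (e.g. remotecall_fetch) hitting a dead worker.
        throw(ProcessExitedException(2))
    end
catch err
    err
end

# The waiter fails with the propagated error rather than blocking.
waiter_result = try
    wait(waiter)
catch err
    err
end
```

Here the caller still sees the original exception in `caught`, while the waiting task ends up failed instead of stuck. Whether rollback should attempt recovery or only tear things down is left open, as in the description above.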
Another one found when benchmarking, also pretty rare. For later, will probably be useful when looking at stability