Closed MaxiBoether closed 2 weeks ago
@MaxiBoether Not fully sure what the problem with partial result states is, but have you considered using retry contexts?
Ah, nope. I guess I can refactor the code to retry contexts then
Attention: Patch coverage is 71.59091% with 25 lines in your changes missing coverage. Please review.
Project coverage is 82.50%. Comparing base (75868bb) to head (ac1d863).
Current head ac1d863 differs from pull request most recent head 19cde94. Please upload reports for the commit 19cde94 to get more accurate results.
@MaxiBoether Not fully sure what the problem with partial result states is, but have you considered using retry contexts?
Thanks for suggesting that. I adapted the code to use retry contexts. However, I sometimes needed to manually catch and reraise exceptions, because I find tenacity's callback system quite ugly in object-oriented contexts (we would need to write a static function that recovers self out of args[0] etc.). I think catching and reraising actually looks better than using callbacks, and it allows for easier logging with our logging infra. The retry logic itself is still more hidden now :)
Sometimes, we face outages/random disconnections during training. This adds retries in the places where I encountered them last night. I tried to integrate tenacity as suggested by @robinholzi, but it's not always possible since the retry logic involves keeping track of already done work, which I don't want to put into class state.

Part 6/n of porting over SIGMOD changes.
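To illustrate what "keeping track of already done work" can look like without class state, here is a small hand-rolled sketch using only the standard library. It is an assumption about the general shape of such logic, not the code in this PR; `upload_with_resume`, `make_flaky_send`, and all parameters are hypothetical.

```python
import time


def upload_with_resume(chunks, send, max_retries=3, backoff_s=0.0):
    """Send every chunk once; on ConnectionError, resume at the first unsent chunk."""
    next_idx = 0  # progress lives in a local variable, not in class state
    failures = 0  # consecutive failures for the current chunk
    while next_idx < len(chunks):
        try:
            send(chunks[next_idx])
        except ConnectionError:
            failures += 1
            if failures > max_retries:
                raise  # give up and surface the original error
            time.sleep(backoff_s)
            continue  # retry the same chunk; earlier chunks are not resent
        next_idx += 1
        failures = 0  # reset the counter after making progress
    return next_idx


def make_flaky_send(fail_on_calls):
    """Hypothetical helper: a send() that fails on the given call numbers."""
    sent = []
    calls = [0]

    def send(chunk):
        calls[0] += 1
        if calls[0] in fail_on_calls:
            raise ConnectionError("simulated disconnect")
        sent.append(chunk)

    return send, sent


send, sent = make_flaky_send({2, 3})
count = upload_with_resume(["a", "b", "c"], send)
```

The progress counter never leaves the function, so a failed call leaves nothing stale behind on the object.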