Closed by roimulia2 1 year ago
Hi @roimulia2. Yes, polling for completion is almost always going to be slower than responding to pushed webhooks. But polling can be made more or less responsive.
By default, the client uses exponential backoff retry logic:
You can override this to use constant backoff (i.e. "wait n seconds each time") by setting the retryPolicy of the client.
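To make the difference concrete, here is a minimal sketch of how the two strategies space out polls and how long a client takes to notice a finished prediction. BackoffSketch and its methods are hypothetical names written for this illustration, not the client's actual types, and the base and multiplier values are placeholders rather than the library's defaults. Network round-trip time is ignored.

```swift
import Foundation

// Hypothetical stand-in for a retry strategy; not the library's actual type.
enum BackoffSketch {
    case constant(duration: TimeInterval)
    case exponential(base: TimeInterval, multiplier: Double)

    /// Delay before the given poll (attempt 0 is the first poll).
    func delay(forAttempt attempt: Int) -> TimeInterval {
        switch self {
        case .constant(let duration):
            return duration
        case .exponential(let base, let multiplier):
            return base * pow(multiplier, Double(attempt))
        }
    }

    /// Time of the first poll at or after `completion`, i.e. when the
    /// client notices that the prediction finished.
    /// Assumes every delay is positive; a zero delay would never advance.
    func observedLatency(completion: TimeInterval) -> TimeInterval {
        var elapsed: TimeInterval = 0
        var attempt = 0
        while elapsed < completion {
            elapsed += delay(forAttempt: attempt)
            attempt += 1
        }
        return elapsed
    }
}

// A prediction that finishes after 2.2 seconds:
let constant = BackoffSketch.constant(duration: 0.5)
let exponential = BackoffSketch.exponential(base: 0.5, multiplier: 2)
print(constant.observedLatency(completion: 2.2))     // 2.5 (polls at 0.5, 1.0, 1.5, 2.0, 2.5)
print(exponential.observedLatency(completion: 2.2))  // 3.5 (polls at 0.5, 1.5, 3.5)
```

With these illustrative numbers, constant backoff notices the result 0.3s late while exponential backoff notices it 1.3s late, which is the kind of gap that makes short predictions feel slow.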
Looking at this now, I think this functionality is under-documented, and defaulting to exponential is questionable. It'd also be nice to make this adaptive to a Retry-After value sent by the server. Or even better, we should create a paved path for updating predictions from webhooks delivered through push notifications.
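As a rough sketch of what adapting to Retry-After could look like: the function names below are hypothetical, nothing here is part of the client today, and whether the server sends this header at all is an assumption. The header value is either a whole number of seconds or an HTTP-date, so both forms are handled, with the configured polling interval as the fallback.

```swift
import Foundation

/// Returns the delay requested by a Retry-After header value,
/// or nil if the value cannot be parsed. Hypothetical helper.
func retryAfterDelay(from headerValue: String, now: Date = Date()) -> TimeInterval? {
    let trimmed = headerValue.trimmingCharacters(in: .whitespaces)

    // Form 1: a non-negative number of seconds, e.g. "2".
    if let seconds = Int(trimmed), seconds >= 0 {
        return TimeInterval(seconds)
    }

    // Form 2: an HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    let formatter = DateFormatter()
    formatter.locale = Locale(identifier: "en_US_POSIX")
    formatter.timeZone = TimeZone(secondsFromGMT: 0)
    formatter.dateFormat = "EEE, dd MMM yyyy HH:mm:ss zzz"
    if let date = formatter.date(from: trimmed) {
        // A date in the past means "retry now".
        return max(0, date.timeIntervalSince(now))
    }
    return nil
}

/// Picks the next polling delay: the server's hint if one is present
/// and parseable, otherwise the configured fallback interval.
func nextPollDelay(retryAfter: String?, fallback: TimeInterval) -> TimeInterval {
    guard let value = retryAfter, let delay = retryAfterDelay(from: value) else {
        return fallback
    }
    return delay
}
```

For example, nextPollDelay(retryAfter: nil, fallback: 0.5) falls back to 0.5 seconds, while a response carrying a Retry-After of 2 would wait 2 seconds before the next poll.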
Hey @mattt, thank you for the fast response. What I've done in the past in my own Replicate client is to re-send a request every time I get a response from your servers, until the prediction either fails or succeeds. I'll try an aggressive retryPolicy to see if it feels better.
Can you recommend an appropriate RetryPolicy that might fit my needs?
@roimulia2 I can't make any specific recommendations; it all depends on the performance characteristics of the models you're running and the constraints of your mobile client. If predictions take significantly longer to complete with polling than with webhooks, you could try setting a constant retry of 1s.
Hey @mattt! I'm using it mainly for Stable Diffusion, which is quick on the web (2-3s), and user expectations are around that time frame too. Is setting the constant to 0.2 too low?
replicate.retryPolicy = .init(strategy: .constant(duration: 0.2, jitter: 0), timeout: nil, maximumInterval: nil, maximumRetries: nil)
@roimulia2 In that situation, I think a 0.5s timeout would be a good fit.
Got it, thanks!
First of all, thank you for this great repo!
I'm using the following code to create a prediction:
Benchmarking against the actual inference time in Replicate itself (Web GUI), it seems like the iOS client is slower. I wonder if maybe it's related to the retry policy when wait: true is set? Maybe the retries are too slow? Or should it be the same as using a webhook?