replicate / replicate-python

Python client for Replicate
https://replicate.com
Apache License 2.0

Prediction interrupted; please retry (code: PA) #135

Closed: Soykertje closed this issue 6 months ago

Soykertje commented 1 year ago

When I try to consume one of my private endpoints like this:

result = replicate.run("my-user/model-name:version", input={"file": open(file_path, "rb")})

I'm getting that exception, but when I execute it from the website it works with no problem.

mattt commented 1 year ago

Hi @Soykertje. Can you please share a link to a prediction that failed and one that succeeded? Or else, share the full error message / logs, or anything else that we could use to debug your problem?

awerks commented 1 year ago

I also got this problem when executing via the API; the website works normally. This is the only message it shows, there are no other logs. It happens with large files (> 20 MB):

File "C:\Users\susti\AppData\Local\Programs\Python\Python311\Lib\site-packages\replicate\client.py", line 140, in run
    raise ModelError(prediction.error)
replicate.exceptions.ModelError: Prediction interrupted; please retry (code: PA)

mattt commented 1 year ago

@awerks @Soykertje On the website, files are uploaded to a URL which is passed as an input to the prediction. When you set a file as an input through a client library, the file is sent in the request body as a base64-encoded data: URI. Replicate's API enforces limits on the size of requests, which may be causing this error.
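To make the difference concrete, here is a hedged sketch of the two call shapes; the model identifier, file name, and URL below are placeholders, not values from this thread:

```python
import replicate

# Passing a file object: the client reads the file and embeds it in the
# request body as a base64-encoded data: URI, so large files can exceed
# the API's request-size limit.
with open("audio.mp3", "rb") as f:
    output = replicate.run("owner/model:version", input={"file": f})

# Passing a URL instead: only a short string travels in the request body,
# and the model downloads the file itself.
output = replicate.run(
    "owner/model:version",
    input={"file": "https://example.com/audio.mp3"},
)
```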

awerks commented 1 year ago

@mattt is there a workaround to this?

Soykertje commented 1 year ago

Hello @mattt, since these are predictions made against a private endpoint, you'll apparently be unable to access them, but in case you can, here they are: worked and failed

Those are my predictions. As you can see, it's a custom implementation of Whisper (the endpoint implemented by Replicate does not include some kwargs that I need and uses an old version). The prediction that worked looks like this, and a prediction that failed with this problem looks like this.

mattt commented 1 year ago

@awerks Yes, you can upload the file to a URL and pass that URL as the input for your prediction.

@Soykertje Thanks for sharing those. I'll take a look.

Soykertje commented 1 year ago

Hi @awerks, apparently the problem with the payload size limitation in the API's web server is known (#68). @bfirsh proposes a workaround there; I haven't tried it, but apparently you can send a link that points to the file and the endpoint should download and use that file.

awerks commented 1 year ago

@Soykertje, it works via a URL, but the download time is obviously billed.

Soykertje commented 1 year ago

@awerks Do you have any idea how slow it is? I'm planning to use files that in 95% (or more) of cases will not be bigger than 100 MB, so depending on the download speed, a few billed seconds are still worth it.

awerks commented 1 year ago

@Soykertje It is not slow, but it adds up. 5-7 seconds each time could be a deal-breaker for someone.

Soykertje commented 1 year ago

@awerks Uhhhh I did some math and yeah, absolutely... Depending on the volume of requests, the cost of those seconds can end up being something to pay attention to.

brassel commented 1 year ago

Hi replicate team, thanks for this great platform!

I have stumbled across the same issue, and for me reducing the size of the input files was very easy; after that it worked just fine. But a more informative error message would really be appreciated, by others as well, I'm sure.

You could also point me to the code where this is defined, and maybe improving the error message could be my first contribution?

Thank you and kind regards

neal-qualitative commented 1 year ago

@mattt Passing in a URL as input instead of a base64-encoded file is working for me. However, the files that my platform processes cannot be made accessible through a public URL. How can we identify Replicate in order to restrict access to model input files?

For example, if I'm storing model input files in s3, is there a replicate s3 user that I can grant access to?

brassel commented 1 year ago

@neal-qualitative, in GCloud, I can create a "signed URL" for short-term/single-use access purposes like yours. Maybe something like that is possible in AWS and would be a solution for you?
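For reference, a minimal sketch of generating such a signed URL with the google-cloud-storage library; the bucket and object names are hypothetical, and the exact credentials setup depends on your project:

```python
from datetime import timedelta

from google.cloud import storage

# Hypothetical bucket and object names; requires credentials that are
# allowed to sign URLs (e.g. a service account key).
client = storage.Client()
blob = client.bucket("my-bucket").blob("inputs/audio.mp3")

# Short-lived, read-only link that the model endpoint can download from.
signed_url = blob.generate_signed_url(
    version="v4",
    expiration=timedelta(hours=1),
    method="GET",
)
```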

neal-qualitative commented 1 year ago

Thanks for the suggestion @brassel! My platform can create a presigned URL and send that to Replicate.
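A hedged sketch of that flow with boto3; the bucket, key, and model identifier below are placeholders:

```python
import boto3
import replicate

# Hypothetical bucket, key, and model identifier, for illustration only.
s3 = boto3.client("s3")
presigned_url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-bucket", "Key": "inputs/audio.mp3"},
    ExpiresIn=3600,  # the link expires after one hour
)

# Only the URL string travels in the request body; the model downloads
# the object directly from S3 using the signed link.
output = replicate.run("owner/model:version", input={"file": presigned_url})
```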

ryx2 commented 10 months ago

I just hit this issue, and all of the inputs were strings. The code does involve downloading some files from S3, but those files are collectively about 40 MB, and the compute time is about 70 s before failing with the exact error mentioned above. Somehow about half of the jobs (about 20) I created in the last 5 minutes ran into this exact error, which I hadn't seen before now, even for this same model and version.

ancri commented 6 months ago

Is there an update on what image size is expected to cause an issue, and what size we can expect to be safe to pass base64-encoded?

sachsom95 commented 6 months ago

I send the input as an S3 presigned URL and it solves the issue for me.

Soykertje commented 6 months ago

@ryx2 Have you tried running the prediction from the web? The web uses the API too, but it uploads the files to an internal CDN and sends them to the model as download links. In other words, there is basically no difference between sending links through the API and running it from the web. If the prediction fails from the web too, consider opening a new issue, because your problem would not be related to sending files in the request body.

Soykertje commented 6 months ago

Closing the issue since this was related to sending files that are too large (roughly > 20 MB) in the body of the request. The solution is to send links from which the files can be downloaded; the Cog implementation should handle the download automatically if the link points directly to the file. This workaround should not be very different in terms of prediction times, considering that the web also uploads the files to a CDN and sends links to perform the prediction.

If you use an object storage service such as S3, a signed URL will do the job when the files are private; otherwise, a normal URL pointing to the file will suffice. This applies to the majority of object storage services.
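Putting the above together, a hedged sketch of the fallback logic; the roughly 20 MB threshold is taken from reports in this thread rather than a documented limit, the model identifier is a placeholder, and keep in mind that base64 encoding inflates the inlined payload by about a third:

```python
import os

import replicate

# Rough threshold reported in this thread, not a documented API limit.
SIZE_LIMIT_BYTES = 20 * 1024 * 1024

def run_with_file(model: str, file_path: str, download_url: str | None = None):
    """Inline small files; fall back to a download URL for large ones."""
    if os.path.getsize(file_path) <= SIZE_LIMIT_BYTES:
        with open(file_path, "rb") as f:
            return replicate.run(model, input={"file": f})
    if download_url is None:
        raise ValueError("File too large to inline; provide a (signed) URL instead.")
    return replicate.run(model, input={"file": download_url})
```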

If anybody experiences this same problem but it is not related to sending files in the body of the request, please consider opening a new issue and provide the information related to your specific case.