Open · tech-deployment opened this issue 6 days ago
Describe the bug
I am trying to understand why I see such a large difference in parsing response time between:

Job ID
JobId using LlamaCloud: 11d692d2-c10b-4323-a1b6-659ce705b9a6
JobId using LlamaParse Client: 27be0d32-3923-438b-b782-51fbf39e5695

Additional context
I have been using the same options for both:
-> Accurate mode
-> Cache disabled in both cases

Some numbers: on LlamaCloud the average processing time is around 2-4 seconds. Using the LlamaParse Python client I get an average of 26 seconds (when it completes at all; sometimes it hangs forever).

Is there any setup needed on the Python client to get it up to speed?

Thank you
Pierre
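P.S. For reference, this is roughly how I invoke the client. A minimal sketch: accurate mode is just the default here (no fast/premium flag set), and the cache-related flag names are taken from the llama_parse version I have installed, so treat them as assumptions.

```python
# Minimal timing harness for the client-side path being compared.
import time

from llama_parse import LlamaParse

parser = LlamaParse(
    api_key="llx-...",      # or set LLAMA_CLOUD_API_KEY in the environment
    result_type="markdown",
    invalidate_cache=True,  # assumed flag name: ignore any cached result
    do_not_cache=True,      # assumed flag name: do not store this result
    verbose=True,           # prints the job_id, useful for matching pod logs
)

start = time.perf_counter()
documents = parser.load_data("./my_document.pdf")
print(f"Parsed {len(documents)} document(s) in {time.perf_counter() - start:.1f}s")
```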
Hi Pierre, in the pod that does the actual processing of your file, I see:
00:12:58.441 Processing job job_record_id: 11d692d2-c10b-4323-a1b6-659ce705b9a6
00:13:22.149 [job_record_id=11d692d2-c10b-4323-a1b6-659ce705b9a6] job result done
There's no way you can get a non-cached result in 2-4 seconds in accurate mode (although I wish that were the case). 26 seconds seems reasonable. Please send me the JobID whenever a job hangs forever.
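In the meantime, if a job does seem stuck, you can probe its status directly. A minimal sketch, assuming the public parsing endpoint GET /api/parsing/job/{job_id} and an API key exported as LLAMA_CLOUD_API_KEY; adjust the path if your deployment differs:

```python
# Probe the status of a parsing job by its JobID.
import os

import requests

JOB_ID = "27be0d32-3923-438b-b782-51fbf39e5695"  # the slow client-side job above

resp = requests.get(
    f"https://api.cloud.llamaindex.ai/api/parsing/job/{JOB_ID}",
    headers={"Authorization": f"Bearer {os.environ['LLAMA_CLOUD_API_KEY']}"},
)
resp.raise_for_status()
print(resp.json())  # includes the job status, e.g. PENDING / SUCCESS / ERROR
```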
I am also using the Python client. For my setup, I am using multimodal mode with instructions and no caching. I have been trying to shorten my instructions, but recently the service seems to have become rather unstable in processing time: a run can take up to 10 minutes for a document of only a few pages.
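To keep a run from blocking for 10 minutes, I have started putting a client-side deadline around the async call. A rough sketch of my setup; the multimodal and timeout keyword names (use_vendor_multimodal_model, check_interval, max_timeout) are assumptions about the installed client version, and the instruction text is just a placeholder:

```python
# Bound the wait so a stalled job fails fast instead of hanging.
import asyncio

from llama_parse import LlamaParse

parser = LlamaParse(
    result_type="markdown",
    parsing_instruction="Summarize each table as prose.",  # placeholder instruction
    use_vendor_multimodal_model=True,  # assumed kwarg for multimodal parsing
    check_interval=2,                  # assumed: poll the job every 2 seconds
    max_timeout=300,                   # assumed: client-side polling cap, seconds
)

async def parse_with_deadline(path: str, deadline_s: float = 300.0):
    # wait_for cancels the polling coroutine if it runs past the deadline
    return await asyncio.wait_for(parser.aload_data(path), timeout=deadline_s)

documents = asyncio.run(parse_with_deadline("./report.pdf"))
print(f"Got {len(documents)} document(s)")
```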