nlmatics / nlm-ingestor

This repo provides the server-side code that the llmsherpa API connects to. It includes parsers for various file formats.
https://www.nlmatics.com
Apache License 2.0

Suggestions for Fast Production Server #37

Open yashpatel21 opened 3 months ago

yashpatel21 commented 3 months ago

I have set up my own nlm-ingestor API service on a dedicated 8 GB Linode instance (for testing purposes) using the provided Docker container.

I have some questions about building a fast production server for parsing PDFs. My code is based on the provided getting-started example, and it sends this file:

https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10q/uber_10q_march_2022.pdf

to the nlm-ingestor API to be parsed and to retrieve the chunks. For the above file alone, this takes ~30 seconds. That is indeed faster than some other options, but for my use case I need to bring that time down to ~10 seconds. Are there any guidelines or suggestions for improving the speed of the PDF parsing service?
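For reference, here is the gist of my client code: a minimal sketch based on the llmsherpa getting-started example, assuming the `LayoutPDFReader` client and a self-hosted ingestor on port 5010 (the localhost URL stands in for my actual instance):

```python
from llmsherpa.readers import LayoutPDFReader

# Self-hosted nlm-ingestor endpoint (port 5010 per the repo README; adjust to your instance)
llmsherpa_api_url = "http://localhost:5010/api/parseDocument?renderFormat=all"

pdf_url = (
    "https://raw.githubusercontent.com/run-llama/llama_index/main/"
    "docs/docs/examples/data/10q/uber_10q_march_2022.pdf"
)

reader = LayoutPDFReader(llmsherpa_api_url)
doc = reader.read_pdf(pdf_url)  # fetches the PDF and POSTs it to the ingestor

# Iterate over the layout-aware chunks returned by the parser
for chunk in doc.chunks():
    print(chunk.to_context_text())
```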

pashpashpash commented 3 months ago

Why does a simple file processing task take 30+ seconds? I am experiencing the same thing.

pashpashpash commented 3 months ago

cc @ansukla please advise. In its current state, this is unusable for a production use case. Chunking files should take 5 seconds max.

ansukla commented 3 months ago

It shouldn't take that long unless you are using OCR; OCR takes time. If you are not using OCR, try a server with a faster CPU and more memory. A 30-page document should parse in about 5 s.
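If it helps, whether OCR runs is controlled by the query string on the parse endpoint. A sketch of the two variants, with the parameter name taken from the repo README (verify it against the version you are running):

```python
# Parse endpoint with and without OCR (applyOcr parameter per the nlm-ingestor
# README; treat the exact name as an assumption to verify for your version).
FAST_URL = "http://localhost:5010/api/parseDocument?renderFormat=all"               # no OCR
OCR_URL = "http://localhost:5010/api/parseDocument?renderFormat=all&applyOcr=yes"   # OCR on, much slower
```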

pashpashpash commented 3 months ago

@ansukla what specs would you recommend? I'm currently running the container on a pod with 2 GiB of memory and 1 CPU.

[Screenshot: pod resource configuration, 2024-03-31]
pashpashpash commented 3 months ago

Update: It still takes ~30 s even with 4 CPUs for this file: https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10q/uber_10q_march_2022.pdf
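For anyone else benchmarking this, a small timing sketch that POSTs an already-downloaded copy of the PDF straight to the endpoint, so the measurement excludes download time. The multipart field name "file" matches what the llmsherpa client sends, but treat it as an assumption to verify:

```python
import time

import requests

# Hypothetical benchmark: send a local copy of the PDF directly to the ingestor
# and time only the request/response round trip.
api_url = "http://localhost:5010/api/parseDocument?renderFormat=all"

with open("uber_10q_march_2022.pdf", "rb") as f:
    start = time.perf_counter()
    resp = requests.post(api_url, files={"file": f})
    elapsed = time.perf_counter() - start

resp.raise_for_status()
print(f"server round-trip: {elapsed:.1f} s")
```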