Enaouram opened this issue 2 months ago (status: Open)
Is it a scanned PDF? If so, you might want to use the OCR option (add &applyOcr=yes to the API URL).
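For anyone unsure where that parameter goes, here is a minimal sketch of appending it in Python with only the standard library. The helper name with_ocr and the localhost URL are assumptions for illustration; they are not part of llmsherpa, and the host, port, and path depend on how your nlm-ingestor container was started.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_ocr(api_url: str) -> str:
    """Return api_url with applyOcr=yes set in its query string.

    Hypothetical helper for illustration only, not a llmsherpa function.
    """
    parts = urlsplit(api_url)
    # Keep the existing query parameters and add (or overwrite) applyOcr.
    query = dict(parse_qsl(parts.query))
    query["applyOcr"] = "yes"
    return urlunsplit(parts._replace(query=urlencode(query)))


# Assumed local endpoint; adjust host and port to your own setup.
local_url = "http://localhost:5010/api/parseDocument?renderFormat=all"
print(with_ocr(local_url))
```

The resulting string is what you would pass as the API URL to the reader. Building it this way avoids ending up with a doubled or misplaced & when the base URL already has query parameters.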
The PDFs I tried parsing are not scanned, but I tried adding '&applyOcr=yes' anyway and still didn't get the chunks/sections of the document. Here's what I got in my cmd after adding '&applyOcr=yes' to the API URL:
Hey everyone, I have a problem with the locally hosted llmsherpa API. I've followed every step on https://github.com/nlmatics/nlm-ingestor but still can't get my documents chunked once I'm connected to the endpoint. I don't know what the issue is.