I used llmsherpa to process this PDF.
It is a network protocol specification document.
I used the Colab demo you provide.
It runs without any errors.
However, when I convert the document to text, only a portion of the PDF is converted, and a lot of information is missing.
I tried both the PDF URL and a local PDF file path, with the same result.
I printed all the section titles, and the output does not match the PDF. The output is provided here.
I also converted the PDF to text, and the result is significantly smaller than the original. The converted text file is here.
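
The steps above can be reproduced with something like the following. This is a minimal sketch, not the exact Colab notebook: it assumes the public API endpoint shown in the llmsherpa README, and the helper names `parse_pdf` and `summarize` are hypothetical, introduced here only for illustration.

```python
import sys

# Assumption: the public endpoint from the llmsherpa README.
LLMSHERPA_API_URL = (
    "https://readers.llmsherpa.com/api/document/developer/parseDocument"
    "?renderFormat=all"
)


def parse_pdf(path_or_url, api_url=LLMSHERPA_API_URL):
    """Parse a PDF (URL or local path) with llmsherpa's LayoutPDFReader."""
    # Imported lazily so the rest of the script works without llmsherpa installed.
    from llmsherpa.readers import LayoutPDFReader

    reader = LayoutPDFReader(api_url)
    return reader.read_pdf(path_or_url)


def summarize(doc):
    """Return the section titles and the length of the extracted text."""
    titles = [section.title for section in doc.sections()]
    text_length = len(doc.to_text())
    return titles, text_length


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_pdf.py <pdf url or local path>
    titles, text_length = summarize(parse_pdf(sys.argv[1]))
    for title in titles:
        print(title)
    print(f"{len(titles)} sections, {text_length} characters of text")
```

Comparing the printed section titles and the character count against the PDF's own table of contents and size makes it easy to see how much of the document is missing.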
My main question: is there any particular reason why llmsherpa might not work for network protocol specification PDFs?