Open frankiedrake opened 5 months ago
Hi - you will need to run your own server to get these capabilities. See the instructions here: https://github.com/nlmatics/nlm-ingestor/pkgs/container/nlm-ingestor. The name LayoutPDFReader is a bit of a misnomer: you can pass in different kinds of documents and it will work the same way.
@ansukla I tried to use this to parse XML documents, but can't get it to work properly.
I tried hosting the server locally in the Docker image and passing XML documents through `LayoutPDFReader`, but the document was not parsed properly.
I also tried modifying some code to change the MIME type of the POST request to `application/xml` and `text/xml` in the API request to the same endpoint, `http://localhost:5010/api/parseDocument?renderFormat=all`, but that didn't work either.
Any examples of how to use this service to chunk XML documents would be great - thanks!
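For anyone reproducing the MIME-type experiment described above, here is a minimal sketch of what such an upload could look like at the HTTP level, using only the Python standard library. It is not taken from the project's documentation. The form field name `file`, the helper name `build_parse_request`, and the sample file name are assumptions for illustration; only the endpoint URL comes from this thread. Note that in a multipart upload the file's MIME type lives in the part header, while the request's own `Content-Type` stays `multipart/form-data`. Whether the server honours the declared type for XML is exactly the open question here, so this only shows how the request is built.

```python
import uuid
import urllib.request

# Endpoint quoted in this thread (local nlm-ingestor container).
PARSE_URL = "http://localhost:5010/api/parseDocument?renderFormat=all"


def build_parse_request(filename, data, mime_type, url=PARSE_URL):
    """Build (but do not send) a multipart/form-data POST carrying one file.

    Hypothetical helper. Assumption: the server reads the upload from a
    form field named "file".
    """
    boundary = uuid.uuid4().hex
    # Part header: this is where the per-file MIME type is declared.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {mime_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = head + data + tail
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        # The request-level type stays multipart/form-data.
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


if __name__ == "__main__":
    req = build_parse_request("sample.xml", b"<root><p>hello</p></root>", "text/xml")
    print(req.get_method(), req.full_url)
    # Sending requires the container to be running:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read()[:200])
```

Keeping the build step separate from the send step makes it easy to inspect the exact bytes that would go over the wire before pointing it at a running server.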
The documentation provides an example of using the `LayoutPDFReader` class to process PDF documents. It also mentions various ingestors (XML, HTML, text, etc.), but there is not a single example of how to use them or how they are connected to `LayoutPDFReader`. Maybe there's a `LayoutTextReader` or something similar?