Open maxupp opened 8 months ago
Hi @maxupp!
If you process just one file, client.process_pdf() returns the response in memory, and you can parse it with a Python XML parser.
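As a rough illustration of parsing such an in-memory response, here is a minimal sketch using only the standard library. It assumes the response body is a TEI XML string in the usual TEI namespace; the `extract_title` helper and the `sample` document are made up for this example and are not part of the client.

```python
import xml.etree.ElementTree as ET

# TEI documents use this XML namespace; the prefix "tei" is our own choice.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def extract_title(tei_xml):
    """Return the first title found in the TEI header, or None if absent."""
    root = ET.fromstring(tei_xml)
    title = root.find(".//tei:titleStmt/tei:title", TEI_NS)
    return title.text if title is not None else None


# Hypothetical, hand-written TEI snippet standing in for a server response.
sample = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0">'
    "<teiHeader><fileDesc><titleStmt>"
    '<title level="a" type="main">An Example Paper</title>'
    "</titleStmt></fileDesc></teiHeader>"
    "</TEI>"
)

print(extract_title(sample))
```

In real use, the string passed to `extract_title` would be the response text obtained from the client rather than the hand-written sample.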
If you process files in batch, you can change the behavior here so that the server responses are kept in memory instead of being written to files on disk: https://github.com/kermitt2/grobid_client_python/blob/master/grobid_client/grobid_client.py#L228
Or do I misunderstand the issue?
The idea of this client is to provide a simple basis (only dependencies on standard python libraries) that can be extended as needed.
The fact that output can only be written to files and not kept in memory for further processing is a major drawback. I suggest returning a dictionary with all the TEI objects.
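One way the suggested dictionary could look is sketched below. This is not the client's actual code: `collect_tei` is a hypothetical helper, and it assumes a per-file callable that returns a `(path, status, body)` tuple. The `fake_process` stub stands in for a real call to the GROBID server so the example runs on its own.

```python
def collect_tei(pdf_paths, process_one):
    """Map each successfully processed path to its TEI string.

    process_one is an assumed callable returning (path, status, body);
    results with a status other than 200 are skipped.
    """
    results = {}
    for path in pdf_paths:
        _, status, body = process_one(path)
        if status == 200:
            results[path] = body
    return results


def fake_process(path):
    """Stub standing in for a real server call (illustration only)."""
    if path.endswith(".pdf"):
        return path, 200, "<TEI><!-- {} --></TEI>".format(path)
    return path, 500, "error"


tei_by_file = collect_tei(["a.pdf", "b.txt"], fake_process)
print(sorted(tei_by_file))
```

Keeping every response in memory is convenient for small batches; for large batches, memory use grows with the number of documents, which may be why writing to disk is the default.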