Closed Odrec closed 1 week ago
Hi @odrec, could you look for the logs in the grobid server? Which type of documents are you processing?
The process_pdf method should return a tuple with two elements, the status code and the content, so you should check that the status code == 200 before trying to access the data.
Hey! This is what process_pdf returned:
('/tmp/tmpbweiaosz.pdf', 408, None)
I'll try to get the logs now and report back
Also, I've only tried with scientific papers. For example, this one fails every time for me.
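As a minimal sketch of the status check suggested above: the tuple shown in this comment has three elements (the PDF path, the HTTP status, and the content), so the helper below unpacks all three. The name extract_text is hypothetical and not part of the grobid client; it only assumes the tuple shape seen in the output above.

```python
from typing import Optional, Tuple


def extract_text(result: Tuple[str, int, Optional[str]]) -> str:
    """Return the content from a (pdf_path, status, content) tuple.

    Hypothetical helper: assumes the three-element shape shown in the
    output above. Raises instead of silently returning None.
    """
    pdf_path, status, content = result
    if status != 200 or content is None:
        raise RuntimeError(f"GROBID failed for {pdf_path}: HTTP {status}")
    return content
```

With the tuple reported here, extract_text(('/tmp/tmpbweiaosz.pdf', 408, None)) raises a RuntimeError that names the file and the 408 status, rather than passing None on to later code.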
I'm running in a docker container. How can I access the logs? I don't see the logs directory
root@9fbe59fc8f3f:/opt/grobid# ls
data delft grobid-home grobid-service preload_embeddings.py resources-registry.json
@Odrec you should see the log in the docker console, or at least the error, if any. Could you process the same PDF from the grobid interface?
408 indicates a request timeout, so you might be having other issues related to your network.
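If the 408 is intermittent, one possible workaround is to retry the call with a growing delay. This is only a sketch: call_with_retry, its parameters, and the Result alias are hypothetical names, and it assumes the wrapped call returns the same (pdf_path, status, content) tuple shown earlier in this thread.

```python
import time
from typing import Callable, Optional, Tuple

# Assumed shape of the tuple returned by process_pdf in this thread.
Result = Tuple[str, int, Optional[str]]


def call_with_retry(
    call: Callable[[], Result],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Result:
    """Run `call` up to `attempts` times, retrying only on HTTP 408.

    The delay doubles after each timed-out attempt. The last result is
    returned as-is, so the caller still has to check the status code.
    """
    result = call()
    for attempt in range(1, attempts):
        if result[1] != 408:
            break
        sleep(base_delay * 2 ** (attempt - 1))
        result = call()
    return result
```

The real process_pdf call would be wrapped in a zero-argument function or lambda and passed as call. Retrying does not help if the server consistently needs longer than the client allows, so a 408 that persists after every attempt still points to the server or network setup.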
Somehow this works with my remote server but not when I run grobid locally on my laptop, so I'll close this for now while I figure out what the problem is on my local instance. Thanks for the help!
Operating System and architecture (arm64, amd64, x86, etc.)
Ubuntu Linux x86_64
What is your Java version
openjdk 11.0.24 2024-07-16
Log and information
No response
Further information
This error shows up sometimes in my app for the same PDFs that were working before. I'm not sure why it works sometimes and not others. This is the relevant part of the code where it tries to extract the entire text, but it returns None as the output.
I restarted the grobid server several times and it still gives me the same error, but it was working yesterday. Does anyone know why the result from the