Open Daniel-EST opened 5 months ago
Hi @Daniel-EST, thank you for your interest in Grobid. Now that you mention it, I've scheduled the update to Sonoma 14.5 tonight on my M2 💦
Could you try the image lfoppiano/grobid:0.8.0-arm?
For background information about this image, see #1014.
@lfoppiano I got the exact same issue after updating to Sonoma 14.5. I tried using lfoppiano/grobid:0.8.0-arm as well as latest-crf; unfortunately the issue persists.
@lfoppiano, I just tested the lfoppiano/grobid:0.8.0-arm version and the same problem persists. However, while going through the Docker image tags, I found that lfoppiano/grobid:latest-full worked as intended. Where can I find more information about the differences between the two versions?
OK, so the latest-* images are generally for development.
In this case you're lucky because it's the same as lfoppiano/grobid:0.8.0-full-slim.
This image was built from the branch 0.8.0-fixes, which is version 0.8.0 with post-release fixes.
Could you try running lfoppiano/grobid:0.8.0 or lfoppiano/grobid:0.8.0-arm again with the --init parameter, and let me know if there is any change?
Running
docker run --init --ulimit core=0 -p 8070:8070 --name grobid lfoppiano/grobid:0.8.0-arm
the error persisted, with the same trace as described earlier. However, running
docker run --init --ulimit core=0 -p 8070:8070 --name grobid lfoppiano/grobid:0.8.0-full-slim
I was able to upload a PDF file.
OK, I'm glad you have a working image, but I don't know why it works... Maybe something has changed in the way Java is handled 🤔
About lfoppiano/grobid:0.8.0-full-slim, does it work well with more than one PDF? Could you run it without issues on, let's say, 1000 PDF documents?
I'm going to create a script and try to upload multiple PDF documents. However, I've noticed that the container is now more resource-intensive, consuming significantly more RAM and CPU. Some huge documents that I was able to parse in the past are now killing the container due to lack of resources.
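A minimal sketch of such a bulk test, assuming the container is reachable on port 8070 as in the commands above. The function name `upload_pdfs` and the `GROBID_URL` variable are illustrative; the `/api/processFulltextDocument` endpoint and its `input` form field are taken from the Grobid REST API and should be checked against the version you run.

```shell
#!/usr/bin/env bash
# Sketch: send every PDF in a directory to a local Grobid container
# and count successes and failures. Names here are illustrative.
GROBID_URL="${GROBID_URL:-http://localhost:8070}"

upload_pdfs() {
  local dir="$1" ok=0 failed=0 pdf status
  for pdf in "$dir"/*.pdf; do
    # If the directory has no PDFs the glob stays literal; skip it.
    [ -e "$pdf" ] || continue
    # -s: no progress bar, -o /dev/null: discard the response body,
    # -w: print only the HTTP status (000 if the connection fails,
    # for example because the container was killed).
    status=$(curl -s -o /dev/null -w '%{http_code}' \
      --form "input=@${pdf}" \
      "$GROBID_URL/api/processFulltextDocument")
    if [ "$status" = "200" ]; then
      ok=$((ok + 1))
    else
      failed=$((failed + 1))
      echo "FAILED ($status): $pdf" >&2
    fi
  done
  echo "ok=$ok failed=$failed"
}
```

Calling `upload_pdfs ./pdfs` prints a one-line summary; a run of `000` statuses after the first failure would suggest the container died rather than a single document being rejected.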
Could you share the log and remind me which image you are using?
If you are using lfoppiano/grobid:0.8.0-full-slim, make sure that you are using the deep learning models without GPU.
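One way to gather that information, sketched under the assumption that the container is named `grobid` (as in the `docker run` commands above) and was started without `--rm`, so it can still be inspected after it exits. The helper name `container_report` is hypothetical. Note that the `OOMKilled` flag may stay `false` when the whole Docker VM runs out of memory rather than the container hitting its own limit; an exit code of 137 (SIGKILL) is a further hint.

```shell
#!/usr/bin/env bash
# Sketch: print the exit state of a stopped container and the tail of its log.
container_report() {
  local name="${1:-grobid}"
  # State.OOMKilled and State.ExitCode are fields of `docker inspect` output.
  docker inspect \
    --format 'oom_killed={{.State.OOMKilled}} exit_code={{.State.ExitCode}}' \
    "$name"
  # Last 50 lines of the container log, enough to include the crash trace.
  docker logs --tail 50 "$name"
}
```

Running `container_report grobid` right after the container dies gives a compact snippet to paste into the issue.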
@Daniel-EST did you manage to fix the issue? I've built a CRF-only image that should support ARM, but I haven't tested it. If you have time, it's on #1165.
@lfoppiano I just tested the Docker image lfoppiano/grobid:latest-crf-multi-arch from #1165.
Everything worked well, I uploaded about 10 documents and all were correctly parsed.
docker run -d --ulimit core=0 --platform linux/amd64 -p 8070:8070 --name grobid lfoppiano/grobid:latest-crf-multi-arch
It is important to add --platform linux/amd64 to force the use of the amd64 architecture. Otherwise you might get the following error:
rosetta error: failed to open elf at /lib64/ld-linux-x86-64.so.2
Tested on macOS Sequoia 15.0, on an M3 processor. Note that the macOS version has been updated since I opened the issue.
Thanks for testing, @Daniel-EST.
I did some more tests, and I think you should also add --init, or the child processes won't be cleaned up. I tried without it and the JVM was crashing.
After adding --init it seems more stable, but I would need to test it a bit more.
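Putting the two suggestions together, a minimal sketch that starts the multi-arch CRF image with both `--platform linux/amd64` and `--init`, then waits for the service to answer. The function names and the `GROBID_IMAGE` variable are illustrative, and the `/api/isalive` health endpoint is assumed from the Grobid REST API.

```shell
#!/usr/bin/env bash
# Sketch: start Grobid with the flags discussed in this thread and wait for it.
GROBID_IMAGE="${GROBID_IMAGE:-lfoppiano/grobid:latest-crf-multi-arch}"

start_grobid() {
  # --init adds an init process that reaps child processes;
  # --platform linux/amd64 forces the amd64 variant of the image.
  docker run -d --init --ulimit core=0 --platform linux/amd64 \
    -p 8070:8070 --name grobid "$GROBID_IMAGE"
}

wait_for_grobid() {
  local tries="${1:-30}" i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl return non-zero on HTTP errors as well as on refused connections.
    if curl -sf -o /dev/null http://localhost:8070/api/isalive; then
      echo "ready"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "not ready after $tries attempts" >&2
  return 1
}
```

A typical use would be `start_grobid && wait_for_grobid` before uploading any document, so that a failed upload is not confused with a service that is still starting.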
I've added some documentation on the related branch, here.
Description:
After updating to macOS Sonoma 14.5, I've noticed that a Docker container from the image lfoppiano/grobid:0.8.0 is killed shortly after a PDF file is sent to Grobid. Before macOS Sonoma 14.5 it worked as intended. This happens on both M3 and M1 processors.
The container starts normally when run with the command:
However, it is terminated immediately after uploading a PDF file, with the following error message:
Environment: