Closed (pcaselas closed this 5 months ago)
I'm not using SWIG
Before installing SWIG, I was getting this error:
PyMuPDF/setup.py: Finished building mupdf.
PyMuPDF/setup.py: sys.platform='darwin'
PyMuPDF/setup.py: library_dirs=['mupdf-1.20.3-source/build/release', 'mupdf-1.20.3-source/build/release']
PyMuPDF/setup.py: libraries=['mupdf', 'mupdf-third']
PyMuPDF/setup.py: include_dirs=['mupdf-1.20.3-source/include', 'mupdf-1.20.3-source/include/mupdf', 'mupdf-1.20.3-source/thirdparty/freetype/include']
PyMuPDF/setup.py: extra_link_args=[]
running install
running build
running build_py
running build_ext
building 'fitz._fitz' extension
swigging fitz/fitz.i to fitz/fitz_wrap.c
swig -python -o fitz/fitz_wrap.c fitz/fitz.i
error: command 'swig' failed: No such file or directory
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> PyMuPDF
note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
That's why I ran brew install swig.
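The log above fails because the build shells out to swig and cannot find it. A minimal sketch for checking this before running the install, assuming the only requirement is that the executable is on PATH (the helper name missing_tools is made up for illustration and is not part of PyMuPDF or its setup.py):

```python
import shutil


def missing_tools(names):
    """Return the subset of executable names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # "swig" is the command the build log above tried to run.
    missing = missing_tools(["swig"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("swig found at:", shutil.which("swig"))
```

If swig is reported missing, installing it (for example with brew install swig, as above) and re-running the check should confirm that the build can now find it.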
For me it works based on instructions from README.
Are you running it on an ARM Mac?
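One quick way to answer this from Python is to look at what the platform module reports. This is only a sketch: the helper name is_apple_silicon is made up for illustration, and a Python interpreter running under Rosetta will report x86_64 even on an M-series machine, so treat the result as a hint.

```python
import platform


def is_apple_silicon(system: str, machine: str) -> bool:
    """True when the reported OS is macOS and the reported CPU is arm64."""
    return system == "Darwin" and machine == "arm64"


if __name__ == "__main__":
    system = platform.system()    # e.g. "Darwin" on macOS
    machine = platform.machine()  # e.g. "arm64" or "x86_64"
    print(f"system={system} machine={machine}")
    print("Apple Silicon:", is_apple_silicon(system, machine))
```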
It does seem to install if I upgrade paddleocr to 2.7.3. However, running the API then gives:
python api.py
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.0 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.
I'm using Intel Mac.
The NumPy error is a Python dependency issue. Solve it by pinning a lower NumPy version in requirements.txt:
numpy==1.26.4
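To confirm that the pin actually took effect in the environment that runs api.py, something like the following could be used. It is a sketch, not part of Sparrow: the helper names are hypothetical, and it reads the installed version from package metadata rather than importing numpy.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Extract the major component from a version string such as '1.26.4'."""
    return int(version.split(".")[0])


def numpy_is_1x(version: str) -> bool:
    """True when the given NumPy version string belongs to the 1.x series."""
    return major_version(version) == 1


if __name__ == "__main__":
    try:
        installed = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        print("numpy is not installed in this environment")
    else:
        status = "OK" if numpy_is_1x(installed) else "expected a 1.x release"
        print(f"numpy {installed}: {status}")
```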
That solves the dependency problem. However, when I run
curl -X 'POST' \
'http://127.0.0.1:8001/api/v1/sparrow-ocr/inference' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'file=' \
-F 'image_url=https://raw.githubusercontent.com/katanaml/sparrow/main/sparrow-ml/llm/data/inout-20211211_001.jpg'
it seems to enter an infinite loop at this line: result = model.ocr(bytes_data, cls=True)
and never responds.
Tried uploading the file instead of image_url and the same thing happened.
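While debugging a call that never returns, it can help to run it in a separate process with a time limit, so the hang is reported instead of blocking the API forever. The sketch below uses only the standard library; run_with_timeout is a made-up helper, and the script name ocr_once.py in the usage note is hypothetical (a small script that would call model.ocr on a single image).

```python
import subprocess
import sys


def run_with_timeout(args, timeout_s):
    """Run a command in a child process.

    Returns (True, returncode) if it finished within timeout_s seconds,
    or (False, None) if it was killed because it ran too long.
    """
    try:
        proc = subprocess.run(args, timeout=timeout_s, capture_output=True, text=True)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return False, None
    return True, proc.returncode


if __name__ == "__main__":
    # Stand-in for the real work: a child process that sleeps longer than the limit.
    finished, code = run_with_timeout(
        [sys.executable, "-c", "import time; time.sleep(30)"], timeout_s=2
    )
    print("finished:", finished, "return code:", code)
```

In practice the command would be something like [sys.executable, "ocr_once.py", "image.jpg"], with a timeout long enough for a normal OCR run. This does not fix the hang; it only makes it visible and recoverable.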
It seems it's not compatible with Apple M-series chips.
Yes, I think you are right about M chips. Here is a discussion related to PaddleOCR and M chips: https://github.com/PaddlePaddle/PaddleOCR/issues/11706
I think the easiest solution is to wrap this service in a Docker container.
Here's the swig version that I have:
And here's the error message: