Open wcooper90 opened 4 years ago
How are you calling it?
Hi Chris,
We've tried: text = pytesseract.image_to_string(img).encode('latin-1', 'ignore')
As well as executing from the command line and then reading it from a file:
os.system("tesseract -l eng /var/app/current/inputs/" + str(i) + ".png text")
Thanks for getting back so quickly.
I meant how are you calling punctuator. That code only appears to call tesseract.
Sorry!
Here is the function we are using punctuator in:
def punctuate_transcript(text):
    p = Punctuator('Demo-Europarl-EN.pcl')
    return p.punctuate(text)
We import Punctuator at the top of the file with:
from punctuator import Punctuator
and I've made sure to download the model, Demo-Europarl-EN.pcl, to the right place, both locally and on AWS.
I meant a complete script to reproduce the issue. Try this:
cd /tmp
mkdir test
cd test
virtualenv -p python3.7 env
. env/bin/activate
pip install punctuator
python
>>> from punctuator import Punctuator
Does that throw an import error?
With or without the virtualenv, it does not throw an import error. Do you think we can use the os package to run Punctuator in Python from the command line within our application?
I'm not sure I understand your question. If you mean calling punctuator via os.system(), I suppose that could work, but that's a complicated workaround to what should be a simple problem to fix.
If your application is running inside the virtualenv where punctuator is installed, it'll just work and you shouldn't need to call punctuator via os.system. It looks like it's throwing an import error because you simply haven't installed punctuator.
If you're somehow calling punctuator from Python running os.system("tesseract..."), then you need to make sure that Python instance is inside the virtualenv where punctuator is installed. Then the process called from os.system should inherit the path.
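One way to check which interpreter the application is really running under, and whether that interpreter can see punctuator, is sketched below. It uses only the standard library; the helper name describe_import is made up for illustration and is not part of punctuator.

```python
import importlib.util
import sys


def describe_import(name):
    """Report the running interpreter and where `name` would be imported from."""
    # find_spec looks the module up on sys.path without importing it.
    spec = importlib.util.find_spec(name)
    location = spec.origin if spec is not None else "not found"
    return "interpreter={} module={} location={}".format(
        sys.executable, name, location
    )


# Run this inside the application (not just in a shell) to compare results.
print(describe_import("punctuator"))
```

If the location printed from inside the application points into a bin directory, or says "not found", the application is probably not using the environment where the package was installed.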
I've been having the same issue when importing it via Python with from punctuator import Punctuator. I attempted to install it as you suggested above and then ran that command in Python, but it results in the error that's mentioned earlier in this thread. Help would be appreciated since it would be great to check this out.
I'm having the same problem
If someone could provide a script that reproduces the problem, then I could probably fix it. However, I can find no problems on my end. I even have a Travis build that installs the package and runs some unittests.
Closing this as not-reproducible, but feel free to re-open if you can document steps to reproduce.
Hi! Happy NY and Merry Christmas :) To replicate, run anything except python in CMD (uvicorn, celery, etc.). If you run python from CMD, everything is fine.
Dockerfile:
FROM python:slim
RUN pip3 install punctuator fastapi uvicorn
COPY main.py ./app/main.py
CMD uvicorn --host 0.0.0.0 app.main:app

app/main.py:
from punctuator import Punctuator
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
async def versions():
    return "something"
Run:
docker build . -t punctuator
docker run -ti punctuator # error will occur on load
Error occurred:
File "./app/main.py", line 1, in
@evios Thanks. I can reproduce this. I can also reproduce this if I use a normal venv in Ubuntu. However, it seems to be a bug in uvicorn, not this package. That's why I couldn't reproduce this earlier, as I was only testing with a normal Python shell.
If I add import sys; print(sys.path) to my __init__.py and then run your uvicorn code, I see:
['.', '/home/chris/git/punctuator2/test/.env37/bin', '/usr/lib/python37.zip', '/usr/lib/python3.7', '/usr/lib/python3.7/lib-dynload', '/home/chris/git/punctuator2/test/.env37/lib/python3.7/site-packages']
However, if I run a normal Python shell and then do the same import, I see:
['', '/usr/lib/python37.zip', '/usr/lib/python3.7', '/usr/lib/python3.7/lib-dynload', '/home/chris/git/punctuator2/test/.env37/lib/python3.7/site-packages']
So for some odd reason, it looks like uvicorn is adding the standard bin directory as a place to look for packages, and this is breaking because I have a bin script with the same name as the package. So it tries to import the bin script, which obviously isn't a package, causing the ModuleNotFoundError.
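The shadowing described above can be reproduced without punctuator or uvicorn. The sketch below builds a throwaway directory layout, puts a fake bin directory ahead of a fake site-packages directory on sys.path, and tries to import a submodule. The names demo_pkg, bin and site-packages are placeholders chosen for the demo, not the real install paths.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_shadowing():
    """Return the error raised when a same-named script shadows a package."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        bin_dir = root / "bin"
        site_dir = root / "site-packages"
        pkg_dir = site_dir / "demo_pkg"
        bin_dir.mkdir()
        pkg_dir.mkdir(parents=True)

        # A script in bin with the same name as the package.
        (bin_dir / "demo_pkg.py").write_text("# stand-in for a console script\n")
        # The real package, with a submodule.
        (pkg_dir / "__init__.py").write_text("")
        (pkg_dir / "punc.py").write_text("VALUE = 1\n")

        saved_path = list(sys.path)
        # bin comes first, so demo_pkg.py is found before the package.
        sys.path[:0] = [str(bin_dir), str(site_dir)]
        importlib.invalidate_caches()
        try:
            importlib.import_module("demo_pkg.punc")
            return "import succeeded"
        except ModuleNotFoundError as exc:
            return str(exc)
        finally:
            # Restore global state so the demo has no side effects.
            sys.path[:] = saved_path
            for name in [n for n in sys.modules if n.split(".")[0] == "demo_pkg"]:
                del sys.modules[name]


print(demo_shadowing())
```

The message should have the same shape as the one reported in this thread: the submodule cannot be found because the top-level name resolved to a plain module rather than a package.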
I don't think this behavior in uvicorn is correct. It should not be looking for packages in the virtualenv's bin directory. Therefore, I don't think there's anything I can do on my end, short of changing my names to conform to uvicorn's non-standard behavior, which isn't good practice.
Correct me if I'm wrong.
Also, as a workaround, if you remove the bin directory from sys.path before you import punctuator, that should fix it.
Thanks @chrisspen, it worked.
import sys
sys.path.remove('/root/miniconda3/envs/xxx/bin')
import punctuator
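A slightly more portable variant of this workaround avoids hard-coding the environment path. This is a sketch under the assumption that the conflicting script sits in the same directory as the Python executable, which is the usual layout for a virtualenv or conda environment; drop_script_dir is a hypothetical helper name.

```python
import os
import sys
from typing import List


def drop_script_dir(path_entries: List[str], executable: str) -> List[str]:
    """Return path_entries without the directory that holds the interpreter."""
    # Assumption: console scripts are installed next to the interpreter.
    bin_dir = os.path.dirname(os.path.abspath(executable))
    return [p for p in path_entries if os.path.normpath(p) != bin_dir]


# Apply it before the first import of punctuator.
sys.path[:] = drop_script_dir(sys.path, sys.executable)
# from punctuator import Punctuator  # the application's import goes here
```

Filtering into a new list instead of calling sys.path.remove() means nothing is raised when the bin directory is not on the path in the first place.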
Hi :) I can also confirm that removing the bin dir fixed it. I noticed this strange behaviour not only with uvicorn, but with celery as well:
uvicorn --host 0.0.0.0 app.main:app
celery worker -A app.worker
As you can see, although these are Python programs, they are launched directly as executables rather than through python. Hence, one more workaround (if you run it in Docker) is to start uvicorn or celery with:
CMD python -m uvicorn --host 0.0.0.0 app.main:app
instead of:
CMD uvicorn --host 0.0.0.0 app.main:app
With that way of launching, everything works. Thank you @chrisspen for packaging it in pip! Have a great day!
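A possible explanation for why the python -m form helps: when Python runs a script file, it puts the script's directory at the front of sys.path, whereas with -m it puts the current working directory there. The sketch below compares the two launch styles with throwaway files. The names tool.py, tool_mod.py, bin and work are placeholders, and it assumes a default CPython setup without options such as PYTHONSAFEPATH that change how sys.path[0] is set.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Tuple

PRINT_PATH0 = "import sys; print(sys.path[0])\n"


def compare_launch_styles() -> Tuple[str, str]:
    """Return the sys.path[0] directory name for a script launch and a -m launch."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        bin_dir = root / "bin"    # stands in for the environment's bin directory
        work_dir = root / "work"  # stands in for the application's directory
        bin_dir.mkdir()
        work_dir.mkdir()
        (bin_dir / "tool.py").write_text(PRINT_PATH0)
        (work_dir / "tool_mod.py").write_text(PRINT_PATH0)

        def run(args):
            # Both launches start from the same working directory.
            result = subprocess.run(
                [sys.executable, *args],
                cwd=str(work_dir),
                capture_output=True,
                text=True,
                check=True,
            )
            return Path(result.stdout.strip()).name

        as_script = run([str(bin_dir / "tool.py")])  # like running bin/uvicorn
        as_module = run(["-m", "tool_mod"])          # like python -m uvicorn
        return as_script, as_module


print(compare_launch_styles())
```

On a default setup the first entry is expected to be the bin directory and the second the working directory, which would match the difference between the two sys.path listings posted earlier in this thread.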
Hi, I'm very interested in using Punctuator but my configuration skills are not up to fixing the import work-around mentioned in the previous posts.
I have these system paths:
['/mnt/c/PythonProgrammes/venv', '/usr/lib/python37.zip', '/usr/lib/python3.7', '/usr/lib/python3.7/lib-dynload', '/usr/local/lib/python3.7/dist-packages', '/usr/lib/python3/dist-packages']
(running Python 3.7 on Ubuntu 18.04 LTS)
I have tried removing '/mnt/c/PythonProgrammes/venv' with:
sys.path.remove('/mnt/c/PythonProgrammes/venv')
But my installed_packages_list does not include punctuator.
Any help appreciated.
I'm currently trying to create a webapp, Punctuator being an important package for it. I'm using AWS, which is "a distribution that evolved from Red Hat Enterprise Linux (RHEL) and CentOS," but I'm not sure about specifics. I'm on Python 3.7.9, and these are the errors that come out -
~ File "/var/app/venv/staging-LQM1lest/bin/punctuator.py", line 5, in
~ from punctuator.punc import command_line_runner
~ ModuleNotFoundError: No module named 'punctuator.punc'; 'punctuator' is not a package
I installed punctuator 0.9.6 into the virtual environment venv via a requirements.txt file from GitHub, with the following command:
sudo pip3 install -r https://raw.githubusercontent.com/wcooper90/summarization/master/backend/requirements.txt
I also have Punctuator installed on Amazon Linux 2 with just pip3 install punctuator.
I'm wondering if there are some dependency issues, or if it may have to do with the OS?
Thanks for any help.