textract.exceptions.ShellError: The command antiword is not installed on your system. Please make sure the appropriate dependencies are installed before using textract #444
Gunicorn cannot execute antiword in production, while in development on the same machine it works.
I installed all the dependencies on Ubuntu before installing textract, following the instructions linked here. The apt output:
Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'python-dev-is-python2' instead of 'python-dev'
libjpeg-dev is already the newest version (8c-2ubuntu8).
antiword is already the newest version (0.37-16).
flac is already the newest version (1.3.3-1build1).
lame is already the newest version (3.100-3).
libmad0 is already the newest version (0.15.1b-10ubuntu1).
libsox-fmt-mp3 is already the newest version (14.4.2+git20190427-2).
pstotext is already the newest version (1.9-6build1).
python-dev-is-python2 is already the newest version (2.7.17-4).
sox is already the newest version (14.4.2+git20190427-2).
swig is already the newest version (4.0.1-5build1).
tesseract-ocr is already the newest version (4.1.1-2build2).
unrtf is already the newest version (0.21.10-clean-1).
libxml2-dev is already the newest version (2.9.10+dfsg-5ubuntu0.20.04.4).
libxslt1-dev is already the newest version (1.1.34-4ubuntu0.20.04.1).
poppler-utils is already the newest version (0.86.1-0ubuntu1.1).
ffmpeg is already the newest version (7:4.2.7-0ubuntu0.1).
0 upgraded, 0 newly installed, 0 to remove and 44 not upgraded.
The following works on the same server:
When I run gunicorn -b 0.0.0.0:8000 wsgi:app --workers 3 --timeout 600
the application converts all docx and doc files to txt.
Problem:
When I deploy changes with sudo systemctl restart gunicorn.service and sudo systemctl restart nginx,
the production application cannot convert docx and doc files to txt, and an error comes up.
The application still gives me this error when I check the gunicorn status:
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: byte_string = self.extract(filename, **kwargs)
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: File "/home/ubuntu/web-server/env/lib/python3.7/site-packages/textract/parsers/doc_parser.py", line 9, in extract
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: stdout, stderr = self.run(['antiword', filename])
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: File "/home/ubuntu/web-server/env/lib/python3.7/site-packages/textract/parsers/utils.py", line 96, in run
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: ' '.join(args), 127, '', '',
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: textract.exceptions.ShellError: The command antiword /home/ubuntu/web-server/data/test_cvs/Yassin.docx failed because the executable
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: antiword is not installed on your system. Please make
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: sure the appropriate dependencies are installed before using
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: textract:
Oct 27 06:16:23 ip-172-31-77-202 gunicorn[389992]: http://textract.readthedocs.org/en/latest/installation.html
However, antiword is installed on Ubuntu. I checked by running which antiword, and the output is:
ubuntu@:~/web-server/data/test_cvs$ which antiword
/usr/bin/antiword
I also uninstalled and reinstalled antiword, but the problem persists. I am stuck: it does not work in production, but on port 8000 it works and I get output. Why can't Gunicorn execute antiword? Any help would be appreciated. Thanks.
Service status (systemctl):
● app.service - Gunicorn instance to serve myproject
   Loaded: loaded (/etc/systemd/system/app.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2022-10-27 06:11:06 UTC; 5min ago
 Main PID: 389929 (gunicorn)
    Tasks: 13 (limit: 38087)
   Memory: 4.9G
   CGroup: /system.slice/app.service
           ├─389929 /home/ubuntu/web-server/env/bin/python /home/ubuntu/web-server/env/bin/gunicorn --workers 3 --bind unix:app.sock -m 007 wsgi:app --timeout 3600
           ├─389992 /home/ubuntu/web-server/env/bin/python /home/ubuntu/web-server/env/bin/gunicorn --workers 3 --bind unix:app.sock -m 007 wsgi:app --timeout 3600
           ├─389993 /home/ubuntu/web-server/env/bin/python /home/ubuntu/web-server/env/bin/gunicorn --workers 3 --bind unix:app.sock -m 007 wsgi:app --timeout 3600
           └─389994 /home/ubuntu/web-server/env/bin/python /home/ubuntu/web-server/env/bin/gunicorn --workers 3 --bind unix:app.sock -m 007 wsgi:app --timeout 3600
Python version: 3.7
OS: Ubuntu 20.04
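One possible explanation, not confirmed by anything in the logs above, is that the systemd-managed Gunicorn process runs with a different PATH than the interactive shell. The traceback shows the bare command name `antiword` being run, so if the service's PATH does not include `/usr/bin`, the lookup could fail even though the binary exists. A minimal sketch for checking this from inside the app follows; `describe_executable` is a hypothetical helper written for illustration, not part of textract or Gunicorn.

```python
import os
import shutil


def describe_executable(name, path=None):
    """Report whether `name` resolves on a PATH string.

    Hypothetical diagnostic helper. If `path` is None, the PATH of the
    current process is used, which is what matters inside a Gunicorn worker.
    """
    search_path = os.environ.get("PATH", "") if path is None else path
    # shutil.which returns the full path of the executable, or None if
    # no directory on search_path contains an executable with that name.
    resolved = shutil.which(name, path=search_path)
    return {
        "name": name,
        "resolved": resolved,
        "path_entries": search_path.split(os.pathsep),
    }


if __name__ == "__main__":
    # Run once from the shell and once from inside the service
    # (for example by logging the result at app startup), then compare.
    print(describe_executable("antiword"))
```

If `resolved` is `None` inside the service but `/usr/bin/antiword` in the shell, the two environments have different PATH values, and the service configuration would be the place to look next.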