Closed pboot closed 4 years ago
I have similar problems.
Fresh Ubuntu Server 18.04, latest LaMachine Docker image.
$ lamachine-update
$ lamachine-add alpino
$ lamachine-update
$ lamachine-add tscan
$ lamachine-update
$ lamachine-start-webserver
/alpino and /tscan both give me 502 Bad Gateway.
I've included my logfiles: uwsgi emperor.log nginx error.log tscan.uwsgi.log alpino.uwsgi.log
Alpino complains that ALPINO_HOME is not set.
export ALPINO_HOME=/usr/local/opt/Alpino
doesn't fix the problem.
Adding the following else
in /usr/local/lib/python3.7/dist-packages/clamservices/config/alpino.py
results in a working Alpino, but that's not a real solution of course.
if 'ALPINO_HOME' in os.environ:
    ALPINO_HOME = os.environ['ALPINO_HOME']
else:
    ALPINO_HOME = '/usr/local/opt/Alpino'
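For reference, the same fallback can be written more compactly. This is only a sketch of the workaround described above, not the actual content of clamservices' alpino.py; the helper name resolve_alpino_home and the constant DEFAULT_ALPINO_HOME are made up for illustration, and the default path is simply the one used in this thread.

```python
import os

# Assumed default install location, taken from the path used in this thread.
DEFAULT_ALPINO_HOME = "/usr/local/opt/Alpino"


def resolve_alpino_home(environ=None, default=DEFAULT_ALPINO_HOME):
    """Return ALPINO_HOME from the environment, or the default if it is unset."""
    if environ is None:
        environ = os.environ
    return environ.get("ALPINO_HOME", default)


# Hypothetical use inside a config module.
ALPINO_HOME = resolve_alpino_home()
```

Like the original workaround, this hides the fact that the variable is missing from the webserver's environment, so it is a stopgap rather than a fix.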
Like with the topic starter, tscan complains about missing tscan.wsgi.
Hope this helps. If there's anything I can do to help/test, don't hesitate.
Update: trying alpino results in the following errors:
[CLAM Dispatcher] Adding to PYTHONPATH: /usr/local/lib/python3.7/dist-packages/clamservices/config
[CLAM Dispatcher] Started CLAM Dispatcher v3.0.9 with clamservices.config.alpino (2020-01-23 15:04:17)
[CLAM Dispatcher] Running python3 "/usr/local/lib/python3.7/dist-packages/clamservices/wrappers/alpino_wrapper.py" "/usr/local/var/www-data/alpino.clam/projects/anonymous/th1/clam.xml" "/usr/local/var/www-data/alpino.clam/projects/anonymous/th1/.status" "/usr/local/var/www-data/alpino.clam/projects/anonymous/th1/output/" "/usr/local/opt/Alpino"
[CLAM Dispatcher] Running with pid 3157 (2020-01-23 15:04:17)
ucto: inputfile = input/th1_tekst1_tscan.txt
ucto: outputfile =
ucto: textcat configured from: /usr/local/share/ucto/textcat.cfg
ucto: configured for languages: [nld]
/usr/local/opt/Alpino/create_bin/Alpino.bin: error while loading shared libraries: libboost_system.so.1.58.0: cannot open shared object file: No such file or directory
ALPINO_HOME=/usr/local/opt/Alpino /usr/local/opt/Alpino/bin/Alpino -veryfast -flag treebank xml debug=1 end_hook=xml user_max=900000 -parse < /usr/local/var/www-data/alpino.clam/projects/anonymous/th1/output/th1_tekst1_tscan.tok
Failure running alpino
[CLAM Dispatcher] Process ended (2020-01-23 15:04:17, 0.712237s)
[CLAM Dispatcher] Removing temporary files
[CLAM Dispatcher] Status code out of range (512), setting to 127
[CLAM Dispatcher] Finished (2020-01-23 15:04:17), exit code 127, dispatcher wait time 0.7000000000000001s, duration 0.712651s
libboost-system version mismatch maybe? It looks like 1.67.0-13 is installed.
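To spot this kind of failure quickly in a long dispatcher log, something like the following could help. It is a rough sketch that only matches the dynamic loader message seen in the log above; the function name missing_shared_libraries is hypothetical and not part of CLAM or LaMachine.

```python
import re

# Matches the dynamic loader message that appears in the log above.
MISSING_LIB = re.compile(
    r"error while loading shared libraries: ([^\s:]+): "
    r"cannot open shared object file"
)


def missing_shared_libraries(log_text):
    """Return the sorted, de-duplicated library names the loader could not find."""
    return sorted(set(MISSING_LIB.findall(log_text)))


sample = (
    "/usr/local/opt/Alpino/create_bin/Alpino.bin: error while loading shared "
    "libraries: libboost_system.so.1.58.0: cannot open shared object file: "
    "No such file or directory"
)
print(missing_shared_libraries(sample))  # ['libboost_system.so.1.58.0']
```

Comparing the reported name (1.58.0 here) with the version that is actually installed (1.67.0-13 in my case) makes the mismatch visible.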
Thanks for the elaborate feedback! Something indeed goes wrong there. Part of it is that Alpino is compiled against some specific shared libraries rather than the ones provided by the distribution. I already solved part of this issue, but it still seems to surface for the webservices. I'm working on a fix.
Great! Looking forward to it. If you need me to test anything, let me know.
I think part of the problem is simply that after the add/update sequence, you are still in the old shell, which explains why $ALPINO_HOME wasn't set yet. It needs to be refreshed (in a VM that would be simply disconnecting and reconnecting, in the local variant it would be activating the environment again, in the container you can also re-enter but you have to commit the changes to a new image first). In all cases, explicitly starting bash again after the add/update sequence will probably solve it and pick up on the changes (i.e. resets all the environment variables needed).
Update: I'm fixing this so lamachine-update takes care of it automatically
As for the tscan-specific problem @pboot initially reported (the missing wsgi file): it seems the webservice integration was never fully completed. I'm fixing it now, with one caveat: the background services for tscan still need to be started explicitly for the time being. lamachine-start-webserver
doesn't take those into account yet, but it will warn about this itself.
Should be solved now in the latest LaMachine (please reopen if problems persist)
Tried updating LaMachine, but I get this error: lamachine-lamachineJ-20200127_092001.log
The message says you have local modifications in $LM_PREFIX/src/tscan
(perhaps you edited a script there to test something?), so it refuses to overwrite that as a precaution. If you run the update with force=1, it should work.
Hi, I started with a clean container and added alpino and tscan:
$ lamachine-update
$ lamachine-add alpino
$ lamachine-update
$ lamachine-add tscan
$ lamachine-update
$ lamachine-start-webserver
Subsequently, I started the following services (as suggested, taken from https://github.com/proycon/tscan#usage). I added '&' to some of them because they seemed to stay in the foreground.
$ /usr/local/src/tscan/webservice/startalpino.sh
$ /usr/local/src/tscan/webservice/startfrog.sh &
$ /usr/local/src/tscan/webservice/startwopr20.sh &
$ /usr/local/src/tscan/webservice/startwopr02.sh &
/tscan works, but trying a scan with th1_tekst1_tscan.txt (taken from nu.nl) results in the error log below.
Not sure if this is a new issue or not, so posted it here for now.
Thanks :)
[CLAM Dispatcher] Adding to PYTHONPATH: /usr/local/lib/python3.7/dist-packages/tscanservice
[CLAM Dispatcher] Started CLAM Dispatcher v3.0.10 with tscanservice.tscan (2020-01-27 12:44:03)
[CLAM Dispatcher] Running /usr/local/src/tscan/webservice/tscanservice/tscanwrapper.py "/usr/local/var/www-data/tscan.clam/projects/anonymous/th1/clam.xml" "/usr/local/var/www-data/tscan.clam/projects/anonymous/th1/.status" "/usr/local/var/www-data/tscan.clam/projects/anonymous/th1/input/" "/usr/local/var/www-data/tscan.clam/projects/anonymous/th1/output/" "/usr/local/src/tscan" "/usr/local/opt/Alpino"
[CLAM Dispatcher] Running with pid 19290 (2020-01-27 12:44:03)
TScan 0.9.6
working dir /tmp/tscan-19293/
couldn't open file: /usr/local/src/tscan/data/SoNaR500.wordfreqlist20000.freq
mv: cannot stat '/usr/local/var/www-data/tscan.clam/projects/anonymous/th1/input//.csv': No such file or directory
cat: '/usr/local/var/www-data/tscan.clam/projects/anonymous/th1/output//.words.csv': No such file or directory
cat: '/usr/local/var/www-data/tscan.clam/projects/anonymous/th1/output//.paragraphs.csv': No such file or directory
cat: '/usr/local/var/www-data/tscan.clam/projects/anonymous/th1/output//.sentences.csv': No such file or directory
cat: '/usr/local/var/www-data/tscan.clam/projects/anonymous/th1/output//*.document.csv': No such file or directory
Expected output file input/th1_tekst1_tscan.txt.tscan.xml not created, something went wrong earlier?
[CLAM Dispatcher] Process ended (2020-01-27 12:44:12, 8.512604s)
[CLAM Dispatcher] Removing temporary files
[CLAM Dispatcher] Finished (2020-01-27 12:44:12), exit code 0, dispatcher wait time 8.5s, duration 8.513251s
Update: Did two things that got it working:
/usr/local/src/tscan/data/SoNaR500.wordfreqlist20000.freq
and /usr/local/src/tscan/data/subtlex_words20000.freq
were not world-readable, so: chmod +r
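For anyone who wants to script the permission fix, here is a small sketch of what chmod +r does for a single file (ignoring umask subtleties). The function name make_world_readable is made up for this example; the two data file paths are the ones mentioned above and are only shown in a comment.

```python
import os
import stat

# Read permission for user, group and others, i.e. what "chmod +r" adds.
READ_FOR_ALL = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH


def make_world_readable(path):
    """Add read permission for everyone and return the resulting mode bits."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | READ_FOR_ALL)
    return stat.S_IMODE(os.stat(path).st_mode)


# Hypothetical usage with the files from this thread:
# for name in ("SoNaR500.wordfreqlist20000.freq", "subtlex_words20000.freq"):
#     make_world_readable(os.path.join("/usr/local/src/tscan/data", name))
```

Existing permission bits are kept; only the read bits are added.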
Ah! Well done! I'm experimenting with this too at the moment. It indeed seems to be a new, more tscan-specific issue; tscan is unfortunately still a bit messy to get running. It also seems that some of the start scripts need to be started with the current working directory set to /usr/local/src/tscan/webservice/
or else the relative paths don't work.
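One way to avoid depending on the caller's working directory is to set it explicitly when launching a script. A minimal sketch, assuming the directory named above; start_in_dir is a hypothetical helper, not something that ships with tscan or LaMachine.

```python
import subprocess

# Directory named in this thread; adjust for other installations.
WEBSERVICE_DIR = "/usr/local/src/tscan/webservice"


def start_in_dir(command, workdir=WEBSERVICE_DIR):
    """Start a command in the background with workdir as its working directory."""
    return subprocess.Popen(command, cwd=workdir)


# Hypothetical usage: the script is given by absolute path, but it runs with
# the webservice directory as its working directory, so its relative paths resolve.
# frog = start_in_dir([WEBSERVICE_DIR + "/startfrog.sh"])
```

The process keeps running in the background, much like appending '&' in the shell; call wait() on the returned object to block until it exits.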
- /usr/local/src/tscan/data/SoNaR500.wordfreqlist20000.freq and /usr/local/src/tscan/data/subtlex_words20000.freq were not world-readable, so: chmod +r
Ok, if this is something that cannot be fixed on the tscan side (wrong permissions in the data tarball download?) then I'll do some extra postprocessing on the LaMachine side to get this right.
It also seems that some of the start scripts need to be started from the current working directory
/usr/local/src/tscan/webservice/
or else the relative paths don't work.
Ah, I think that was one of my problems too then.
- /usr/local/src/tscan/data/SoNaR500.wordfreqlist20000.freq and /usr/local/src/tscan/data/subtlex_words20000.freq were not world-readable, so: chmod +r
Ok, if this is something that cannot be fixed on the tscan side (wrong permissions in the data tarball download?) then I'll do some extra postprocessing on the LaMachine side to get this right.
Excellent!
Ok, this is something to be handled upstream by the tscan maintainers, see proycon/tscan#16
Another update, might be useful for other users:
Wopr consumes 5-6GB per service, so that's why it crapped out on my 16GB machine (running the LaMachine docker container). I've shut them down and instructed users to perform t-scans without Wopr (which in this case provides sufficient data, YMMV of course).
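As a back-of-the-envelope check of those numbers, assuming the two Wopr services started earlier in this thread (startwopr20.sh and startwopr02.sh) and the 5-6GB per service I observed (the function name is just for illustration):

```python
def wopr_memory_estimate(services, per_service_gb=(5, 6), total_gb=16):
    """Estimate the memory range used by Wopr services and what is left over."""
    low = services * per_service_gb[0]
    high = services * per_service_gb[1]
    return {"used_gb": (low, high), "left_gb": (total_gb - high, total_gb - low)}


print(wopr_memory_estimate(2))
# {'used_gb': (10, 12), 'left_gb': (4, 6)}
```

That leaves roughly 4-6GB for everything else on the machine, including Alpino, Frog and the webserver.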
Hi @tjeerdhans, thanks for the feedback. Wopr indeed takes up a lot of memory; running T-Scan locally without it is advisable, as you only lose a small number of output fields. I will fix the permission issue, but this will unfortunately take some time, as I am no longer able to access the data server...
A fresh Ubuntu 18.04
Install LaMachine (2.9.0) in default config. Installation ends correctly; I can start the webserver and run e.g. Frog from the portal. lamachine-LMPB-install2019-07-01.log
Add Alpino (lamachine-add alpino, lamachine-update). Update ends correctly; I can start the webserver and run Alpino from the portal. lamachine-LMPB-20190701_114950.log
Add tscan (lamachine-add tscan, lamachine-update). Update ends correctly; I can start the webserver, and an entry for T-scan (http://127.0.0.1/tscan) is visible in the portal. lamachine-LMPB-20190701_121444.log However, activating T-scan results in a 404 (first time) or a 502 error (after a restart of the PC).
In the update log I notice lots of deprecation warnings when installing T-scan (but these were also present for Alpino): [DEPRECATION WARNING]: evaluating webserver as a bare variable, this behaviour will go away and you might need to add |bool to the expression in the future. Also see CONDITIONAL_BARE_VARS configuration toggle.. This feature will be removed in version 2.12
In the uWSGI log I encounter repeatedly:
uwsgi.log
In the tscan.uwsgi.log I encounter, also repeatedly:
tscan.uwsgi.log
File indeed doesn't exist at the specified location, unlike other *.wsgi files.
Again, no particular urgency.