clarin-eric / ParlaMint

ParlaMint: Comparable Parliamentary Corpora

https://clarin-eric.github.io/ParlaMint/

UDPIPE not working #589

Closed rjzevallos closed 1 year ago

rjzevallos commented 1 year ago

Does anyone here use UDPipe? The service seems to be down. Y_Y

link service: http://lindat.mff.cuni.cz/services/udpipe/api/process

matyaskopp commented 1 year ago

I don't see any trouble. It works. You can always check the web frontend: https://lindat.mff.cuni.cz/services/udpipe/

If too many users use it, it can be overloaded, and the processing is done on CPUs which are much slower than GPUs.
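One way to tell "down" from "slow" without the web frontend is to ask the REST API for its model list. The following is a minimal Python sketch, not part of the ParlaMint tooling. It assumes the service exposes a `models` method next to `process` and that the response is JSON with a `models` object keyed by model name; both should be checked against the service documentation. The function names are made up for illustration.

```python
import json
import urllib.request

# Base URL derived from the process endpoint quoted in this thread.
# The /models path and the response layout are assumptions.
API_BASE = "https://lindat.mff.cuni.cz/services/udpipe/api"


def model_names(models_json: str) -> list:
    """Return the sorted model names found in a /models response body."""
    payload = json.loads(models_json)
    return sorted(payload.get("models", {}))


def fetch_models(timeout: float = 10.0) -> list:
    """Query the service; raises URLError or HTTPError if it cannot be reached."""
    with urllib.request.urlopen(f"{API_BASE}/models", timeout=timeout) as resp:
        return model_names(resp.read().decode("utf-8"))
```

If `fetch_models()` returns a list, the service is reachable. A timeout points to overload rather than an outage.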

matyaskopp commented 1 year ago

You can also use my perl script for annotations if you don't want to implement it yourself: https://github.com/ufal/ParCzech/tree/master/src/udpipe2

rjzevallos commented 1 year ago

When I run

'perl -I lib udpipe2/udpipe2.pl'

I get:

Can't locate XML/LibXML/PrettyPrint.pm in @INC (you may need to install the XML::LibXML::PrettyPrint module) (@INC contains: lib /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.30.0 /usr/local/share/perl/5.30.0 /usr/lib/x86_64-linux-gnu/perl5/5.30 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.30 /usr/share/perl/5.30 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base) at lib/ParCzech/PipeLine/FileManager.pm line 405.
BEGIN failed--compilation aborted at lib/ParCzech/PipeLine/FileManager.pm line 405.
Compilation failed in require at udpipe2/udpipe2.pl line 16.
BEGIN failed--compilation aborted at udpipe2/udpipe2.pl line 16.

matyaskopp commented 1 year ago

Yes, because you don't have the dependencies installed.

cpan XML::LibXML::PrettyPrint

Then try to run again and install the next missing dependency.

For XML::LibXML you will probably need to install a system dependency first (libxml2-dev on Ubuntu-like systems).
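The run, install, run again loop can be shortened by pulling the module name out of Perl's error text. This is a small Python sketch with a hypothetical helper name. It relies only on the "Can't locate Foo/Bar.pm in @INC" wording shown in the error message earlier in this thread.

```python
import re
from typing import Optional

# Matches the "Can't locate Foo/Bar.pm in @INC" message that Perl prints
# when a module cannot be loaded.
_MISSING = re.compile(r"Can't locate (\S+)\.pm in @INC")


def missing_perl_module(stderr_text: str) -> Optional[str]:
    """Return the missing module as Foo::Bar, or None if none is reported."""
    match = _MISSING.search(stderr_text)
    if match is None:
        return None
    # Perl reports the file path; cpan expects the package name.
    return match.group(1).replace("/", "::")
```

Feed it the stderr output of `perl -I lib udpipe2/udpipe2.pl`, pass the returned name to `cpan`, and repeat until it returns `None`.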

rjzevallos commented 1 year ago

Done! Thank you so much. I would like to get the same result as this command, but using your code:

curl -F input=vertical -F data=@tokens_vertical.txt -F model=catalan-ancora-ud-2.10-220711 -F parser= http://lindat.mff.cuni.cz/services/udpipe/api/process > output_ud.json

I tried the following, but it does not do the same thing:

perl -I lib udpipe2/udpipe2.pl --colon2underscore \
  --model "catalan-ancora-ud-2.10-220711" \
  --elements "seg" \
  --debug \
  --try2continue-on-error \
  --filelist list_of_filenames2process.fl \
  --input-dir inDir \
  --output-dir outDir

matyaskopp commented 1 year ago

My code annotates TEI files, so the input is TEI files, and the output is annotated TEI files.
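For plain vertical input like the curl command above, calling the REST service directly is the closer fit. The following is a minimal Python sketch of that request, not taken from the ParCzech code. It assumes the service accepts a URL-encoded POST body as well as the multipart form that `curl -F` sends, and that the JSON response carries the annotation in a `result` field. The function names are hypothetical, and the endpoint is the one from this thread, written with https.

```python
import json
import urllib.parse
import urllib.request

# Endpoint quoted in this thread (https variant assumed to behave the same).
PROCESS_URL = "https://lindat.mff.cuni.cz/services/udpipe/api/process"


def build_form(tokens: str, model: str) -> bytes:
    """URL-encode the same fields that the curl command sends with -F."""
    fields = {
        "input": "vertical",  # one token per line, as in tokens_vertical.txt
        "data": tokens,
        "model": model,
        "parser": "",         # empty value, mirroring `-F parser=`
    }
    return urllib.parse.urlencode(fields).encode("utf-8")


def extract_result(response_body: str) -> str:
    """Pull the annotated text out of the JSON envelope (assumed 'result' key)."""
    return json.loads(response_body)["result"]


def annotate(tokens: str, model: str, timeout: float = 60.0) -> str:
    """POST the tokens to the service and return the annotated text."""
    request = urllib.request.Request(PROCESS_URL, data=build_form(tokens, model))
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return extract_result(resp.read().decode("utf-8"))
```

Calling `annotate(open("tokens_vertical.txt", encoding="utf-8").read(), "catalan-ancora-ud-2.10-220711")` would return the annotated text itself rather than the JSON wrapper that the curl command writes to output_ud.json.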

BTW, why are we discussing this? We already have an ES-CT sample: https://github.com/clarin-eric/ParlaMint/tree/data/Data/ParlaMint-ES-CT so I guess you have a working pipeline...

TomazErjavec commented 1 year ago

> My code annotates TEI files, so the input is TEI files, and the output is annotated TEI files.

But in a very general sense it would be great if anybody (for a certain value of anybody) could make a TEI file and use this script to produce a .TEI.ana file without needing their own pipeline. It would take "only" writing a section in contributing?

TomazErjavec commented 1 year ago

My suggestion probably won't happen, so I am closing this issue. If anyone disagrees, pls. reopen.