Closed jbrry closed 6 years ago
I think you should direct this to the UDPipe repo. This repo is just a copy of the models that makes them easy to use with the R package. Your problem looks like the udpipe executable is not in the directory from which you call train_all.sh. Are you planning to build new UDPipe models? If so, and you plan to use R, you can also use the udpipe R package. Example training scripts for R are available at https://github.com/bnosac/udpipe.models.ud. If you plan to use the executable from UDPipe, just proceed as you are doing and try to fix your path problem.
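To narrow down the path problem, a quick check like the one below may help. It is only a sketch: `locate_tool` is a hypothetical helper name, and it assumes the script wants either an executable file in the directory it is run from or one reachable through `$PATH`. What train_all.sh actually tests for is not shown here, so compare the result against the script itself.

```shell
#!/bin/sh
# Hypothetical helper: report where an executable named "$1" can be found.
# It looks in the directory "$2" first (default: the current directory),
# then falls back to a PATH lookup.
locate_tool() {
    tool=$1
    dir=${2:-.}
    if [ -x "$dir/$tool" ] && [ ! -d "$dir/$tool" ]; then
        # An executable regular file sits in the given directory.
        echo "local: $dir/$tool"
    elif command -v "$tool" >/dev/null 2>&1; then
        # Not in the directory, but the shell can resolve it via PATH.
        echo "path: $(command -v "$tool")"
    else
        echo "missing: $tool"
        return 1
    fi
}

# Run this from the directory where train_all.sh is called.
locate_tool udpipe || true
```

If this reports `path:` rather than `local:`, the binary is installed but not present next to the script. In that case copying or symlinking it into the script's directory is one thing to try.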
Thank you very much for your response and please excuse me for opening an issue on this repo as it's not related to the R package!
I think the same and will try to fix this path problem. At the moment, I'm trying to run the UDPipe models as a baseline to compare parsing accuracy using POS tags generated by UDPipe vs POS tags from a different POS tagger. I was using the models in the reproducible training file as practice to see how the whole pipeline works. Thank you for the suggestion to use the R package! Right now, I have no problems running UDPipe models on a single language, but I am trying to find a way to run the UDPipe models on all languages and on the train/dev sets. The reproducible training file seemed like a good place to start because it includes the bash scripts! I should be able to see how the whole system works once I get the bash scripting/path issue figured out! Thanks
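For the all-languages case, a loop along these lines might be a starting point. It is a sketch under assumptions: `plan_tagging` is a hypothetical helper, and both the directory layout (`<data>/<lang>/<lang>-ud-train.conllu`, `<lang>-ud-dev.conllu`) and the model naming (`<models>/<lang>.udpipe`) are made up for illustration and would need adapting to the real files. It only prints the `udpipe --tag` commands, so the plan can be inspected before anything is run.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) one "udpipe --tag" command per
# treebank split. Assumed layout, for illustration only:
#   $1/<lang>/<lang>-ud-train.conllu and $1/<lang>/<lang>-ud-dev.conllu
#   $2/<lang>.udpipe
plan_tagging() {
    data_dir=$1
    model_dir=$2
    for tb in "$data_dir"/*/; do
        # Skip the unexpanded pattern when there are no subdirectories.
        [ -d "$tb" ] || continue
        lang=$(basename "$tb")
        model="$model_dir/$lang.udpipe"
        for split in train dev; do
            file="$tb$lang-ud-$split.conllu"
            # Some treebanks may lack a split; skip missing files.
            [ -f "$file" ] || continue
            echo "udpipe --tag $model $file"
        done
    done
}
```

Calling `plan_tagging ud-treebanks models` would list the commands. Redirecting each command's output to its own file is left out here, and paths containing spaces are not handled.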
Which POS tagger are you comparing against? FYI, I recently compared UDPipe & spaCy here: https://github.com/jwijffels/udpipe-spacy-comparison
I was going to try the BiLSTM of Plank et al. (2016): https://github.com/bplank/bilstm-aux. I would like to try a system which uses token and character level representations, which might help improve accuracy for morphologically rich languages.
I noticed that in the last CoNLL shared task on multilingual parsing, the Stanford team used their own POS tagger, which had some improvements over the baseline system, so I was looking at exploring other methods to increase parsing accuracy. Thanks very much for the comparison - it seems that the UDPipe system is more powerful than the spaCy tagger on pretty much all tests. Hopefully, I will have both the Plank and UDPipe models run soon so I can get a clearer idea of the relative performance between the two, but from what I gather both seem to perform very well.
Thank you for the information. Feel free to let me know if you have something to share on the comparison with BiLSTM.
Thanks very much for your help and certainly, will do!
Hi there,
I am trying to run the reproducible models 2.0. I have downloaded the udpipe-ud-2.0-170801-reproducible_training.zip file and in the README.TXT it states:
I have no problem running the get.sh script in step 1). However, when I try to run train_all.sh it outputs "Missing udpipe". If I view the train_all.sh script, the code is as follows:
I assume the problem is that the UDPipe executable is not being picked up by my $PATH? I have used UDPipe before and copied the executable to /usr/local/bin and have no problems running UDPipe models from anywhere in my filesystem so it seems unusual that it would not be able to pick up UDPipe. I have also since added the executable to my ~/.bashrc file and I still am not able to run this script.
Apologies if I should be directing this question to the official UDPipe GitHub repo. I just thought that maybe you would be able to point out where my problem could be coming from, as it seems like a small issue.
Many thanks, James