matgrioni closed this issue 7 years ago
Thanks for your report. I am not sure where the error comes from, because the file you attached is not tokenized. RDRPOSTagger requires a tokenized (word-segmented) input file. Best, Dat.
Thank you for responding. I will try to tokenize the file as shown in /data,
as I had not noticed this requirement before. I will close the issue and re-open it if the problem persists after that.
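As a rough illustration of the tokenized input discussed above, here is a minimal sketch that splits punctuation from words and joins the tokens with single spaces. The function name `tokenize_line` and the regular expression are assumptions made for illustration; this is not RDRPOSTagger's own tokenizer, and a language-specific tokenizer may be more appropriate for real data.

```python
import re

# Illustrative pattern (an assumption, not part of RDRPOSTagger):
# either a run of word characters or a single punctuation mark.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenize_line(line):
    """Return the line with tokens separated by single spaces."""
    return " ".join(TOKEN_PATTERN.findall(line))


# Example: tokenize_line("Quid est, Catilina?") gives "Quid est , Catilina ?"
```

Applying this to each line of a raw text file and writing the results out would give one whitespace-separated sentence per line.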
I'm getting the same error:
=> Read a POS tagging model from /home/flavio/Documenti/POS/RDRPOSTagger/Models/UniPOS/UD_Latin-ITTB23/la_ittb23-upos.RDR
=> Read a lexicon from /home/flavio/Documenti/POS/RDRPOSTagger/Models/UniPOS/UD_Latin-ITTB23/la_ittb23-upos.DICT
=> Perform POS tagging on /home/flavio/Documenti/POS/Testi_Tabelle/De_divinatione/Cic_DeDiv_SentWord_Tokenized_corretto_detersum_orizzontale.txt
ERROR ==> "''"
There is probably an error in the file I used for training, since other models have no problem with the same file. But I cannot identify it, since the file seems to follow all the requirements.
For training: latin_ittb-ud23_train_orizzontale.txt
To tag: Cic_DeDiv_SentWord_Tokenized_corretto_detersum_orizzontale.txt
You can either:
1) Fix this error by simply adding: '' PUNCT
as a new line in the la_ittb23-upos.DICT file.
2) Or use the latest RDRPOSTagger, which I have just updated. It is only a minor update to the file InitialTagger.py
to handle this error, so you do not need to retrain any model.
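For the first option, a minimal sketch of adding the missing entry programmatically, assuming the .DICT file is plain text with one whitespace-separated "word tag" pair per line (as the suggested line `'' PUNCT` implies). The helper name `ensure_lexicon_entry` is hypothetical and not part of RDRPOSTagger.

```python
from pathlib import Path


def ensure_lexicon_entry(dict_path, word="''", tag="PUNCT"):
    """Append 'word tag' to the lexicon unless the word is already listed.

    Hypothetical helper; assumes one 'word tag' pair per line.
    Returns True if a line was added, False if the word was already present.
    """
    path = Path(dict_path)
    text = path.read_text(encoding="utf-8")
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == word:
            return False
    # Make sure the new entry starts on its own line.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with path.open("a", encoding="utf-8") as handle:
        handle.write(f"{prefix}{word} {tag}\n")
    return True
```

Called with the path to la_ittb23-upos.DICT, this would add the line once and do nothing on later runs. Backing up the lexicon file first is a sensible precaution.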
Now it works, thank you!
I am using the following command within
RDRPOSTagger/pSCRDRtagger
For some of the files I run it on, it works as expected. For others, such as the one attached, there is an error output as follows:
I'm not sure where this error is coming from, as the message is blank. However, this problem does not occur with the Java implementation, so:
works for the same file.
Alexander_Severus.txt