Hi! Thanks for the kind words, and sorry for the late reply. Yes, these spaces are stray artifacts; thank you for noticing, I have removed them. As for the analysis of "спасибо", this is a limitation of the udpipe model. I think it tags most nouns that start with a capital letter as PROPN.
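For anyone reading this thread later, here is a minimal sketch of why the trailing space matters. It is not the code from rus_preprocessing_udpipe.py: the `tag` helper, the `vocab` dictionary, and the example tokens are all hypothetical, and it assumes the tagged tokens are later looked up as exact lemma_POS keys.

```python
# Hypothetical vocabulary keyed by lemma_POS (an assumption for illustration only).
vocab = {"москва_PROPN": 0, "спасибо_NOUN": 1}


def tag(lemma: str, pos: str, trailing_space: bool = False) -> str:
    """Join a lemma and its POS tag; optionally reproduce the stray trailing space."""
    suffix = "_" + pos + (" " if trailing_space else "")
    return lemma + suffix


buggy = tag("москва", "PROPN", trailing_space=True)  # built with '_PROPN '
fixed = tag("москва", "PROPN")                        # built with '_PROPN'

# repr() makes the trailing space visible in the printed token.
print(repr(buggy), buggy in vocab)
print(repr(fixed), fixed in vocab)

# Stripping whitespace makes the two forms identical.
print(buggy.strip() == fixed)
```

Under these assumptions, the token with the extra space fails an exact-match lookup while the corrected one succeeds, so removing the space at the source is simpler than stripping tokens downstream.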
Hi! Thank you for the great tool!
I have found some strange PROPN handling in the preprocessing. In https://github.com/akutuzov/webvectors/blob/master/preprocessing/rus_preprocessing_udpipe.py#L182 and below, the tag has an additional space character at the end. Maybe it shouldn't have the additional space and should look like + '_PROPN' rather than + '_PROPN ' (with a trailing space)?
Also, udpipe returns a strange result for the word "Спасибо":