Hi @Spaskich
Thank you for letting us know about this issue. We are going to start training 3.0 models for these languages. Greek, Slovak and Russian should already be supported in 3.0.
The missing languages are: Czech, Finnish, Hindi, Indonesian, Portuguese, Slovenian, Swedish and Turkish.
They will be trained one by one, and I will update this issue as soon as we publish the models.
Regarding 1.1-3.0 compatibility: the 3.0 version of NLPCube is incompatible with 1.1, mainly because we changed the underlying ML framework due to lack of support.
A quick update: Czech and Finnish should be uploaded and working.
Hindi is also finished and I just pushed it to the model repository
Indonesian is finished and uploaded.
Portuguese is pushed.
Slovenian is pushed.
Swedish is pushed.
@Spaskich - I've just pushed the final language (Turkish). Let me know if you have any issues with the 3.0 models. At first glance, there should be a huge boost in accuracy for the newly added models.
If everything is ok, give me the green light to close the issue.
I tested all the languages. Everything works well except for Czech: it can't find the model. Is the language code the same (cs)?
On an unrelated note, is the pip repository updated with the newest version? When I try to run the example code from the readme, I get the following error:
Traceback (most recent call last):
  File ".../main.py", line 1, in <module>
    from cube.api import Cube # import the Cube object
  File "...\Python\Python37-32\lib\site-packages\cube\__init__.py", line 1, in <module>
    from api import *
ModuleNotFoundError: No module named 'api'
I've tried replacing "from api" with "from cube.api" in the imported library, but then I get errors for multiple missing packages: requests, urllib2 and StringIO, to name a few.
Yes, it should have the same name (cs). Maybe you need to clear the cache: rm -rf ~/.nlpcube/3.0/cs*
Yes, the pip package has the latest version. How did you test the other languages if you are getting that error? Was it a local installation?
I tested them by modifying and running the whole project.
Czech had a packaging issue. It is now fixed and pushed, but you will have to clear the cache: rm -rf ~/.nlpcube/3.0/cs*
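Since the traceback earlier in the thread shows a Windows path, where rm -rf is usually not available, the same cleanup can be sketched in Python. This is only an illustration: the helper name clear_cached_models is hypothetical, and it assumes the cache lives under the home directory at .nlpcube/3.0 on every platform, as in the command above.

```python
import shutil
from pathlib import Path


def clear_cached_models(cache_dir, prefix):
    """Delete every file or directory in cache_dir whose name starts with prefix.

    Returns the names of the removed entries.
    """
    removed = []
    if not cache_dir.is_dir():
        return removed  # nothing has been cached yet
    for entry in sorted(cache_dir.glob(prefix + "*")):
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # cached model stored as a directory
        else:
            entry.unlink()  # cached model stored as a single file or a symlink
        removed.append(entry.name)
    return removed


if __name__ == "__main__":
    # Assumed cache location, mirroring ~/.nlpcube/3.0/cs* from the command above.
    cache = Path.home() / ".nlpcube" / "3.0"
    print(clear_cached_models(cache, "cs"))
```

Running it prints the names it removed, or an empty list if nothing matched. Like the shell glob cs*, it matches any entry whose name starts with "cs", so check the printed names if other models could share that prefix.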
Regarding the other issue, I don't know what is happening. Maybe there is a package conflict in your local environment. I just tried running NLPCube from scratch in Google Colaboratory, and it worked without issues. This is the link: https://colab.research.google.com/drive/16774lm4UcW_30REm0_60CXFshn8BH4L3?usp=sharing
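One way to check for a mix-up in the local environment is to ask Python where each module name would be imported from, without actually importing it. The sketch below is a generic diagnostic, not part of NLPCube; the helper name locate is hypothetical. Note that urllib2 and StringIO are Python 2 standard library module names that do not exist in Python 3, so if the installed cube package needs them, it may be an older or unrelated package that merely shares the name.

```python
import importlib.util
import sys


def locate(module_name):
    """Return where a top-level module would be imported from, or None if not found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    # origin is a file path for regular modules and None for namespace packages
    return spec.origin or "(namespace package)"


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "at", sys.executable)
    # "cube" is the package from the traceback; the rest are the names reported as missing.
    for name in ("cube", "requests", "urllib2", "StringIO"):
        print(name, "->", locate(name))
```

If the path printed for cube points somewhere unexpected, for example a different interpreter's site-packages, reinstalling into a fresh virtual environment is a reasonable next step.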
Okay, it's probably something on my end then, I'll look into it. Thanks for the help and the quick response with the models!
No problem. Glad to help.
I just noticed that CS has some issues with compound words. I will have to retrain the tokenizer. Sorry for this.
Done. Model is pushed. There is also a package update.
Great! Thanks a lot for the help!
Is your feature request related to a problem? Please describe.
I've been using the previous version of NLP-Cube for a wide array of languages, most of which are not present in 3.0.
Describe the solution you'd like
Updated language models for Czech, Finnish, Greek, Hindi, Indonesian, Portuguese, Russian, Slovak, Slovenian, Swedish, and Turkish.
Describe alternatives you've considered
Can the NLP-Cube 3.0 version work with the older v1.1 models? If so, do you plan on dropping support for them in the future?