argestes closed this issue 6 years ago
Hi,
This is absolutely possible. You can download pre-trained Turkish vectors from Facebook (the cc.tr.300 fastText vectors, which were trained on Common Crawl and Wikipedia) here: https://s3-us-west-1.amazonaws.com/fasttext-vectors/word-vectors-v2/cc.tr.300.vec.gz
Then unzip the .gz file to get a .vec file, and convert it to Magnitude using the instructions in the "File Format and Converter" section of the README.
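The unzip step can be sketched in Python as follows. This is a minimal sketch, not part of Magnitude itself: the helper name `gunzip` and the file names are assumptions for illustration, and the converter command in the trailing comment is the one given in the Magnitude README.

```python
import gzip
import shutil


def gunzip(src_path, dst_path):
    """Decompress a .gz file to dst_path, streaming so the whole file
    never has to fit in memory (the Turkish vectors are large)."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dst_path


# Example usage (file names are assumptions based on the download URL):
#   gunzip("cc.tr.300.vec.gz", "cc.tr.300.vec")
#
# Then convert the .vec file on the command line, per the README:
#   python -m pymagnitude.converter -i cc.tr.300.vec -o cc.tr.300.magnitude
```

Once converted, the file should load like any other Magnitude file, e.g. `Magnitude("cc.tr.300.magnitude")`.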
If you want to train your own Turkish models you can use the tutorial found here for Gensim and then convert that resulting file to Magnitude as well.
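For training your own model, a rough sketch might look like the following. It assumes Gensim 4.x (where the dimensionality parameter is named `vector_size`); the helper names, the hyperparameters, and the output file name are all illustrative assumptions, not recommendations from the tutorial. It also lowercases with Turkish rules, since Python's default `str.lower()` maps "I" to "i" rather than "ı".

```python
import re


def turkish_lower(text):
    """Lowercase using Turkish casing: I -> ı and İ -> i."""
    return text.replace("I", "ı").replace("İ", "i").lower()


def tokenize(text):
    """Very simple tokenizer: Turkish-aware lowercase, then split on non-word characters."""
    return re.findall(r"\w+", turkish_lower(text))


def train_and_export(sentences, out_path):
    """Train a small Word2Vec model and save it in word2vec text format.

    `sentences` is a list of token lists. Assumes gensim >= 4.0 is installed.
    """
    from gensim.models import Word2Vec  # imported here so the helpers above work without gensim

    model = Word2Vec(sentences=sentences, vector_size=100, window=5,
                     min_count=1, workers=1)
    model.wv.save_word2vec_format(out_path, binary=False)
    return out_path


# Example usage (hypothetical corpus and file name):
#   corpus = [tokenize(line) for line in open("turkish_messages.txt", encoding="utf-8")]
#   train_and_export(corpus, "tr_vectors.txt")
# The resulting text file can then be passed to the Magnitude converter.
```

A real corpus would need far more text than a handful of messages, and `min_count=1` is only sensible for tiny experiments.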
Since Turkish appears to use an alphabet, the out-of-vocabulary lookups should still work in Turkish.
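To illustrate why an alphabetic script helps here: out-of-vocabulary lookups rely on character n-grams, so an unseen word that shares many character sequences with known words can still get a sensible vector. The sketch below is a simplified illustration of that idea only, not Magnitude's actual implementation; the function names and the n-gram range are assumptions.

```python
def char_ngrams(word, n_min=3, n_max=5):
    """Return the set of character n-grams of a word, with < and > as boundary markers."""
    padded = f"<{word}>"
    return {
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    }


def ngram_overlap(a, b):
    """Jaccard overlap between the character n-gram sets of two words."""
    grams_a, grams_b = char_ngrams(a), char_ngrams(b)
    union = grams_a | grams_b
    if not union:
        return 0.0
    return len(grams_a & grams_b) / len(union)


# "kitaplar" (books) and "kitaplık" (bookshelf) share the stem "kitap",
# so their n-gram sets overlap; "kitaplar" and "araba" (car) do not.
print(ngram_overlap("kitaplar", "kitaplık") > ngram_overlap("kitaplar", "araba"))
```

Because Turkish is agglutinative, many unseen word forms are a known stem plus suffixes, which is the case where this kind of subword overlap tends to help most.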
The instructions for using Magnitude with other languages are now documented in the README.
Hello. First of all thanks for your effort. This is a pretty impressive library.
I'm not very experienced in NLP, but I'm currently working on an NLP task that involves classifying text messages without labeled data. The project I'm working on needs to process Turkish sentences. Can I somehow use this library to train on Turkish documents? If so, can you provide an example or guide me through the process? Thanks.