Open giacbrd opened 7 years ago
@giacbrd I would love to do this. We need to reproduce the results for ShallowLearn, right? (for all the features that it supports as of now)
Hi, I was already writing a tutorial notebook for ShallowLearn (using Wikipedia data), something to post around the web. Now that there is an "official" tutorial, I would like to change mine to use their data and compare the results. So, thank you, but I plan to do this myself in the next few days!
@prakhar2b I am still working on this and I found some inconsistencies between the cython and python version of the algorithm. The cython version produces strange outputs sometimes, so I am working on fixing this. In short, the cython code is unstable!
@giacbrd Thanks for the update. Is there any inconsistency in the Cython file in Labeled w2v (PR#1153 in gensim) as well?
Working on labeled w2v is in the pipeline for July as part of my Google Summer of Code project with Gensim. If the Cython code is unstable, I will have to write the Cython code from scratch for my GSoC project, and I would appreciate it if you could guide me there too. Thanks :smile:
Yes, there is a bug with the "softmax" loss function, and the code in the PR is the same. I am going to fix this ASAP.
By "unstable" I meant "there's a bug"! The Cython code is mostly iterations over arrays, so it should be easy to refactor, but there may not be much to gain in terms of speed.
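One way to catch this kind of mismatch between a plain Python implementation and a loop-style port is to run both on the same inputs and compare the outputs. The following is only a minimal sketch of that idea, not ShallowLearn or gensim code: `softmax_reference`, `softmax_loop`, and `outputs_agree` are hypothetical names, and the loop version merely imitates the array-iteration style described above.

```python
import math


def softmax_reference(scores):
    """Hypothetical reference: numerically stable softmax over a list of floats."""
    m = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loop(scores):
    """Hypothetical port written as explicit loops over arrays."""
    n = len(scores)
    m = scores[0]
    for i in range(1, n):
        if scores[i] > m:
            m = scores[i]
    out = [0.0] * n
    total = 0.0
    for i in range(n):
        out[i] = math.exp(scores[i] - m)
        total += out[i]
    for i in range(n):
        out[i] /= total
    return out


def outputs_agree(a, b, tol=1e-9):
    """Return True if both outputs have the same length and match within tol."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=tol, abs_tol=tol) for x, y in zip(a, b)
    )


if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0]
    print(outputs_agree(softmax_reference(sample), softmax_loop(sample)))
```

Running the same comparison on large scores (for example values around 1000) also checks that neither version overflows.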
Thanks for the acknowledgment!
Just curious - has this bug been fixed? What impact would it have?
Hi, the method based on softmax outputs was not working properly. Actually, the problem was due to the parameter configuration. I changed a few things and the develop branch is updated (see https://github.com/giacbrd/ShallowLearn/compare/develop#diff-8e5218dd47d140f1c094b54c6f9d1290). I have not released the fix yet.
A fastText tutorial has been published: https://github.com/facebookresearch/fastText/blob/master/tutorials/supervised-learning.md
Do a Jupyter Notebook!