Hi, I'm trying to use GloVe as a classifier in a semi-supervised way. These changes also let the model learn sub-word level information.
I'm not good with C, so this code works but is ugly. It would be wonderful if you could clean it up.
By default, the following line shows the current data format for the supervised classifier:
__label__someclass there is a line of data of someclass
and the following line shows the current data format for learning sub-word information:
__combine__ manysubwordunits ___cinfo_many ___cinfo_sub ___cinfo_word ___cinfo_units
where the ___cinfo_ prefix is not strictly needed; it only distinguishes the sub-words from the words.
The current code is messy, so it would be better if these changes could be tidied up and merged into cooccur.c.
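To make the two line formats above concrete, here is a minimal sketch of how a token could be classified by its marker prefix. It is not the code from this patch: the names `token_kind`, `classify_token` and `token_payload` are hypothetical, and it assumes the token right after `__combine__` is the whole word and the `___cinfo_` tokens are its sub-word units.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token kinds; these names are not taken from the patch. */
typedef enum { TOK_WORD, TOK_LABEL, TOK_COMBINE, TOK_SUBWORD } token_kind;

/* Return 1 if s begins with prefix, 0 otherwise. */
static int has_prefix(const char *s, const char *prefix) {
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Classify one whitespace-delimited token by its marker prefix. */
token_kind classify_token(const char *tok) {
    assert(tok != NULL);
    if (has_prefix(tok, "__label__"))    return TOK_LABEL;   /* class label */
    if (strcmp(tok, "__combine__") == 0) return TOK_COMBINE; /* line marker */
    if (has_prefix(tok, "___cinfo_"))    return TOK_SUBWORD; /* sub-word unit */
    return TOK_WORD;                                         /* ordinary word */
}

/* Return the text after the marker prefix, or the token itself
 * when it carries no payload-bearing prefix. */
const char *token_payload(const char *tok) {
    switch (classify_token(tok)) {
    case TOK_LABEL:   return tok + strlen("__label__");
    case TOK_SUBWORD: return tok + strlen("___cinfo_");
    default:          return tok;
    }
}
```

With this sketch, `__label__someclass` is a label with payload `someclass`, and `___cinfo_sub` is a sub-word unit with payload `sub`. Keeping the prefix checks in one helper is only one possible way to make a merge into cooccur.c tidier.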