Hi @jwijffels, for multi-task learning, please refer to https://github.com/facebookresearch/StarSpace/issues/131. For ensemble models, we do not have the code available here. In our paper we did the following: train 10 models of dimension 300 and concatenate the embeddings to get a model of dimension 3000.
@ledw I saw #131, but that did not really explain the logic of the wordWeight, e.g. how does it train on both sentences and word embeddings at the same time? About ensemble models, yes, that is what I understood. I just wondered how you concatenated the embeddings using this framework, and whether you let the model train further after the concatenation?
@jwijffels apologies for the delay in replying. The logic around wordWeight is the following: it is the multiplier of the word-level loss in the overall loss function. In other words, the model optimizes wordWeight * (loss on words) in addition to the loss on sentences. The loss on words is calculated from predicting the current word from the words nearby. For the ensemble models, we do not carry on training after the concatenation.
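For intuition, here is a minimal sketch of that combined objective. This is an approximation of the logic described above, not StarSpace's actual C++ implementation; if I read the README correctly, the word-level task is switched on with `-trainWord 1` and weighted via `-wordWeight`:

```python
# Hedged sketch of the multi-task objective described above.
# Approximation for intuition only, not StarSpace's actual implementation.

def multitask_loss(sentence_loss, word_loss, word_weight=0.5):
    """Combine the sentence-level loss with the word-level loss.

    word_loss comes from predicting the current word from nearby words;
    word_weight plays the role of the wordWeight parameter.
    """
    return sentence_loss + word_weight * word_loss

# Example: sentence loss 0.8, word loss 0.4, wordWeight 0.5 -> 1.0
print(multitask_loss(0.8, 0.4, 0.5))
```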
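And for the ensemble recipe (10 models of dimension 300 concatenated into one of dimension 3000), a minimal sketch of the concatenation step. The file names `model_0.tsv` … `model_9.tsv` and `ensemble_model.tsv` are hypothetical; this assumes the plain TSV dump StarSpace writes, with one entity per line followed by its tab-separated embedding components:

```python
# Hedged sketch: concatenate embeddings from several StarSpace .tsv model
# dumps. Assumes each line is: entity \t v1 \t v2 ... \t vd.

def load_tsv(path):
    vecs = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            vecs[parts[0]] = [float(x) for x in parts[1:]]
    return vecs

def concat_models(paths):
    models = [load_tsv(p) for p in paths]
    # keep only entities present in every model so dimensions stay aligned
    common = set(models[0]).intersection(*models[1:])
    return {key: [x for m in models for x in m[key]] for key in common}

if __name__ == "__main__":
    paths = [f"model_{i}.tsv" for i in range(10)]  # 10 runs, dim 300 each
    ensemble = concat_models(paths)                # dim 3000 per entity
    with open("ensemble_model.tsv", "w", encoding="utf-8") as out:
        for key, vec in ensemble.items():
            out.write(key + "\t" + "\t".join(map(str, vec)) + "\n")
```

As noted above, no further training happens after this step; the concatenated vectors are used as-is.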
Ok, that's clearer. Thank you for the clarification on the wordWeight!
I'm trying out different StarSpace models on some data which I have locally, and I'm interested to see if you have examples showing where you applied