Hi @jwijffels, for multi-task learning, please refer to https://github.com/facebookresearch/StarSpace/issues/131. For ensemble models, we do not have the code available here. In our paper we did the following: train 10 models of dimension 300 and concatenate the embeddings to get a model of dimension 3000.
@ledw I saw #131, but that did not really explain the logic of the wordWeight, e.g. how does it train on both sentences and word embeddings at the same time? About ensemble models, yes, that is what I understood. I just wondered how you concatenated the embeddings using this framework, and whether you let the model train further after the concatenation?
@jwijffels apologies for the delay in replying. The logic around wordWeight is the following: it is the multiplier of the word-level loss in the overall loss function. In other words, the model optimizes wordWeight * (loss on words) in addition to the loss on sentences. The loss on words is calculated from predicting the current word from the words nearby. For the ensemble models, we do not carry on training after the concatenation.
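For intuition, here is a minimal sketch of that combined objective. This is an approximation of the logic described above, not StarSpace's actual C++ implementation; if I read the README correctly, the word-level task is switched on with `-trainWord 1` and weighted via `-wordWeight`:

```python
# Hedged sketch of the multi-task objective described above.
# Approximation for intuition only, not StarSpace's actual implementation.

def multitask_loss(sentence_loss, word_loss, word_weight=0.5):
    """Combine the sentence-level loss with the word-level loss.

    word_loss comes from predicting the current word from nearby words;
    word_weight plays the role of the wordWeight parameter.
    """
    return sentence_loss + word_weight * word_loss

# Example: sentence loss 0.8, word loss 0.4, wordWeight 0.5 -> 1.0
print(multitask_loss(0.8, 0.4, 0.5))
```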
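And for the ensemble recipe (10 models of dimension 300 concatenated into one of dimension 3000), a minimal sketch of the concatenation step. The file names `model_0.tsv` … `model_9.tsv` and `ensemble_model.tsv` are hypothetical; this assumes the plain TSV dump StarSpace writes, with one entity per line followed by its tab-separated embedding components:

```python
# Hedged sketch: concatenate embeddings from several StarSpace .tsv model
# dumps. Assumes each line is: entity \t v1 \t v2 ... \t vd.

def load_tsv(path):
    vecs = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            vecs[parts[0]] = [float(x) for x in parts[1:]]
    return vecs

def concat_models(paths):
    models = [load_tsv(p) for p in paths]
    # keep only entities present in every model so dimensions stay aligned
    common = set(models[0]).intersection(*models[1:])
    return {key: [x for m in models for x in m[key]] for key in common}

if __name__ == "__main__":
    paths = [f"model_{i}.tsv" for i in range(10)]  # 10 runs, dim 300 each
    ensemble = concat_models(paths)                # dim 3000 per entity
    with open("ensemble_model.tsv", "w", encoding="utf-8") as out:
        for key, vec in ensemble.items():
            out.write(key + "\t" + "\t".join(map(str, vec)) + "\n")
```

As noted above, no further training happens after this step; the concatenated vectors are used as-is.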
Ok, that's clearer. Thank you for the clarification on the wordWeight!
I'm trying out different StarSpace models on some data which I have locally, and I'm interested to see if you have examples showing where you applied