commonsense / conceptnet-numberbatch


Accuracy issues #51

Closed usmaann closed 6 years ago

usmaann commented 6 years ago

Hi

I tried the ConceptNet Numberbatch pre-trained embeddings on a CNN classification task and compared the results with GloVe and word2vec.

The word2vec and GloVe results are still better than the ConceptNet embeddings. I was expecting better accuracy from Numberbatch.

Any advice? Am I doing anything wrong?

rspeer commented 6 years ago

Well, I can't guarantee our results will always be better, but a couple of things to check:

Also, the recent results I've seen where common-sense background knowledge performed well used it actively as part of the training process, not just in pre-training (where you'd hope that your function would find a local minimum for your training data regardless of where it started).

usmaann commented 6 years ago

Hi, thanks for your quick feedback.

I have also tried using ConceptNet Numberbatch in the training process, but the results are almost the same as with GloVe and word2vec.

Is my expectation wrong? I expected that word embeddings combined with a knowledge base (ConceptNet Numberbatch) would give better accuracy than word embeddings alone, such as GloVe or word2vec.

rspeer commented 6 years ago

There are definitely published results indicating that the information in ConceptNet provides the best results on some tasks. Recent ones include:

However: distributional models have gotten better and more sophisticated since Numberbatch. It is probably no longer as simple as dropping in Numberbatch as a replacement, especially if the vocabulary we distribute is wrong for your task.
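One way to test the vocabulary point is to measure how many of the task's tokens have a Numberbatch vector at all. The sketch below is only an illustration: it assumes the embeddings are in a word2vec-style text file (an optional "count dim" header, then one "term v1 v2 ..." line per term) and that the task tokens are normalized the same way as the terms in that file. The names `load_vocab` and `coverage` are hypothetical, not part of any ConceptNet tooling.

```python
from collections import Counter


def load_vocab(lines):
    """Collect the terms from word2vec-style text lines ("term v1 v2 ...")."""
    vocab = set()
    for i, line in enumerate(lines):
        parts = line.rstrip().split(" ")
        # Assumption: a first line with exactly two fields is the "count dim" header.
        if i == 0 and len(parts) == 2:
            continue
        if parts[0]:
            vocab.add(parts[0])
    return vocab


def coverage(tokens, vocab):
    """Return (fraction of token occurrences found in vocab, missing tokens by frequency)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    known = sum(c for t, c in counts.items() if t in vocab)
    missing = [t for t, _ in counts.most_common() if t not in vocab]
    return (known / total if total else 0.0), missing
```

If the returned fraction is low, or the most frequent missing tokens matter for the labels, the mismatch may come from tokenization or casing rather than from the embeddings themselves.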

It will probably require more sophistication to incorporate ConceptNet into current models as well. Sorry to hear that it's not helping for your task so far.

(I hope it's okay to respond just here, and not also on the mailing list.)

usmaann commented 6 years ago

Hi, thanks again for your detailed feedback.

Is there any approach you would advise to get the maximum benefit from Numberbatch (ConceptNet)?

Currently I am using Numberbatch as a pre-trained embedding, with a CNN model on top of it for sentence embedding, followed by a softmax layer, and then I calculate accuracy.
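For reference, the embedding step of a setup like this can be sketched as below. This is a minimal illustration under assumptions, not the exact code from this experiment: `word_index` (token to row index, with 0 reserved for padding) and `vectors` (token to list of floats) are hypothetical inputs, and words without a pre-trained vector get a small random vector.

```python
import random


def build_embedding_matrix(word_index, vectors, dim, seed=0):
    """Build one embedding row per task word from pre-trained vectors."""
    rng = random.Random(seed)
    size = max(word_index.values(), default=0) + 1
    # Row 0 is left as zeros, assuming index 0 is the padding token.
    matrix = [[0.0] * dim for _ in range(size)]
    missing = []
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is None:
            # Assumption: out-of-vocabulary words get a small random vector.
            vec = [rng.uniform(-0.05, 0.05) for _ in range(dim)]
            missing.append(word)
        matrix[idx] = list(vec)
    return matrix, missing
```

The resulting matrix would then initialize the embedding layer under the CNN, either frozen or left trainable. The `missing` list shows how many task words received no pre-trained vector.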