nasa-petal / PeTaL-labeller

The PeTaL labeler labels journal articles with biomimicry functions.
https://petal-labeller.readthedocs.io/en/latest/
The Unlicense

Does adding MeSH terms and/or MAG fields of study improve accuracy? #53

Closed (bruffridge closed this issue 3 years ago)

bruffridge commented 3 years ago

See this issue for how to add these metadata fields to the labeller.

https://github.com/yuzhimanhua/MATCH/issues/3#issuecomment-858646656

Try adding MeSH and MAG separately, then together, and compare the accuracy of each run against a baseline without these metadata fields.

elkong commented 3 years ago

Here are the results from my trials:

| Test set P@1 | without MeSH | with MeSH |
| --- | --- | --- |
| without MAG | 0.64 | 0.63 |
| with MAG | 0.61 | 0.67 |

The results are inconclusive: the differences between these trials are small and, I suspect, not statistically significant. Still, adding MeSH terms and MAG fields does not appear to hurt accuracy, nor does it hurt speed. Each trial took roughly 12 minutes to run 1000 epochs on the dataset of 1000 papers (800 training, 100 validation, 100 test). Here are the final validation-set precisions; validation precision tended to peak several hundred epochs in and then fall by the end.

| Validation set P@1 | without MeSH | with MeSH |
| --- | --- | --- |
| without MAG | 0.63 | 0.57 |
| with MAG | 0.61 | 0.61 |
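
As a rough check on whether the test-set differences could be statistically significant, here is a minimal sketch that treats each P@1 value as a binomial proportion over the 100 test papers and computes a normal-approximation 95% interval. The function name `p_at_1_interval` and the configuration labels are illustrative, not part of the labeller, and the sketch assumes each test paper is an independent hit-or-miss trial.

```python
import math


def p_at_1_interval(p_hat, n, z=1.96):
    """Normal-approximation (Wald) interval for a proportion.

    p_hat: observed P@1, n: number of test papers,
    z: critical value (1.96 for a roughly 95% interval).
    """
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)  # binomial standard error
    return p_hat - z * se, p_hat + z * se


N_TEST = 100  # test-set size reported in the comment above

# Test-set P@1 values copied from the table above.
test_p_at_1 = {
    "no MeSH, no MAG": 0.64,
    "MeSH only": 0.63,
    "MAG only": 0.61,
    "MeSH + MAG": 0.67,
}

intervals = {name: p_at_1_interval(p, N_TEST) for name, p in test_p_at_1.items()}

for name, (lo, hi) in intervals.items():
    print(f"{name:>16}: P@1 = {test_p_at_1[name]:.2f}, "
          f"95% interval = [{lo:.3f}, {hi:.3f}]")
```

With only 100 test papers the half-width of each interval is about 0.09, which is larger than the biggest gap between any two trials (0.06), so all four intervals overlap. Because every trial is scored on the same test set, a paired test on per-paper outcomes would be a more sensitive comparison than these independent intervals.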

pjuangph commented 3 years ago

Eric, can you include a link to your commit? Just paste the commit SHA.

elkong commented 3 years ago

New results on this: see discussion in #68