nasa-petal / PeTaL-labeller

The PeTaL labeler labels journal articles with biomimicry functions.
https://petal-labeller.readthedocs.io/en/latest/
The Unlicense

Run Match on just level 1 labels #73

Closed: bruffridge closed this issue 3 years ago

bruffridge commented 3 years ago

Trim off labels with fewer than 10 examples, and make sure the most frequent label occurs no more than 100 times as often as the least frequent one.
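A minimal sketch of that filtering step, assuming each paper is a dict with a `level1` list of label names (the field name and data layout are assumptions, not the repo's actual schema):

```python
from collections import Counter

def trim_labels(papers, min_examples=10, max_ratio=100):
    """Drop labels with fewer than min_examples occurrences, then verify
    that the most frequent remaining label occurs at most max_ratio times
    as often as the least frequent one."""
    counts = Counter(label for paper in papers for label in paper["level1"])
    kept = {label for label, n in counts.items() if n >= min_examples}
    kept_counts = [counts[label] for label in kept]
    if max(kept_counts) > max_ratio * min(kept_counts):
        raise ValueError(f"Label frequencies differ by more than {max_ratio}x: {counts}")
    # Remove the trimmed labels from each paper's label list.
    return [{**paper, "level1": [l for l in paper["level1"] if l in kept]}
            for paper in papers]
```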

elkong commented 3 years ago

(also mentioned here in the README)

In the PeTaL taxonomy there are ten Level 1 labels. Their frequencies of occurrence in golden.json are plotted in the following graph.

*Figure: Level 1 labels and their frequency of occurrence in golden.json.*

The following are the precision and nDCG scores of MATCH on only the Level 1 labels. P@3, P@5, nDCG@3, and nDCG@5 are largely without meaning, because most papers have only one Level 1 label. The performance is roughly on par with MATCH's performance on the entire tree of labels; I am not sure why it is not higher.

| Train set options | P@1 = nDCG@1 | P@3 | P@5 | nDCG@3 | nDCG@5 |
| --- | --- | --- | --- | --- | --- |
| level1 | 0.621 ± 0.032 | 0.339 ± 0.025 | 0.239 ± 0.015 | 0.684 ± 0.030 | 0.732 ± 0.027 |

Below is a multilabel confusion matrix showing what each label tends to be classified as:

*Figure: Multilabel confusion matrix (MCM) for Level 1 labels in golden.json.*
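This is not necessarily how the plot above was generated, but one straightforward way to build a "what does label i get classified as" matrix, assuming `y_true` and `y_pred` are lists of integer label ids per paper (hypothetical names):

```python
import numpy as np

def crosslabel_confusion(y_true, y_pred, n_labels):
    """Rows: true labels; columns: predicted labels. Each paper contributes
    a count for every (true label, predicted label) pair, so row i shows
    what label i tends to be classified as."""
    mcm = np.zeros((n_labels, n_labels), dtype=int)
    for true_labels, pred_labels in zip(y_true, y_pred):
        for t in true_labels:
            for p in pred_labels:
                mcm[t, p] += 1
    return mcm

# Example: two papers with integer label ids.
y_true = [[0], [1, 2]]
y_pred = [[0], [1, 3]]
print(crosslabel_confusion(y_true, y_pred, n_labels=4))
```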

pjuangph commented 3 years ago

What does P@1 mean?

pjuangph commented 3 years ago

Can you also describe the stuff we talked about, namely how the authors of MATCH are not training from a pre-trained BERT? What other limitations do you see? These results are slightly better than the ones from Hugging Face. It would be nice to see a confusion matrix of the Hugging Face model (the old labeller) on golden.json for the Level 1 data.

elkong commented 3 years ago

@pjuangph Ooh yes, sorry, P@1 is "precision at top 1", or "how often is MATCH's top prediction a correct one?".
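To make the metrics concrete, here is a sketch of P@k and nDCG@k for a single paper, following the standard binary-relevance definitions used in extreme multilabel classification work like MATCH (the label names in the example are made up, not actual PeTaL labels):

```python
import math

def precision_at_k(ranked_preds, true_labels, k):
    """Fraction of the top-k predicted labels that are correct."""
    return sum(1 for label in ranked_preds[:k] if label in true_labels) / k

def ndcg_at_k(ranked_preds, true_labels, k):
    """DCG of the top-k predictions, normalised by the best achievable
    DCG given how many true labels the paper has."""
    dcg = sum(1 / math.log2(i + 2)
              for i, label in enumerate(ranked_preds[:k])
              if label in true_labels)
    ideal = sum(1 / math.log2(i + 2)
                for i in range(min(k, len(true_labels))))
    return dcg / ideal if ideal > 0 else 0.0

# A paper with one true label: P@1 can reach 1.0, but P@3 is capped at 1/3.
print(precision_at_k(["protect", "move", "sense"], {"protect"}, 1))  # 1.0
print(precision_at_k(["protect", "move", "sense"], {"protect"}, 3))  # 0.333...
print(ndcg_at_k(["protect", "move", "sense"], {"protect"}, 3))       # 1.0
```

Note how, with a single true label, P@3 can never exceed 1/3 no matter how good the ranking is, which is why the P@3 and P@5 columns above are "largely without meaning" here.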

And yes, I absolutely do intend to get to everything we talked about. I'll see if I can get the old labeller model to interface with golden.json; for now I'll add a note in the README saying that that's on my list of things to do. I intend to clean up my MATCH work and run a few sanity checks with it, just to make sure I'm not missing anything important, before I leave it for other exciting research avenues like the ones we discussed.