nasa-petal / PeTaL-labeller

The PeTaL labeler labels journal articles with biomimicry functions.
https://petal-labeller.readthedocs.io/en/latest/
The Unlicense

rerun ablation study #68

Closed bruffridge closed 3 years ago

bruffridge commented 3 years ago

Eric, can you rerun this ablation study, now that the MAG/MeSH bug is fixed? I want to see if we can just use MAG terms, or if we need MeSH as well. Using just MAG terms makes data collection easier.

| Train set options | P@1 = nDCG@1 | P@3 | P@5 | nDCG@3 | nDCG@5 |
|---|---|---|---|---|---|
| with_mag, with_mesh | 0.590 ± 0.040 | 0.457 ± 0.030 | 0.369 ± 0.025 | 0.495 ± 0.032 | 0.493 ± 0.035 |
| with_mag, no_mesh | 0.583 ± 0.032 | 0.477 ± 0.035 | 0.378 ± 0.029 | 0.508 ± 0.033 | 0.506 ± 0.036 |
| no_mag, with_mesh | 0.573 ± 0.056 | 0.455 ± 0.029 | 0.362 ± 0.034 | 0.488 ± 0.034 | 0.485 ± 0.040 |
| no_mag, no_mesh | 0.569 ± 0.036 | 0.475 ± 0.028 | 0.373 ± 0.026 | 0.504 ± 0.029 | 0.498 ± 0.030 |

elkong commented 3 years ago

I've just added a commit with my results for this issue. The results themselves are copied here as follows:

| MAG | MeSH | Everything Else | P@1 = nDCG@1 | P@3 | P@5 | nDCG@3 | nDCG@5 |
|---|---|---|---|---|---|---|---|
| no | no | no | 0.307 ± 0.047 | 0.221 ± 0.036 | 0.193 ± 0.012 | 0.239 ± 0.038 | 0.250 ± 0.026 |
| yes | no | no | 0.498 ± 0.034 | 0.393 ± 0.024 | 0.312 ± 0.029 | 0.420 ± 0.026 | 0.415 ± 0.028 |
| no | yes | no | 0.409 ± 0.067 | 0.313 ± 0.046 | 0.264 ± 0.029 | 0.338 ± 0.050 | 0.348 ± 0.043 |
| yes | yes | no | 0.533 ± 0.065 | 0.432 ± 0.045 | 0.345 ± 0.040 | 0.461 ± 0.044 | 0.455 ± 0.046 |
| no | no | yes | 0.582 ± 0.064 | 0.450 ± 0.047 | 0.343 ± 0.042 | 0.486 ± 0.048 | 0.471 ± 0.055 |
| yes | no | yes | 0.586 ± 0.104 | 0.443 ± 0.059 | 0.350 ± 0.045 | 0.482 ± 0.069 | 0.475 ± 0.063 |
| no | yes | yes | 0.571 ± 0.087 | 0.439 ± 0.064 | 0.344 ± 0.043 | 0.477 ± 0.069 | 0.468 ± 0.063 |
| yes | yes | yes | 0.591 ± 0.043 | 0.452 ± 0.036 | 0.359 ± 0.027 | 0.492 ± 0.036 | 0.487 ± 0.033 |

In a nutshell, this suggests that MAG fields of study alone give somewhat more information than MeSH terms alone (P@1 of 0.498 vs. 0.409 when everything else is left out), although both contribute positively to precision and nDCG.
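For readers unfamiliar with the P@k and nDCG@k columns in the tables above, here is a minimal sketch of how such metrics are commonly computed for multi-label ranking with binary relevance. It is not taken from the PeTaL-labeller code; the function names, the label names, and the choice to normalize nDCG by the best achievable ranking of the true labels are all assumptions for illustration. It also shows why the first column can be headed "P@1 = nDCG@1": with binary relevance and at least one true label, the two coincide.

```python
import math


def precision_at_k(ranked_labels, true_labels, k):
    """Fraction of the top-k predicted labels that are true labels."""
    hits = sum(1 for label in ranked_labels[:k] if label in true_labels)
    return hits / k


def ndcg_at_k(ranked_labels, true_labels, k):
    """nDCG@k with binary relevance (assumed definition, not the repo's)."""
    # Discounted gain: a hit at 0-based rank r contributes 1 / log2(r + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, label in enumerate(ranked_labels[:k])
        if label in true_labels
    )
    # Ideal DCG: all true labels (up to k of them) ranked first.
    ideal_hits = min(k, len(true_labels))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical prediction for one paper; label names are made up.
ranked = ["attach", "move", "protect", "sense", "store"]
truth = {"attach", "sense"}

p1 = precision_at_k(ranked, truth, 1)
n1 = ndcg_at_k(ranked, truth, 1)
p3 = precision_at_k(ranked, truth, 3)
n3 = ndcg_at_k(ranked, truth, 3)
print(p1, n1, round(p3, 3), round(n3, 3))
```

In this example the top prediction is correct, so both P@1 and nDCG@1 are 1.0, while P@3 drops to 1/3 because only one of the top three labels is a true label.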

Also note that the standard deviation is rather high (perhaps due to the size of our dataset): for the same config settings but different folds of the dataset, I sometimes get test precisions ranging from 0.44 to 0.72.
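To make the "mean ± standard deviation" entries concrete, here is a small sketch of how per-fold scores might be summarized. The fold scores below are hypothetical: only the endpoints 0.44 and 0.72 come from the comment above, and the middle values are invented for illustration. Whether the reported figures use the sample or the population standard deviation is not stated in this thread; the sketch assumes the sample version.

```python
import statistics


def summarize_folds(scores):
    """Return (mean, sample standard deviation) of per-fold scores."""
    mean = statistics.mean(scores)
    # stdev needs at least two points; treat a single fold as zero spread.
    std = statistics.stdev(scores) if len(scores) > 1 else 0.0
    return mean, std


# Hypothetical P@1 values from five folds of one configuration.
fold_scores = [0.44, 0.55, 0.60, 0.63, 0.72]

mean, std = summarize_folds(fold_scores)
print(f"{mean:.3f} ± {std:.3f}")
```

With a spread this wide across only a handful of folds, differences of one or two hundredths between configurations fall well inside one standard deviation, which is worth keeping in mind when comparing rows.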