dmlc / gluon-nlp

NLP made easy
https://nlp.gluon.ai/
Apache License 2.0
2.55k stars · 538 forks

[SCRIPT] update default arguments and add seed for scripts/generation/calculate_metrics.py #1514

Closed · szha closed this 3 years ago

szha commented 3 years ago

Description

Remove the num_samples argument from generation's calculate_metrics; the number of samples can be inferred from the sample file.
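As a sketch of how the count could be inferred directly from the sample file, assuming one sample per line (the function name and file format here are assumptions, not the script's actual implementation):

```python
def infer_num_samples(path):
    """Count non-blank lines in a sample file, making a --num_samples flag redundant."""
    with open(path) as f:
        return sum(1 for line in f if line.strip())
```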

Checklist

Essentials

Changes

Comments

cc @dmlc/gluon-nlp-team

github-actions[bot] commented 3 years ago

The documentation website for preview: http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR1514/e8760c80a5a9779d42259cda048a6f17e484cc35/index.html

codecov[bot] commented 3 years ago

Codecov Report

Merging #1514 (d5ece78) into master (302865c) will decrease coverage by 0.65%. The diff coverage is n/a.


@@            Coverage Diff             @@
##           master    #1514      +/-   ##
==========================================
- Coverage   86.53%   85.88%   -0.65%     
==========================================
  Files          55       55              
  Lines        7513     7396     -117     
==========================================
- Hits         6501     6352     -149     
- Misses       1012     1044      +32     
| Impacted Files | Coverage Δ |
|---|---|
| conftest.py | 76.31% <0.00%> (-9.94%) ↓ |
| src/gluonnlp/data/loading.py | 75.75% <0.00%> (-7.64%) ↓ |
| src/gluonnlp/utils/lazy_imports.py | 58.42% <0.00%> (-2.25%) ↓ |
| src/gluonnlp/utils/misc.py | 54.86% <0.00%> (-1.50%) ↓ |
| src/gluonnlp/data/tokenizers/spacy.py | 65.33% <0.00%> (-0.91%) ↓ |
| src/gluonnlp/data/tokenizers/huggingface.py | 71.34% <0.00%> (-0.49%) ↓ |
| src/gluonnlp/data/tokenizers/jieba.py | 73.13% <0.00%> (-0.40%) ↓ |
| src/gluonnlp/models/transformer_xl.py | 80.48% <0.00%> (-0.39%) ↓ |
| src/gluonnlp/models/xlmr.py | 87.87% <0.00%> (-0.36%) ↓ |
| src/gluonnlp/models/electra.py | 76.50% <0.00%> (-0.34%) ↓ |
| ... and 17 more | |


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 302865c...af455d6.

sxjscience commented 3 years ago

I think it was added because a straightforward implementation of self-BLEU has O(N²) complexity. It would be too slow to compute over a large number of samples.
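For context on where the O(N²) cost comes from: self-BLEU scores each of the N samples against the other N−1 as references. A toy sketch of that pairwise structure (using a simple unigram-precision stand-in for BLEU, not the actual metric in calculate_metrics.py):

```python
def unigram_precision(candidate, references):
    # Toy stand-in for BLEU: fraction of candidate tokens found in any reference.
    ref_tokens = {t for r in references for t in r.split()}
    cand = candidate.split()
    return sum(t in ref_tokens for t in cand) / max(len(cand), 1)

def self_bleu(samples):
    # O(N^2): each sample is scored against all N-1 others as references.
    scores = [unigram_precision(s, samples[:i] + samples[i + 1:])
              for i, s in enumerate(samples)]
    return sum(scores) / len(scores)
```

Subsampling the reference set caps the inner loop, which is presumably why num_samples existed in the first place.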

szha commented 3 years ago

@sxjscience thanks for the comment. I suspected performance was the reason. Still, randomly subsampling the reference set introduces variance in the scores across runs, which can be confusing. Should we switch to deterministic samples instead?

sxjscience commented 3 years ago

We may fix the seed then.

github-actions[bot] commented 3 years ago

The documentation website for preview: http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR1514/d5ece78380c899b7935baf1172152ae6f1b5e4b5/index.html

github-actions[bot] commented 3 years ago

The documentation website for preview: http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR1514/af455d65ef0a3945305600e3596742f35e07db6d/index.html