dmlc / gluon-nlp

NLP made easy
https://nlp.gluon.ai/
Apache License 2.0
2.55k stars 538 forks source link

[FEATURE]Horovod support for training transformer (PART 2) #1301

Closed hutao965 closed 4 years ago

hutao965 commented 4 years ago

Checklist

Essentials

Changes

codecov[bot] commented 4 years ago

Codecov Report

Merging #1301 into master will increase coverage by 0.31%. The diff coverage is 97.14%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1301      +/-   ##
==========================================
+ Coverage   84.14%   84.45%   +0.31%     
==========================================
  Files          42       42              
  Lines        6397     6422      +25     
==========================================
+ Hits         5383     5424      +41     
+ Misses       1014      998      -16     
Impacted Files Coverage Δ
src/gluonnlp/data/sampler.py 96.55% <97.14%> (+0.32%) :arrow_up:
src/gluonnlp/utils/misc.py 52.53% <0.00%> (+0.63%) :arrow_up:
src/gluonnlp/data/loading.py 83.39% <0.00%> (+5.28%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 32e87d4...db8afdb. Read the comment docs.

sxjscience commented 4 years ago

For the diff check failure, it might be caused by the fact that the exception is not triggered: https://github.com/dmlc/gluon-nlp/blob/32e87d4d4aa20a6eb658ee90d765ccffbd160571/src/gluonnlp/utils/misc.py#L562-L568

Thus, we may try to add a test-case which downloads an non-existing file and use pytest.assertRaises.

sxjscience commented 4 years ago

I find that the codecov is not very stable, especially the diff hit. @szha @leezu Would you know how could we improve it?

szha commented 4 years ago

assuming no race issue in result upload, this can be a result of non-determinism in tests, and the code path in the same test can vary.

if that's the case, we can mock the randomness to deterministically test each path. or if we still require the randomness, we can increase the trial number