Closed. sxjscience closed this pull request 3 years ago.
The documentation website for preview: http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR1399/improve_test/index.html
Merging #1399 into master will increase coverage by 2.96%. The diff coverage is n/a.
```diff
@@            Coverage Diff             @@
##           master    #1399      +/-   ##
==========================================
+ Coverage   82.17%   85.14%   +2.96%
==========================================
  Files          53       53
  Lines        6946     6946
==========================================
+ Hits         5708     5914     +206
+ Misses       1238     1032     -206
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/gluonnlp/data/loading.py | 78.11% <0.00%> (-3.02%) | :arrow_down: |
| src/gluonnlp/models/mobilebert.py | 87.85% <0.00%> (+0.25%) | :arrow_up: |
| src/gluonnlp/models/albert.py | 95.42% <0.00%> (+0.35%) | :arrow_up: |
| src/gluonnlp/models/transformer.py | 98.93% <0.00%> (+0.42%) | :arrow_up: |
| src/gluonnlp/attention_cell.py | 80.70% <0.00%> (+0.78%) | :arrow_up: |
| src/gluonnlp/data/tokenizers/yttm.py | 82.75% <0.00%> (+0.86%) | :arrow_up: |
| conftest.py | 86.25% <0.00%> (+1.25%) | :arrow_up: |
| src/gluonnlp/data/tokenizers/sentencepiece.py | 78.44% <0.00%> (+2.99%) | :arrow_up: |
| src/gluonnlp/models/bert.py | 94.80% <0.00%> (+7.26%) | :arrow_up: |
| src/gluonnlp/models/gpt2.py | 98.25% <0.00%> (+7.31%) | :arrow_up: |
| ... and 6 more | | |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8c8b0c9...e9f6475. Read the comment docs.
Fixing the seed is usually not a good strategy, as upstream changes can still alter the order of events and thus the results. If necessary, change the tolerance instead.
Actually, the tolerance has already been relaxed to 1E-1 and the test can still be flaky; that's why I fixed the seed. We will need to investigate why this happens in TVM. One possibility is numerical issues caused by operator fusion.
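For context, here is a minimal sketch of the two strategies under discussion. `mxnet_forward` and `tvm_forward` are hypothetical stand-ins for the reference and the TVM-compiled forward passes; the shapes and seed value are made up:

```python
import numpy as np
import numpy.testing as npt

def test_tvm_output_matches_mxnet():
    # Strategy 1 (used in this PR): fix the seed so the random inputs,
    # and therefore any flaky cases, are reproducible.
    np.random.seed(123)
    data = np.random.normal(size=(1, 128)).astype('float32')

    ref = mxnet_forward(data)  # hypothetical MXNet reference forward pass
    out = tvm_forward(data)    # hypothetical TVM-compiled forward pass

    # Strategy 2 (suggested above): compare with a loose tolerance instead.
    # Per the discussion, even 1E-1 was not always enough here.
    npt.assert_allclose(out, ref, rtol=1e-1, atol=1e-1)
```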
This sounds like a real issue. Let's proceed as you suggested: mark it as experimental and note down the known issue.
I think TVM can still be used and need not be marked as experimental. Basically, the issue only happens for the ALBERT network and is quite flaky. We need to spend some time investigating it. I've also informed @icemelon9 about it.
Please reconsider. We need to take the following into account:
Marking it as experimental is also valid (MXNet 2.0 might itself be experimental...). Nevertheless, we should encourage our users to try out TVM. According to the current benchmark, TVM can offer a 1.2x to 2.4x speed-up over HuggingFace + TorchScript for fp32 inference, so it is a nice feature to add to our package. We can later provide an example of how to convert GluonNLP models to TVM and run inference on different AWS instances, including C4, C5, G4, and P3 (a rough sketch follows below). In addition, inference latency can benefit substantially from new features like Ansor.
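As a rough illustration of what that future example might contain, not the planned example itself: the input name `data`, the shape `(1, 128)`, and the variable names below are assumptions, and the runtime module uses the TVM ~0.7 name `graph_runtime` (renamed to `graph_executor` in later releases):

```python
import tvm
from tvm import relay
from tvm.contrib import graph_runtime  # graph_executor in newer TVM

# `model` is assumed to be a hybridized Gluon block with a single input.
shape_dict = {'data': (1, 128)}
mod, params = relay.frontend.from_mxnet(model, shape_dict)

# Compile for CPU; 'llvm' would be swapped for a GPU target on G4/P3.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target='llvm', params=params)

# Run inference through the graph runtime.
rt = graph_runtime.GraphModule(lib['default'](tvm.cpu()))
rt.set_input('data', input_ids)  # input_ids: numpy array matching shape_dict
rt.run()
logits = rt.get_output(0).asnumpy()
```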
@szha Changed the README to point out that TVM integration is experimental. We may change it back once TVM 0.8 is ready.
Description
Fix the seed for the TVM test, which can be flaky.
cc @dmlc/gluon-nlp-team