dmlc / gluon-nlp

NLP made easy
https://nlp.gluon.ai/
Apache License 2.0
2.55k stars 538 forks source link

[Fix] Try to fix data pipeline scheduled test (attempt 2) #1483

Closed sxjscience closed 3 years ago

sxjscience commented 3 years ago

Description

It seems that the test is still failing so I'm submitting this PR.

Checklist

Essentials

Changes

Comments

cc @dmlc/gluon-nlp-team

codecov[bot] commented 3 years ago

Codecov Report

Merging #1483 (7cbf74b) into master (52da8ab) will decrease coverage by 0.20%. The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1483      +/-   ##
==========================================
- Coverage   85.87%   85.67%   -0.21%     
==========================================
  Files          52       52              
  Lines        6909     6909              
==========================================
- Hits         5933     5919      -14     
- Misses        976      990      +14     
Impacted Files Coverage Δ
src/gluonnlp/data/loading.py 78.11% <0.00%> (-5.29%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 52da8ab...7cbf74b. Read the comment docs.

github-actions[bot] commented 3 years ago

The documentation website for preview: http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR1483/fix_data_pipeline_test2/index.html

szha commented 3 years ago

What's the error? Is there a way to validate the fix short of committing?

sxjscience commented 3 years ago

https://github.com/dmlc/gluon-nlp/runs/1710545425?check_suite_focus=true

szha commented 3 years ago

The log doesn't seem to show where the encoding error happens so it's a bit unclear whether the patch is the right fix.

sxjscience commented 3 years ago

I'm also not sure. But currently do not have enough time to take a look. My guess is this python encoding issue.