fastnlp / fastNLP

fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
https://gitee.com/fastnlp/fastNLP
Apache License 2.0
3.07k stars, 448 forks

Flaky test fix #352

Closed: sleepy-owl closed this 3 years ago

sleepy-owl commented 3 years ago

Hi,

The test test_ConstantTokenNumSampler is flaky: it occasionally fails when sum(batch_x['seq_len']) equals exactly 120. This PR fixes that.

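To choose a fix, I collected the values of sum(batch_x['seq_len']) from several test runs and examined the tail of their distribution, computing extreme percentiles to see how high the values can get. The 99.99th percentile appears to converge to 120, which suggests that 120 is the largest value the sum reaches rather than one it exceeds. Changing the comparison operator in the assertion to '<=' therefore fixes the failure.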
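A minimal sketch of this kind of tail check, for illustration only. The sample values are hypothetical, not the measurements collected for this PR, and the names nearest_rank_percentile, check_strict and check_inclusive are assumptions made for the example rather than anything in fastNLP. It shows how a strict bound rejects a sum of exactly 120 while an inclusive bound accepts it.

```python
import math

MAX_TOKENS = 120  # bound mentioned in the PR description


def nearest_rank_percentile(samples, pct):
    """Return the nearest-rank percentile (0 < pct <= 100) of samples."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    # Nearest-rank method: smallest value with at least pct% of data at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical observations of sum(batch_x['seq_len']) across test runs.
observed_sums = [97, 104, 110, 113, 115, 117, 118, 119, 120, 120]

# Tail of the distribution: the high percentiles flatten out at the bound.
tail = {p: nearest_rank_percentile(observed_sums, p) for p in (50, 99, 99.99)}


def check_strict(total):
    """Strict comparison: a sum equal to the bound is rejected."""
    return total < MAX_TOKENS


def check_inclusive(total):
    """Inclusive comparison: a sum equal to the bound is accepted."""
    return total <= MAX_TOKENS


strict_failures = [s for s in observed_sums if not check_strict(s)]
inclusive_failures = [s for s in observed_sums if not check_inclusive(s)]

print(tail)
print("strict failures:", strict_failures)
print("inclusive failures:", inclusive_failures)
```

If the extreme percentiles level off at the bound and never pass it, the inclusive comparison matches the observed behavior. A percentile above the bound would instead point to a problem in the sampler itself.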

Does this make sense to you? Please let me know whether this looks good or if you have other suggestions. Note that I am assuming there are no bugs in the code under test, i.e. that a sum of exactly 120 is valid behavior.