The test `test_ConstantTokenNumSampler` fails intermittently when `sum(batch_x['seq_len'])` equals exactly 120. This PR fixes that.
To find a fix, I collected samples of this sum from several test executions and examined the tail of the distribution. I computed the extreme percentiles to see how high the values can go. The 99.99th percentile appears to converge to 120, which suggests that 120 is the largest value the sum reaches and that it can be hit exactly. Changing the comparison operator to `<=` resolves the failure.
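Here is a minimal sketch of the kind of tail analysis described above, not the exact script I ran. The helper names (`percentile`, `tail_percentiles`, `within_token_budget`) and the sample values are hypothetical and only illustrate the idea; the real samples came from repeated test executions.

```python
def percentile(sorted_vals, p):
    """Linear-interpolation percentile of a non-empty, pre-sorted list."""
    rank = (len(sorted_vals) - 1) * p / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = rank - lo
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * frac


def tail_percentiles(samples, percentiles=(99.0, 99.9, 99.99)):
    """Return {percentile: value} for the upper tail of the observed sums."""
    ordered = sorted(samples)
    return {p: percentile(ordered, p) for p in percentiles}


def within_token_budget(seq_lens, max_tokens=120):
    """Inclusive bound: a batch whose total equals max_tokens is accepted."""
    return sum(seq_lens) <= max_tokens


# Hypothetical values of sum(batch_x['seq_len']), not real measurements.
observed_sums = [112, 115, 118, 119, 120, 120, 117, 120]
tail = tail_percentiles(observed_sums)
```

With the inclusive comparison, a batch that sums to exactly 120 passes, while anything above 120 still fails.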
Does this make sense to you? Please let me know if this looks good or if you have any other suggestions. Note that I am assuming there are no bugs in the code under test.