Open bartekkuncer opened 2 years ago
Description
This change sorts chunks before running evaluation, which keeps padding to a minimum and thereby improves performance.
How the change works
Because every input feature has a unique qas_id, it can be used for sorting. With sorting, the evaluation function works as follows:
Step 1 is performed so that chunks and their inference results can easily be put back in the proper order in step 5, ready for evaluation in step 6.
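The general idea of sorting before inference and restoring the original order afterwards can be sketched as below. This is a minimal illustration and not the code from this change: `run_sorted_inference`, `infer_batch` and `count_real_tokens` are hypothetical names, the chunks are plain lists of token ids, chunks are assumed to be sorted by length, and positional indices stand in for qas_id as the key used to restore order.

```python
from typing import Callable, List, Sequence


def run_sorted_inference(
    chunks: Sequence[List[int]],
    infer_batch: Callable[[List[List[int]]], list],
    batch_size: int,
    pad_id: int = 0,
) -> list:
    """Run infer_batch over length-sorted batches; return results in input order."""
    # Indices of the chunks, ordered from shortest to longest chunk.
    # Keeping the indices lets us write each result back to its original slot.
    order = sorted(range(len(chunks)), key=lambda i: len(chunks[i]))
    results = [None] * len(chunks)
    for start in range(0, len(order), batch_size):
        batch_idx = order[start:start + batch_size]
        # Pad only up to the longest chunk in this batch.
        width = max(len(chunks[i]) for i in batch_idx)
        batch = [chunks[i] + [pad_id] * (width - len(chunks[i])) for i in batch_idx]
        # Put every result back at the position of the chunk it belongs to.
        for i, out in zip(batch_idx, infer_batch(batch)):
            results[i] = out
    return results


def count_real_tokens(batch: List[List[int]]) -> List[int]:
    """Hypothetical stand-in for a model: counts non-padding tokens per row."""
    return [sum(1 for tok in row if tok != 0) for row in batch]


chunks = [[1, 2, 3, 4, 5], [1], [1, 2, 3], [1, 2]]
results = run_sorted_inference(chunks, count_real_tokens, batch_size=2)
print(results)
```

Because results are written back by original index, downstream evaluation sees them in the same order as without sorting, which is why accuracy should be unaffected.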
Performance
Results for max_seq_length=128, doc_stride=32:
no sort:
sorted:
Performance did not improve much because most chunks have the same length of 128, a consequence of the relatively small values of max_seq_length and doc_stride.
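To see why uniform chunk lengths leave little to gain, the sketch below counts padding tokens with and without sorting. The helper `padding_tokens` and both length lists are made up for illustration; they are not measurements from this change, and the sketch assumes each batch is padded to its longest chunk.

```python
from typing import List


def padding_tokens(lengths: List[int], batch_size: int) -> int:
    """Total pad tokens when each batch is padded to its longest chunk."""
    total = 0
    for start in range(0, len(lengths), batch_size):
        batch = lengths[start:start + batch_size]
        total += sum(max(batch) - n for n in batch)
    return total


# Illustrative lengths only: nearly uniform chunks vs. widely varying chunks.
uniform = [128, 128, 128, 128, 128, 128, 100, 128]
varied = [512, 60, 480, 90, 300, 120, 512, 200]

for name, lengths in (("uniform", uniform), ("varied", varied)):
    unsorted_pad = padding_tokens(lengths, batch_size=4)
    sorted_pad = padding_tokens(sorted(lengths), batch_size=4)
    print(name, "no sort:", unsorted_pad, "sorted:", sorted_pad)
```

With nearly uniform lengths the two counts are the same, while with varied lengths sorting removes most of the padding.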
Results for max_seq_length=512, doc_stride=128 (default values in the run_squad.py script):
no sort:
sorted:
As you can see, performance improved significantly (~20%) without any loss of accuracy.
cc @dmlc/gluon-nlp-team