For a large dataset of about 10M QA pairs, would accuracy improve if we split the dataset by sentence length, trained a separate model on each split (possibly with different hyperparameters per model, e.g. RNN size and number of layers), and then decoded each input with the matching model?
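To make the idea concrete, here is a minimal sketch of the length-based splitting step (the bucket boundaries and the token-count heuristic are just illustrative assumptions, not part of any specific framework):

```python
from collections import defaultdict

def bucket_by_length(qa_pairs, boundaries=(10, 20, 40)):
    """Group (question, answer) pairs into buckets by question token count.

    Pairs longer than the last boundary go into a final overflow bucket,
    so each bucket could then be fed to a separately configured model.
    """
    buckets = defaultdict(list)
    for question, answer in qa_pairs:
        length = len(question.split())
        # pick the first boundary the question fits under; overflow otherwise
        bucket_id = next((i for i, b in enumerate(boundaries) if length <= b),
                         len(boundaries))
        buckets[bucket_id].append((question, answer))
    return buckets

pairs = [("how are you", "fine"),
         ("what is the capital of france and why is it famous", "paris")]
buckets = bucket_by_length(pairs, boundaries=(5, 15))
# the 3-token question lands in bucket 0, the 11-token one in bucket 1
```

At decode time the same length rule would route an incoming question to the model trained on its bucket.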
Any comments are welcome!