If you want a big batch during inference, you can export the model and use the end-to-end inference engine: https://github.com/bytedance/lightseq/tree/master/examples/inference/python
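For reference, a minimal sketch of what that looks like with the Python wrapper (the model file name, batch size, and token ids below are placeholders; you first need to export your fairseq checkpoint to a LightSeq model with the export scripts in that directory):

```python
import lightseq.inference as lsi

# "transformer.pb" is a placeholder for your exported LightSeq model;
# the second argument is the maximum batch size the engine will accept.
model = lsi.Transformer("transformer.pb", 128)

# A batch of source token id sequences (here a single example sentence).
src = [[63, 47, 65, 1507, 88, 74, 10, 2057, 362, 9, 284, 6, 2]]

# Runs end-to-end decoding and returns the generated token ids
# (see the example scripts for full pre/post-processing).
results = model.infer(src)
print(results)
```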
If you want a big batch during evaluation while training the model, you can instead directly reduce the evaluation batch size.
Thanks.
I am currently using lightseq in fairseq. However, inference fails if the `max_token` value is larger than the value set during training. Any suggestions to fix this? Thanks!