If you want a big batch during inference, you can export the model and use the end-to-end inference engine: https://github.com/bytedance/lightseq/tree/master/examples/inference/python
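For reference, a minimal sketch of what that looks like with the Python wrapper (the model file name, batch size, and token ids below are placeholders; you first need to export your fairseq checkpoint to a LightSeq model with the export scripts in that directory):

```python
import lightseq.inference as lsi

# "transformer.pb" is a placeholder for your exported LightSeq model;
# the second argument is the maximum batch size the engine will accept.
model = lsi.Transformer("transformer.pb", 128)

# A batch of source token id sequences (here a single example sentence).
src = [[63, 47, 65, 1507, 88, 74, 10, 2057, 362, 9, 284, 6, 2]]

# Runs end-to-end decoding and returns the generated token ids
# (see the example scripts for full pre/post-processing).
results = model.infer(src)
print(results)
```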
If you want a big batch during evaluation while training the model, you can instead directly reduce the evaluation batch size.
Thanks.
I am currently using lightseq in fairseq. However, inference fails if the `max_token` value is larger than the value set during training. Any suggestions to fix this? Thanks!