Hi,

It seems that the released code supports only a single input prompt during the inference stage. However, I would like to perform batch inference for faster generation. Should I modify `pipeline.py` to achieve this, or is there existing code for batch inference that I might have overlooked?

Best regards