Is batch inference supported?

shuxueslpi / chatGLM-6B-QLoRA

Uses the peft library to implement efficient 4-bit QLoRA fine-tuning of chatGLM-6B/chatGLM2-6B, then merges the LoRA model with the base model and quantizes the result to 4 bits.


ThreeStonesSL closed this issue 1 year ago

ThreeStonesSL commented 1 year ago

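The code seems to produce output for only one input at a time. Is batch inference possible, or is there any related code I could refer to?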

shuxueslpi commented 1 year ago

Batch inference should be supported; refer to something like this
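For readers looking for a starting point, here is a minimal sketch of one common way to batch prompts with Hugging Face `transformers`. It is not necessarily the approach the maintainer had in mind. The helper name `batch_generate` and its parameters are hypothetical, and it assumes the tokenizer supports left padding and the model exposes the standard `generate` method. Prompts are passed through as-is, so any chat template the model expects would need to be applied beforehand.

```python
from typing import List


def batch_generate(model, tokenizer, prompts: List[str],
                   max_new_tokens: int = 128) -> List[str]:
    """Hypothetical helper: generate one completion per prompt in one batch."""
    # Decoder-only models continue from the last position, so pad on the left
    # to keep the real tokens adjacent to the generated ones.
    tokenizer.padding_side = "left"

    # padding=True pads every prompt to the longest one in the batch.
    inputs = tokenizer(prompts, return_tensors="pt", padding=True)
    inputs = inputs.to(model.device)

    # Greedy decoding; the whole batch is generated in one call.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens,
                             do_sample=False)

    # With left padding every row has the same prompt length, so the
    # completion is whatever follows that many tokens.
    prompt_len = len(inputs["input_ids"][0])
    completions = [seq[prompt_len:] for seq in outputs]
    return tokenizer.batch_decode(completions, skip_special_tokens=True)


# Usage sketch (assumes a GPU and the THUDM/chatglm2-6b checkpoint;
# substitute your merged or quantized model path):
#
#   from transformers import AutoModel, AutoTokenizer
#   name = "THUDM/chatglm2-6b"
#   tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
#   model = AutoModel.from_pretrained(name, trust_remote_code=True).half().cuda()
#   model.eval()
#   print(batch_generate(model, tokenizer, ["Hello", "What is QLoRA?"]))
```

Larger batches trade memory for throughput, so the batch size and `max_new_tokens` usually need tuning to fit the GPU.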