huggingface / transformers-bloom-inference

Fast Inference Solutions for BLOOM

Short responses from BLOOM inference #70

Closed · raihan0824 closed 1 year ago

raihan0824 commented 1 year ago

Why is the model always generating fewer than 100 tokens, even though I instruct it to generate more than 100?

mayank31398 commented 1 year ago

How are you using it? There is a parameter you can set in the Makefile; it is set to 100 by default: https://github.com/huggingface/transformers-bloom-inference/blob/7bea3526d8270b4aeeefecc57d7d7d638e2bbe0e/inference_server/server.py#L36
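For anyone landing here later: once that server-side default is raised, a client can also ask for a longer generation explicitly per request. Below is a minimal sketch, assuming the inference server exposes an HTTP generate endpoint on localhost that accepts a `max_new_tokens` field in its JSON payload; the route, port, and key names are illustrative assumptions, so check `inference_server/server.py` for the exact ones.

```python
# Minimal sketch (illustrative, not from this thread): requesting a longer
# generation from the inference server. The route, port, and payload keys
# are assumptions; verify them against inference_server/server.py.
import requests

response = requests.post(
    "http://127.0.0.1:5000/generate/",  # assumed host, port, and route
    json={
        "text": ["Explain BLOOM in a few paragraphs:"],
        "max_new_tokens": 256,  # ask for more than the 100-token default
    },
    timeout=120,
)
print(response.json())
```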

raihan0824 commented 1 year ago

Well noted!

mayank31398 commented 1 year ago

Closing this.