hyperonym / basaran

Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.
MIT License
1.29k stars 81 forks

How to set my own parameters in model.generate() in basaran? #193

Open zoubaihan opened 1 year ago

zoubaihan commented 1 year ago

Hello, I want to use my own custom parameters when calling model.generate(), like this:

model.generate(input_ids, max_new_tokens=max_new_tokens,
               do_sample=True, max_length=max_length,
               temperature=temperature, top_p=top_p,
               repetition_penalty=repetition_penalty)

But if I use Basaran, the code looks like this:

model = load_model(model_name)
for choice in model(input_code):
    yield choice

There seems to be no place where I can set parameters like do_sample, max_length, top_p, and so on, the way I can when I call model.generate() directly, so I cannot set those parameters myself. How can I solve this?

peakji commented 1 year ago

Hi @zoubaihan, you can specify parameters such as top_p and max_tokens when calling the StreamModel instance obtained using the load_model function. However, we haven't implemented a streaming version for all parameters in HF Transformers yet, so parameters like do_sample are currently not supported.

Here's the full list of supported params: https://github.com/hyperonym/basaran#completions
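
For illustration, here is a minimal sketch of how model.generate()-style arguments could be filtered before they are passed to the StreamModel call. It is not part of Basaran. The helper names (split_generate_kwargs, stream_choices), the rename of max_new_tokens to max_tokens, and the set of supported names are all assumptions made for this example; only top_p and max_tokens are confirmed above, so check the linked list before relying on anything else.

```python
# Sketch only: this set is an assumption, not Basaran's authoritative list.
# Verify it against the supported-params list linked above.
ASSUMED_SUPPORTED = {"max_tokens", "temperature", "top_p"}

# Hypothetical rename from a model.generate() argument name to its
# completion-style counterpart.
ASSUMED_RENAMES = {"max_new_tokens": "max_tokens"}


def split_generate_kwargs(generate_kwargs):
    """Split model.generate()-style kwargs into (supported, unsupported) dicts."""
    supported, unsupported = {}, {}
    for name, value in generate_kwargs.items():
        target = ASSUMED_RENAMES.get(name, name)
        if target in ASSUMED_SUPPORTED:
            supported[target] = value
        else:
            unsupported[name] = value
    return supported, unsupported


def stream_choices(model, prompt, **generate_kwargs):
    """Call a StreamModel-like callable with only the supported parameters."""
    supported, unsupported = split_generate_kwargs(generate_kwargs)
    if unsupported:
        # Parameters such as do_sample end up here and are not forwarded.
        print("Ignoring unsupported parameters:", sorted(unsupported))
    yield from model(prompt, **supported)
```

With the model from the question, usage might look like `for choice in stream_choices(model, input_code, max_new_tokens=64, top_p=0.9, do_sample=True): ...`, where do_sample would be reported and dropped rather than forwarded.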

zoubaihan commented 1 year ago

OK, thank you. I hope that one day it will support all the parameters of model.generate()!