hyperonym / basaran

Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.
MIT License
1.29k stars, 81 forks

I want to use prefix_allowed_tokens_fn: which part of Basaran's source code should I modify? #220

Open zoubaihan opened 1 year ago

zoubaihan commented 1 year ago

Hello. In Hugging Face Transformers' original model.generate() method, we can set the function parameter prefix_allowed_tokens_fn to restrict which tokens may be generated at each step. I want to use this function in Basaran just as I do with the original model.generate(). Could you please tell me which part of the source code I should modify so that generation obeys my custom prefix_allowed_tokens_fn?

peakji commented 1 year ago

Generation related features can be implemented by modifying StreamModel.generate().

However, the original implementation from HF Transformers may require significant modifications to support streaming. This is also the main obstacle that prevents us from achieving feature parity...
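As a rough illustration of what such a modification might involve, here is a minimal, library-free sketch of applying a prefix_allowed_tokens_fn-style callback inside a token-by-token loop. It is not Basaran's actual code. The callback shape (batch_id, input_ids) -> list of allowed token ids mirrors the one Transformers expects, but `constrained_stream`, `mask_disallowed`, `next_logits`, and the toy model are hypothetical names introduced only for this example, and plain Python lists stand in for tensors.

```python
import math
from typing import Callable, Iterator, List, Sequence

# Same shape as the Transformers callback: (batch_id, input_ids) -> allowed token ids.
# Assumption: input_ids is a plain list here; Transformers passes a 1-D tensor.
PrefixFn = Callable[[int, Sequence[int]], List[int]]


def mask_disallowed(logits: Sequence[float], allowed: Sequence[int]) -> List[float]:
    """Set every logit outside `allowed` to -inf and keep the allowed ones unchanged."""
    allowed_set = set(allowed)
    return [x if i in allowed_set else -math.inf for i, x in enumerate(logits)]


def constrained_stream(
    next_logits: Callable[[Sequence[int]], Sequence[float]],  # hypothetical model step
    prompt_ids: Sequence[int],
    prefix_allowed_tokens_fn: PrefixFn,
    max_new_tokens: int,
    eos_token_id: int,
) -> Iterator[int]:
    """Greedy decoding loop that yields each token as soon as it is chosen."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = next_logits(ids)
        # batch_id is always 0 because this sketch handles a single sequence.
        allowed = prefix_allowed_tokens_fn(0, ids)
        masked = mask_disallowed(logits, allowed)
        # Greedy choice: index of the largest remaining logit.
        token = max(range(len(masked)), key=masked.__getitem__)
        ids.append(token)
        yield token
        if token == eos_token_id:
            break


def toy_logits(ids: Sequence[int]) -> List[float]:
    """Toy 'model' over a 4-token vocabulary that always prefers token 3."""
    return [0.0, 1.0, 2.0, 3.0]


def toy_prefix_fn(batch_id: int, ids: Sequence[int]) -> List[int]:
    """Allow tokens 0 and 2 at first, then only token 0 (used as EOS) once 3 ids exist."""
    return [0] if len(ids) >= 3 else [0, 2]


tokens = list(
    constrained_stream(
        toy_logits,
        prompt_ids=[1],
        prefix_allowed_tokens_fn=toy_prefix_fn,
        max_new_tokens=5,
        eos_token_id=0,
    )
)
print(tokens)
```

The point of the sketch is only that the constraint is applied to the logits at every step, before the next token is selected and streamed out. A real change would have to do the equivalent masking on tensors inside StreamModel.generate(), and would need to handle sampling, batching, and the case where the callback returns an empty list, all of which this example ignores.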