Open zoubaihan opened 1 year ago
Generation related features can be implemented by modifying StreamModel.generate().
However, the original implementation from HF Transformers may require significant modifications to support streaming. This is also the main obstacle that prevents us from achieving feature parity...
Hello, as we all know, the original `model.generate()` method in Hugging Face Transformers accepts a `prefix_allowed_tokens_fn` parameter that restricts which tokens may be generated at each step. I want to use this function in Basaran just as I did with the original `model.generate()`. Could you please tell me which part of the source code I should modify so that generation obeys my custom `prefix_allowed_tokens_fn`?
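To make the request concrete, here is a minimal, dependency-free sketch of what honoring `prefix_allowed_tokens_fn` inside a token-by-token streaming loop could look like. It is not Basaran's actual code: `mask_logits`, `greedy_stream`, `toy_next_logits` and `allow_fn` are hypothetical names, plain Python lists stand in for tensors, and greedy decoding is assumed. Only the callback shape `(batch_id, input_ids) -> list of allowed token ids` comes from Transformers.

```python
import math
from typing import Callable, Iterator, List, Sequence

# Same call shape as the Transformers hook: (batch_id, input_ids) -> allowed ids.
# input_ids is a plain list here (an assumption) instead of a torch.Tensor.
PrefixFn = Callable[[int, Sequence[int]], List[int]]


def mask_logits(
    logits: Sequence[float],
    batch_id: int,
    input_ids: Sequence[int],
    prefix_allowed_tokens_fn: PrefixFn,
) -> List[float]:
    """Set the logit of every token that is not allowed to -inf."""
    allowed = set(prefix_allowed_tokens_fn(batch_id, input_ids))
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]


def greedy_stream(
    next_logits: Callable[[Sequence[int]], Sequence[float]],
    prompt_ids: Sequence[int],
    prefix_allowed_tokens_fn: PrefixFn,
    max_new_tokens: int,
) -> Iterator[int]:
    """Hypothetical streaming loop: yield one constrained token per step."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = mask_logits(next_logits(ids), 0, ids, prefix_allowed_tokens_fn)
        token = max(range(len(logits)), key=logits.__getitem__)  # greedy pick
        ids.append(token)
        yield token


def toy_next_logits(ids: Sequence[int]) -> List[float]:
    """Toy stand-in for a model: vocabulary of 5, always prefers token 4."""
    return [0.0, 1.0, 2.0, 3.0, 4.0]


def allow_fn(batch_id: int, input_ids: Sequence[int]) -> List[int]:
    """Toy rule: after token 1 allow only {2, 3}; otherwise allow only {0, 1}."""
    return [2, 3] if input_ids[-1] == 1 else [0, 1]


print(list(greedy_stream(toy_next_logits, [0], allow_fn, 3)))  # [1, 3, 1]
```

The unconstrained toy model would always emit token 4; with the mask applied at every step, the stream follows the rule instead. In real code, Transformers also ships `PrefixConstrainedLogitsProcessor`, which applies the same kind of mask to score tensors. Whether `StreamModel.generate()` has a per-step hook where such a processor could be called is something I have not confirmed.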