Hello,
I am working on distilling language models for coding tasks and want to benchmark my distilled model when used as the assistant model for speculative decoding. Currently, there doesn’t seem to be an option to specify a custom assistant model.
I think we could add an assistant_model argument that lets users provide the path to their assistant model, which would then be passed through the gen_kwargs. I could make the change for my own local tests, but I was wondering whether this feature would be useful enough to integrate into the project.
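To make the proposal concrete, here is a minimal sketch of how the plumbing might look. The helper name `build_gen_kwargs` and its signature are illustrative assumptions, not existing project code; the only library-specific fact relied on is that Hugging Face transformers' `generate` accepts an `assistant_model` keyword for assisted (speculative) generation.

```python
# Hypothetical sketch: merge an optional assistant model into the
# generation kwargs. `build_gen_kwargs` is an illustrative name, not
# an existing function in the project.

def build_gen_kwargs(base_kwargs=None, assistant_model=None):
    """Return generation kwargs, optionally enabling assisted
    (speculative) decoding by attaching an assistant model."""
    gen_kwargs = dict(base_kwargs or {})
    if assistant_model is not None:
        # Hugging Face transformers' `generate` accepts an
        # `assistant_model` kwarg to enable assisted generation.
        gen_kwargs["assistant_model"] = assistant_model
    return gen_kwargs
```

In practice the value passed would be a model loaded from the user-supplied path (e.g. via `AutoModelForCausalLM.from_pretrained(path)` in transformers), so the CLI-facing change reduces to accepting the path and loading the model before generation.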