Closed NickNickGo closed 5 months ago
Yes, it is of interest. The tree-based decoding is already fully supported. The speculative streams and multi-stream attention layers should be possible to support, but I would need an actual model to test with. Not sure if they have released one yet
This issue was closed because it has been inactive for 14 days since being marked as stale.
This might be of interest :
https://huggingface.co/papers/2402.11131