Open sunchaesk opened 4 months ago
I was wondering what the best way would be to split FlexGen execution into separate prefill-only and decode-only runs.
How should I save the values from prefill, and how should I load them when I run FlexGen again for decode only?
Thanks