coreweave / ml-containers

MIT License

feat(vllm-tensorizer): Update `vllm-tensorizer` cloned repository, build with `vllm-flash-attn`, other optimizations #72

Open sangstar opened 2 months ago

sangstar commented 2 months ago

`vllm-tensorizer` hasn't been updated since vLLM formally adopted tensorizer model loading. This PR updates the build to target the most recent vLLM commit that includes sharded tensorizer support, and adds the fixes needed to build vLLM successfully against recent changes to its source code. These include: