coreweave / ml-containers

MIT License

build(torch-extras): Fix `xformers` & Limit Concurrency #48

Closed Eta0 closed 8 months ago

Eta0 commented 9 months ago

Fix xformers & Limit Concurrency on torch-extras Builds

This change updates flash-attention to v2.3.6 and removes the flash-attention v1.0.9 builds. xformers's bundled flash-attention build is disabled in favor of the manually built one, because xformers's own flash-attention build currently segfaults on CUDA 12.1.

This change also adds hard limits on the number of threads used during torch-extras compilation jobs to avoid excessive RAM use.
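Caps like this are commonly applied through the `MAX_JOBS` environment variable, which ninja-based `torch.utils.cpp_extension` builds (including flash-attention's) honor. As a sketch only (the per-job memory budget and the fallback value are assumptions, not taken from this PR), a build step might derive the cap from available RAM:

```shell
# Sketch: derive a compile-job cap from available RAM before building
# CUDA extensions such as flash-attention. The ~8 GiB-per-job budget
# is an assumed heuristic, not a value from this PR.
mem_kib=$(awk '/MemAvailable/ {print $2}' /proc/meminfo 2>/dev/null)
mem_kib=${mem_kib:-8388608}   # assumed fallback (8 GiB) if /proc is unreadable
max_jobs=$(( mem_kib / (8 * 1024 * 1024) ))
[ "$max_jobs" -lt 1 ] && max_jobs=1
export MAX_JOBS="$max_jobs"   # honored by ninja-based torch extension builds
echo "MAX_JOBS=$MAX_JOBS"
```

With `MAX_JOBS` exported, a subsequent `pip install flash-attn` compiles with at most that many parallel jobs, bounding peak RAM during the build.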

github-actions[bot] commented 9 months ago

@wbrown Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6962388870 Image: ghcr.io/coreweave/ml-containers/torch-extras:es-limit-concurrency-a628650-nccl-cuda11.8.0-nccl2.16.2-1-torch2.0.1-vision0.15.2-audio2.0.2-flash_attn1.0.9

github-actions[bot] commented 8 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/7132227698 Image: ghcr.io/coreweave/ml-containers/torch-extras:es-limit-concurrency-7ade069-nccl-cuda11.8.0-nccl2.16.2-1-torch2.0.1-vision0.15.2-audio2.0.2-flash_attn2.3.6

github-actions[bot] commented 8 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/7132227698 Image: ghcr.io/coreweave/ml-containers/torch-extras:es-limit-concurrency-7ade069-base-cuda12.1.1-torch2.0.1-vision0.15.2-audio2.0.2-flash_attn2.3.6

github-actions[bot] commented 8 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/7132227698 Image: ghcr.io/coreweave/ml-containers/torch-extras:es-limit-concurrency-7ade069-nccl-cuda12.0.1-nccl2.18.5-1-torch2.0.1-vision0.15.2-audio2.0.2-flash_attn2.3.6

github-actions[bot] commented 8 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/7132227698 Image: ghcr.io/coreweave/ml-containers/torch-extras:es-limit-concurrency-7ade069-nccl-cuda12.2.2-nccl2.18.5-1-torch2.0.1-vision0.15.2-audio2.0.2-flash_attn2.3.6

github-actions[bot] commented 8 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/7132227698 Image: ghcr.io/coreweave/ml-containers/torch-extras:es-limit-concurrency-7ade069-nccl-cuda12.1.1-nccl2.18.3-1-torch2.0.1-vision0.15.2-audio2.0.2-flash_attn2.3.6