coreweave / ml-containers

MIT License

ci(torch-nightly): Fix DeepSpeed compilation #41

Closed Eta0 closed 11 months ago

Eta0 commented 11 months ago

Fix DeepSpeed Compilation

This change disables compilation of DeepSpeed's AIO extension on torch v2.2 and up, since the extension is not currently compatible with the extension compilation requirements added in torch v2.1. The AIO extension had previously been disabled only for torch v2.1 in torch-extras/Dockerfile, but the latest nightly builds of torch have moved on to v2.2, and the issue is still present as of the latest DeepSpeed release.
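
The Dockerfile logic itself is not reproduced in this thread, so the following is only a rough sketch of the version gate described above, written in Python for illustration. It assumes the gate is expressed through DeepSpeed's `DS_BUILD_AIO` build flag and that the torch version arrives as a string such as `2.2.0a0`; the helper names `parse_major_minor` and `deepspeed_build_env` are hypothetical and do not come from the repository.

```python
import re


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string like '2.2.0a0'."""
    match = re.match(r"^(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized torch version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def deepspeed_build_env(torch_version: str) -> dict[str, str]:
    """Hypothetical helper: choose the AIO build flag for a torch version.

    Assumption: AIO is compiled only for torch releases older than v2.1,
    and is disabled for v2.1 and everything after it (including v2.2).
    """
    aio_supported = parse_major_minor(torch_version) < (2, 1)
    return {"DS_BUILD_AIO": "1" if aio_supported else "0"}


if __name__ == "__main__":
    for version in ("2.0.1", "2.1.0", "2.2.0a0"):
        print(version, deepspeed_build_env(version))
```

Comparing against a lower bound, rather than matching v2.1 exactly, is what keeps the gate from silently lapsing each time the nightly builds move to a new minor version.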

This also bumps the version of DeepSpeed from v0.10.1 to v0.10.3.

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch:es-extras-updates-c8ebbd7-nccl-2023.09.24.19-cuda12.0.1-nccl2.18.3-1-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch:es-extras-updates-c8ebbd7-nccl-2023.09.24.19-cuda11.8.0-nccl2.16.2-1-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch:es-extras-updates-c8ebbd7-base-2023.09.24.19-cuda12.0.1-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch:es-extras-updates-c8ebbd7-base-2023.09.24.19-cuda12.1.1-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch:es-extras-updates-c8ebbd7-base-2023.09.24.19-cuda11.8.0-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch:es-extras-updates-c8ebbd7-nccl-2023.09.24.19-cuda12.1.1-nccl2.18.3-1-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-extras-updates-c8ebbd7-nccl-2023.09.24.19-cuda12.0.1-nccl2.18.3-1-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0-flash_attn2.0.2

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-extras-updates-c8ebbd7-nccl-2023.09.24.19-cuda12.0.1-nccl2.18.3-1-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0-flash_attn1.0.9

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-extras-updates-c8ebbd7-nccl-2023.09.24.19-cuda11.8.0-nccl2.16.2-1-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0-flash_attn1.0.9

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-extras-updates-c8ebbd7-nccl-2023.09.24.19-cuda11.8.0-nccl2.16.2-1-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0-flash_attn2.0.2

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-extras-updates-c8ebbd7-base-2023.09.24.19-cuda12.0.1-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0-flash_attn2.0.2

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-extras-updates-c8ebbd7-base-2023.09.24.19-cuda12.0.1-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0-flash_attn1.0.9

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-extras-updates-c8ebbd7-base-2023.09.24.19-cuda12.1.1-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0-flash_attn1.0.9

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-extras-updates-c8ebbd7-base-2023.09.24.19-cuda12.1.1-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0-flash_attn2.0.2

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-extras-updates-c8ebbd7-base-2023.09.24.19-cuda11.8.0-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0-flash_attn2.0.2

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-extras-updates-c8ebbd7-base-2023.09.24.19-cuda11.8.0-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0-flash_attn1.0.9

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-extras-updates-c8ebbd7-nccl-2023.09.24.19-cuda12.1.1-nccl2.18.3-1-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0-flash_attn1.0.9

github-actions[bot] commented 11 months ago

@Eta0 Build complete, success: https://github.com/coreweave/ml-containers/actions/runs/6292057821 Image: ghcr.io/coreweave/ml-containers/nightly-torch-extras:es-extras-updates-c8ebbd7-nccl-2023.09.24.19-cuda12.1.1-nccl2.18.3-1-torch2.2.0a0-vision0.17.0a0-audio2.2.0a0-flash_attn2.0.2