coreweave / ml-containers

MIT License

feat(torch): Updates & Stability Fixes #67

Closed (Eta0 closed this 3 months ago)

Eta0 commented 3 months ago

Updates for PyTorch, Apex, DeepSpeed, Flash Attention, and nccl-tests Base Images; Fix for Triton in torch-nightly; and Caching Improvements

What a mouthful of a title!

Updates

This change contains the following library version updates:

Additionally, the torch:nccl build with Ubuntu 20.04 × CUDA 12.2.2 now uses an updated base image featuring NCCL v2.21.5-1.
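
For anyone who wants to sanity-check the NCCL version inside a built image, here is a minimal sketch. It is not part of this PR: the helper names `parse_nccl_version` and `installed_nccl_version` are hypothetical. It also assumes a recent PyTorch in which `torch.cuda.nccl.version()` returns a tuple (older releases returned a single integer). The value reported is whatever PyTorch itself sees, which may not be the same thing as the system NCCL package in the base image.

```python
import re


def parse_nccl_version(tag: str) -> tuple:
    """Parse a NCCL release tag such as 'v2.21.5-1' into (major, minor, patch).

    Hypothetical helper for illustration; the trailing '-1' package
    revision is accepted but ignored.
    """
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)(?:-\d+)?", tag.strip())
    if match is None:
        raise ValueError(f"unrecognized NCCL version tag: {tag!r}")
    return tuple(int(part) for part in match.groups())


def installed_nccl_version() -> tuple:
    """Return the NCCL version as reported by PyTorch.

    Assumption: a CUDA-enabled PyTorch build whose
    torch.cuda.nccl.version() returns a tuple.
    """
    import torch  # imported lazily so the parser above works without PyTorch

    return tuple(torch.cuda.nccl.version())


# Version named in this PR description.
EXPECTED = parse_nccl_version("v2.21.5-1")

if __name__ == "__main__":
    found = installed_nccl_version()
    # Compare only major/minor/patch; any extra fields are ignored.
    status = "OK" if found[:3] == EXPECTED else "MISMATCH"
    print(f"{status}: expected {EXPECTED}, PyTorch reports {found}")
```

Running the script inside the container compares the two versions; the parser can also be used on its own when PyTorch is not installed.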

Stability

Fixes

Performance