A graphical tool to simplify building and installing DeepSpeed on Windows systems. This tool automates the build process, manages dependencies, and provides clear guidance for CUDA setup. FYI, this started out as a patcher for earlier builds of DeepSpeed, but I reworked the code into a builder for DeepSpeed 0.15.x and later, and I'm too lazy to change the name.
NOTE: If you cannot use the tool for some reason, the manual instructions are at the bottom of this page, under Manual Builds of DeepSpeed 0.15.0 and later.
DeepSpeed wheels are environment-specific and must be built for your exact configuration. A built wheel is tied to:
Python Major Version
PyTorch Version
CUDA Versions (There are TWO different CUDA versions to consider):
a) PyTorch CUDA Version (reported by torch.version.cuda)
b) NVIDIA CUDA Toolkit Version
For example:
Environment A (where wheel was built):
- Python 3.11.5
- PyTorch 2.1.0+cu121 (CUDA 12.1)
- NVIDIA CUDA Toolkit 12.1
This wheel will ONLY work in environments with:
- Python 3.11.x (any patch release of 3.11)
- PyTorch 2.1.x (must match major.minor version)
- PyTorch built with CUDA 12.1
- NVIDIA CUDA Toolkit 12.1 or higher
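The matching rules in the example above can be sketched as a small check. This is only an illustration of the rules as stated here, not code from the tool; the names `Env` and `wheel_fits` are made up for this sketch.

```python
from typing import List, NamedTuple, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a string such as '3.11.5' or '2.1.0+cu121'."""
    core = version.split("+")[0]          # drop a local tag like '+cu121'
    major, minor = core.split(".")[:2]
    return int(major), int(minor)


class Env(NamedTuple):
    python: str        # e.g. "3.11.5"
    torch: str         # e.g. "2.1.0+cu121"
    torch_cuda: str    # value of torch.version.cuda, e.g. "12.1"
    toolkit_cuda: str  # version reported by nvcc, e.g. "12.1"


def wheel_fits(built: Env, target: Env) -> List[str]:
    """List the reasons a wheel built in `built` would not fit `target`."""
    problems = []
    if major_minor(built.python) != major_minor(target.python):
        problems.append("Python major.minor differs")
    if major_minor(built.torch) != major_minor(target.torch):
        problems.append("PyTorch major.minor differs")
    if major_minor(built.torch_cuda) != major_minor(target.torch_cuda):
        problems.append("PyTorch CUDA build differs")
    if major_minor(target.toolkit_cuda) < major_minor(built.toolkit_cuda):
        problems.append("CUDA Toolkit is older than the one used for the build")
    return problems
```

An empty list means the target environment satisfies all four rules; anything else names the rule that failed.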
Building DeepSpeed on Windows can be challenging due to specific requirements and environment setup needs. This tool:
NVIDIA GPU with CUDA Support
NVIDIA CUDA Toolkit
Visual Studio with C++ Build Tools
Python Environment
git clone https://github.com/erew123/deepspeedpatcher
⚠️ IMPORTANT: Complete these steps in order:
Install Visual Studio or Visual Studio Build Tools
Install NVIDIA CUDA Toolkit
Set Up Python Environment
python -c "import torch; print(f'PyTorch CUDA available: {torch.cuda.is_available()}, Version: {torch.version.cuda}')"
Activate Your Target Environment
# If using venv
.\venv\Scripts\activate
# If using conda
conda activate your_environment_name
Launch the Tool
python builddeepspeed.py
Builds are done in a temporary deepspeed subdirectory where the tool is run.
Built wheels are archived in deepspeed_wheels with version information.
Set your CUDA_HOME path: DeepSpeed needs to be able to find and access the Nvidia CUDA Toolkit's nvcc.exe each time DeepSpeed starts up. As such, every Python environment that uses DeepSpeed needs its CUDA_HOME environment variable set correctly. Guides on doing this are in the tool.
After building and installing the wheel to your Python environment, verify DeepSpeed with ds_report or:
python -c "import deepspeed; deepspeed.show_env()"
Wrong Environment Active
Incorrect Order of Installation
Version Mismatches
Run these commands before starting to verify your setup (these checks are performed by the tool on start-up):
# Check Python version
python --version
# Check PyTorch and its CUDA version
python -c "import torch; print('PyTorch', torch.__version__, 'CUDA', torch.version.cuda if torch.cuda.is_available() else 'Not Available')"
# Check CUDA Toolkit
nvcc --version
All these components must be correctly installed and compatible before running the tool.
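If you would rather gather these three values in one go, a script along the following lines may help. It is a rough sketch, not the tool's own start-up check: the function names are invented here, and it assumes `nvcc --version` prints a line containing `release X.Y`.

```python
import re
import shutil
import subprocess
import sys
from importlib import util


def parse_nvcc_release(output: str):
    """Pull the 'release X.Y' number out of `nvcc --version` output."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


def collect_versions() -> dict:
    """Gather Python, PyTorch and CUDA Toolkit versions; missing parts stay None."""
    info = {
        "python": sys.version.split()[0],
        "torch": None,
        "torch_cuda": None,
        "nvcc": None,
    }
    # Only import torch when it is installed in the active environment.
    if util.find_spec("torch") is not None:
        import torch
        info["torch"] = torch.__version__
        info["torch_cuda"] = torch.version.cuda
    # nvcc must be on PATH for this lookup to succeed.
    nvcc = shutil.which("nvcc")
    if nvcc:
        result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
        info["nvcc"] = parse_nvcc_release(result.stdout)
    return info
```

Calling `print(collect_versions())` in the environment you plan to build for shows at a glance which of the three pieces is missing.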
The tool performs several verification steps on launch:
Administrative Rights Check
Visual Studio Detection
CUDA Toolkit Verification
Python Environment Check
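As an illustration of the first of these checks, an administrative rights test on Windows can be written roughly as below. This is a guess at one reasonable approach, not the tool's actual code; `is_admin` is a name chosen for this sketch, and the non-Windows branch exists only so the snippet can be tried on other systems.

```python
import ctypes
import os


def is_admin() -> bool:
    """Return True when the current process is running with elevated rights."""
    if os.name == "nt":
        # shell32.IsUserAnAdmin returns non-zero for an elevated process.
        return bool(ctypes.windll.shell32.IsUserAnAdmin())
    # Fallback for non-Windows systems: root has effective uid 0.
    return os.geteuid() == 0
```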
Launch the Application
Configure Build Settings
Build Options
The tool manages builds in an organized way:
Work Directory Structure
root/
├── deepspeed/ # Temporary build directory
└── deepspeed_wheels/ # Archive directory
└── deepspeed_[version]_cuda[version]_py[version]/
└── wheelfile.whl
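The archive folder naming shown above could be produced like this. It is a sketch only: the exact way the tool formats each version field is an assumption, and `archive_dir` is a hypothetical helper.

```python
from pathlib import Path


def archive_dir(root: Path, ds_version: str, cuda_version: str, py_version: str) -> Path:
    """Build the deepspeed_[version]_cuda[version]_py[version] archive path."""
    name = f"deepspeed_{ds_version}_cuda{cuda_version}_py{py_version}"
    return root / "deepspeed_wheels" / name
```

For example, `archive_dir(Path("."), "0.15.1", "12.1", "3.11")` points at `deepspeed_wheels/deepspeed_0.15.1_cuda12.1_py3.11` under the current directory.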
Cleanup Process
The tool uses a JSON configuration file to manage available versions:
{
"versions": {
"0.15.0": {
"url": "https://github.com/microsoft/DeepSpeed/archive/refs/tags/v0.15.0.zip",
"cuda_min": "11.0"
},
"0.15.1": {
"url": "https://github.com/microsoft/DeepSpeed/archive/refs/tags/v0.15.1.zip",
"cuda_min": "11.0"
}
}
}
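To show how a file shaped like this might be consumed, here is a minimal sketch that lists the versions whose cuda_min is satisfied by a given CUDA Toolkit version. The helper names are invented for the example, and how the tool itself uses cuda_min is an assumption.

```python
import json

# Same shape as the configuration shown above.
CONFIG_TEXT = """
{
  "versions": {
    "0.15.0": {
      "url": "https://github.com/microsoft/DeepSpeed/archive/refs/tags/v0.15.0.zip",
      "cuda_min": "11.0"
    },
    "0.15.1": {
      "url": "https://github.com/microsoft/DeepSpeed/archive/refs/tags/v0.15.1.zip",
      "cuda_min": "11.0"
    }
  }
}
"""


def version_tuple(text: str) -> tuple:
    """Turn '12.1' into (12, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def buildable_versions(config: dict, toolkit_cuda: str) -> list:
    """Return the DeepSpeed versions whose cuda_min the toolkit satisfies."""
    have = version_tuple(toolkit_cuda)
    ok = [
        version
        for version, entry in config["versions"].items()
        if version_tuple(entry["cuda_min"]) <= have
    ]
    return sorted(ok, key=version_tuple)


config = json.loads(CONFIG_TEXT)
print(buildable_versions(config, "12.1"))
```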
To add support for new DeepSpeed versions:
Note: This tool supports DeepSpeed 0.15.0 and later. Earlier versions may have different build requirements and are not supported.
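A new entry follows the same pattern as the existing ones, so adding one can be sketched as below. `add_version` is a hypothetical helper, and the default cuda_min of 11.0 is simply copied from the entries above; check the real minimum for any release you add.

```python
def add_version(config: dict, version: str, cuda_min: str = "11.0") -> dict:
    """Add a DeepSpeed release to the config using the GitHub tag URL pattern."""
    url = f"https://github.com/microsoft/DeepSpeed/archive/refs/tags/v{version}.zip"
    config.setdefault("versions", {})[version] = {"url": url, "cuda_min": cuda_min}
    return config
```

The returned dictionary can then be written back to the configuration file with `json.dump(config, handle, indent=2)`.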
CUDA_HOME Environment
Build Artifacts
Error Handling
Log File
Build Failures
CUDA Issues
Installation Issues
Building DeepSpeed on Windows - Key Requirements and Observations
Ninja (pip install ninja) (Required for CUDA compilation)
psutil (pip install psutil) (I believe this is needed)
Note: If using Visual Studio Code instead of Visual Studio, make sure you've installed the actual Visual Studio Build Tools separately - VS Code's C++ extension default selection alone is not sufficient.
Visual Studio (Full Version):
Visual Studio Code:
C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvars64.bat
After downloading and extracting a DeepSpeed 0.15.x (or later) build, open a Visual Studio x64 Developer Command Prompt as Administrator (or initialize VS environment manually) and then start the desired Python environment you want to build for.
You will need to copy/paste the following commands into the command prompt window.
Set the required environment variables to point to your Nvidia CUDA Toolkit root, i.e. the folder that contains the bin directory where nvcc.exe is located (example below, but you need to confirm the correct paths for your system):
set CUDA_HOME=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1
set CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1
set DISTUTILS_USE_SDK=1
You can verify the environment is correctly set by testing NVCC and also CL:
rem Should show the CUDA version
nvcc -V
rem Should show the MSVC version
cl
Set DeepSpeed build options to 0
set DS_BUILD_AIO=0
set DS_BUILD_CUTLASS_OPS=0
set DS_BUILD_EVOFORMER_ATTN=0
set DS_BUILD_FP_QUANTIZER=0
set DS_BUILD_RAGGED_DEVICE_OPS=0
set DS_BUILD_SPARSE_ATTN=0
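If you prefer to drive the build from a script rather than typing each set command, the same variables can be collected into an environment dictionary. This is a sketch under the assumption that the script is started from the developer prompt described above; `build_env` is a name invented for the example.

```python
import os

# The op builds that the manual steps switch off.
DISABLED_OPS = (
    "DS_BUILD_AIO",
    "DS_BUILD_CUTLASS_OPS",
    "DS_BUILD_EVOFORMER_ATTN",
    "DS_BUILD_FP_QUANTIZER",
    "DS_BUILD_RAGGED_DEVICE_OPS",
    "DS_BUILD_SPARSE_ATTN",
)


def build_env(cuda_root: str, base=None) -> dict:
    """Copy an environment and add the variables used for the manual build."""
    env = dict(os.environ if base is None else base)
    env["CUDA_HOME"] = cuda_root
    env["CUDA_PATH"] = cuda_root
    env["DISTUTILS_USE_SDK"] = "1"
    for name in DISABLED_OPS:
        env[name] = "0"
    return env
```

The result can be passed as `env=` to `subprocess.run([sys.executable, "setup.py", "bdist_wheel"], cwd=deepspeed_folder, env=env)`, where `deepspeed_folder` is your extracted DeepSpeed directory.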
If necessary, you can check your environment variables by running set at the command prompt you are in to see a list of all environment variables that have been set.
You should now be in your Administrative x64 Developer command prompt/console with the Python environment you want to build DeepSpeed for activated, CUDA_HOME and CUDA_PATH set correctly, and your DeepSpeed build options set. Move into your extracted DeepSpeed folder and run:
python setup.py bdist_wheel
When this has completed, if successful, there will be a dist folder within the DeepSpeed folder that contains your compiled wheel. You can install this with pip install deepspeed-xxxxxxxxxx.whl where the x's will be unique to your environment build.
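Because the wheel name differs per build, it can be handy to locate it programmatically. The following is a small sketch; both helper names are made up here, and it assumes the wheel file name starts with `deepspeed-` as in the example above.

```python
import sys
from pathlib import Path


def newest_wheel(dist_dir):
    """Return the most recently modified deepspeed wheel in dist_dir, or None."""
    wheels = sorted(
        Path(dist_dir).glob("deepspeed-*.whl"),
        key=lambda path: path.stat().st_mtime,
    )
    return wheels[-1] if wheels else None


def pip_install_command(wheel: Path) -> list:
    """Build a pip command that installs into the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", str(wheel)]
```

Using `sys.executable -m pip` rather than a bare `pip` helps ensure the wheel goes into the currently active Python environment.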
After building and installing the wheel, verify CUDA availability with ds_report or a Python script:
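import torch
import deepspeed  # fails here if the wheel did not install correctly
from deepspeed.ops.op_builder import OpBuilder  # confirms the op builder module loads

print(torch.cuda.is_available())  # should print True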
torch.cuda.is_available() should return True
ds_report should show CUDA is available and correctly configured
Build process needs VS2019 or newer (VS2019 Build Tools minimum requirement)
CUDA version must be compatible with installed PyTorch version
Wheel files are specific to Python version, CUDA version, and Windows architecture
Building in a clean directory prevents potential conflicts
Set your CUDA_HOME path: DeepSpeed needs to be able to find and access the Nvidia CUDA Toolkit's nvcc.exe each time DeepSpeed starts up. As such, every Python environment that uses DeepSpeed needs its CUDA_HOME environment variable set correctly. Guides on doing this are in the tool.
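A quick way to confirm that CUDA_HOME points somewhere useful is to look for nvcc under its bin folder. This sketch uses an invented helper name, `find_nvcc`, and only checks that the file exists, not that it runs.

```python
import os
from pathlib import Path


def find_nvcc(env=None):
    """Return the path to nvcc under CUDA_HOME/bin, or None if it is not there."""
    env = os.environ if env is None else env
    cuda_home = env.get("CUDA_HOME")
    if not cuda_home:
        return None
    bin_dir = Path(cuda_home) / "bin"
    # nvcc.exe on Windows; plain nvcc is checked so the sketch works elsewhere.
    for name in ("nvcc.exe", "nvcc"):
        candidate = bin_dir / name
        if candidate.is_file():
            return candidate
    return None
```

If `find_nvcc()` returns None inside the environment where you run DeepSpeed, CUDA_HOME is either unset or pointing at the wrong folder.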
flowchart TD
A[Start Manual Build Process] --> B[Prerequisites Check]
B --> C[Visual Studio Requirements]
C --> C1[VS2019 or newer with C++ Workload]
C1 --> C2[Required Components Check]
C2 --> C3["MSVC v142 Build Tools
Windows 10 SDK
C++ Core Features
C++/CLI Support
C++ Modules
C++ CMake Tools"]
B --> D[CUDA Toolkit Requirements]
D --> D1[Install Required Components]
D1 --> D2["CUDA Compiler (nvcc)
CUBLAS Development
CUBLAS Runtime"]
B --> E[Python Requirements]
E --> E1["Compatible PyTorch CUDA
Ninja Build System
psutil"]
C3 & D2 & E1 --> F[Environment Setup]
F --> G[Launch VS x64 Developer Command Prompt as Administrator]
G --> I[Environment Variables Required]
I --> I1["CUDA_HOME
CUDA_PATH
DISTUTILS_USE_SDK"]
I1 --> I2["Instructions:
set CUDA_HOME=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1
set CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1
set DISTUTILS_USE_SDK=1"]
I2 --> J[DeepSpeed Build Options Required]
J --> J1["DS_BUILD_AIO
DS_BUILD_CUTLASS_OPS
DS_BUILD_EVOFORMER_ATTN
DS_BUILD_FP_QUANTIZER
DS_BUILD_RAGGED_DEVICE_OPS
DS_BUILD_SPARSE_ATTN"]
J1 --> J2["Instructions:
set DS_BUILD_AIO=0
set DS_BUILD_CUTLASS_OPS=0
set DS_BUILD_EVOFORMER_ATTN=0
set DS_BUILD_FP_QUANTIZER=0
set DS_BUILD_RAGGED_DEVICE_OPS=0
set DS_BUILD_SPARSE_ATTN=0"]
J2 --> K[Optional Environment Verification]
K --> |Optional| K1[Test nvcc -V]
K1 --> |Optional| K2[Test cl]
K --> L[Build Process]
K1 --> L
K2 --> L
L --> M[python setup.py bdist_wheel]
M --> N{Build Successful?}
N -->|Yes| O[Wheel file in dist folder]
O --> P[Install with pip]
P --> Q[Verify Installation]
Q --> Q1["Run ds_report
Test torch.cuda.is_available()
Check for CUDA warnings"]
N -->|No| R[Common Issues]
R --> R1["Check VS x64 environment
Verify CUDA paths
Check admin rights
Wait for CUDA compilation"]
R1 --> F
classDef prereq fill:#cce5ff,stroke:#004085,stroke-width:1px;
classDef env fill:#fff3cd,stroke:#856404,stroke-width:1px;
classDef vars fill:#f8d7da,stroke:#721c24,stroke-width:1px;
classDef instructions fill:#d4edda,stroke:#155724,stroke-width:1px;
classDef build fill:#d1ecf1,stroke:#0c5460,stroke-width:1px;
classDef verify fill:#e2e3e5,stroke:#383d41,stroke-width:1px;
classDef error fill:#f8d7da,stroke:#721c24,stroke-width:1px;
classDef optional fill:#e2e3e5,stroke:#383d41,stroke-width:1px,stroke-dasharray: 5 5;
class B,C,C1,C2,C3,D,D1,D2,E,E1 prereq;
class F,G env;
class I,J vars;
class I1,J1 vars;
class I2,J2 instructions;
class L,M,N,O,P build;
class Q,Q1 verify;
class R,R1 error;
class K,K1,K2 optional;
For DeepSpeed-specific issues, refer to:
For tool-specific issues:
flowchart TD
A[Start Application] --> B[Check Admin Rights]
B --> C{Admin Rights?}
C -->|No| D[Prompt for Admin Restart]
D --> E{User Accepts?}
E -->|Yes| F[Restart with Admin]
E -->|No| G[Exit Application]
C -->|Yes| H[Load Configuration JSON]
H --> I{Config Loaded?}
I -->|No| J[Show Error and Exit]
I -->|Yes| K[Initialize GUI]
K --> L[Check Prerequisites]
L --> M[Check Visual Studio]
M --> N{VS Found in Default Path?}
N -->|No| O[Check Registry for VS]
O --> P{VS Found in Registry?}
P -->|No| Q[Show VS Install Instructions]
P -->|Yes| R[Log VS Location]
N -->|Yes| R
L --> S[Check CUDA Toolkit Installation]
S --> T{CUDA Toolkit Found?}
T -->|No| U[Show CUDA Toolkit Install Instructions]
T -->|Yes| V[Check for NVCC]
L --> W[Check Python Packages]
W --> X[Check PyTorch]
W --> Y[Check Ninja]
W --> Z[Check psutil]
R & V & X & Y & Z --> AA{All Prerequisites Met?}
AA -->|No| AB[Show Missing Prerequisites]
AA -->|Yes| AC[Enable Build/Install Buttons]
AC --> AD[Wait for User Action]
AD --> AE{Action Selected}
AE -->|Build Only| AF[Start Build Process]
AE -->|Install Only| AG[Start Install Process]
AE -->|Build & Install| AH[Start Combined Process]
AF & AH --> AI[Create Build Directory]
AI --> AJ[Download DeepSpeed from GitHub]
AJ --> AK[Extract ZIP Archive]
AK --> AL[Move Files to Build Dir]
AL --> AM[Create Build Script]
AM --> AN[Set Environment Variables]
AN --> AO[Run Build Script]
AO --> AP{Build Successful?}
AP -->|No| AQ[Show Build Error]
AP -->|Yes| AR[Archive Wheel File]
AR --> AS{Install Requested?}
AS -->|No| AT[Show Build Success]
AS -->|Yes| AU[Uninstall Existing DeepSpeed]
AU --> AV[Install New Wheel]
AV --> AW{Install Successful?}
AW -->|No| AX[Show Install Error]
AW -->|Yes| AY[Show Success Message]
AY --> AZ{Show CUDA Setup?}
AZ -->|Yes| BA[Display CUDA Setup Guide]
AZ -->|No| BB[End Process]
classDef startEnd fill:#f9d5e5,stroke:#333,stroke-width:2px;
classDef process fill:#eeeeee,stroke:#333,stroke-width:1px;
classDef decision fill:#e3f2fd,stroke:#333,stroke-width:1px;
classDef error fill:#ffcdd2,stroke:#333,stroke-width:1px;
classDef success fill:#c8e6c9,stroke:#333,stroke-width:1px;
class A,G,BB startEnd;
class B,H,K,L,M,O,R,V,W,X,Y,Z,AI,AJ,AK,AL,AM,AN,AO,AR,AU,AV process;
class C,E,I,N,P,T,AA,AE,AP,AS,AW,AZ decision;
class J,Q,U,AB,AQ,AX error;
class AT,AY,BA success;