Tencent / HunyuanDiT

Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
https://dit.hunyuan.tencent.com/

feat: Add Compatibility for Windows #101

Open C0nsumption opened 1 week ago

C0nsumption commented 1 week ago

Windows?

feat: Add Compatibility for Windows

Description

This pull request introduces several changes to ensure compatibility with Windows and the most recent versions of various modules. The following modifications have been made:

  1. Module Upgrades:

    • Upgraded most modules to their latest versions, including torch, diffusers, etc.
  2. Downgrade NumPy:

    • Downgraded NumPy to a version below 2.0.0 to avoid ABI incompatibilities with modules compiled against NumPy 1.x.
      pip install "numpy<2.0.0"
  3. Conditional Import and Usage of deepspeed:

    • Added a try-except block to conditionally import deepspeed and add its related arguments only if deepspeed is available.
    • This avoids potential ImportError and makes the script more robust.
  4. Enhanced Image-Saving Logic:

    • Introduced the get_next_index function to handle non-integer filenames during the image-saving process.
    • This change ensures that the script can handle filenames that do not conform to an integer pattern without breaking.

Installation Instructions

  1. Clone the Repository:

    git clone https://github.com/tencent/HunyuanDiT
    cd HunyuanDiT
  2. Install Dependencies:

    Using a Virtual Environment

    python -m venv venv
    venv\Scripts\activate
    pip install "huggingface_hub[cli]"
    mkdir ckpts
    huggingface-cli download Tencent-Hunyuan/HunyuanDiT --local-dir ./ckpts
    
    python.exe -m pip install --upgrade pip
    
    pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu121   
    pip install loguru
    pip install diffusers
    pip install transformers
    pip install timm
    pip install einops
    pip install peft
    pip install sentencepiece
    pip install protobuf
    pip install "numpy<2.0.0"    

    Note: These are not the versions pinned in requirements.txt, but they allow the use of CUDA 12.1 and the newest versions of diffusers and PyTorch. (Same setup as ComfyUI, I believe.)

    Using Conda for Closer Compatibility with Original Repository

    conda create -n HunyuanDit python=3.9 -c conda-forge -y
    conda activate HunyuanDit
    
    # Install CUDA 11.7 from conda-forge
    conda install cudatoolkit=11.7 -c conda-forge
    
    set CUDA_HOME=%CONDA_PREFIX%
    set PATH=%CUDA_HOME%\bin;%PATH%
    
    python -m pip install --upgrade pip
    pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
    pip install loguru==0.7.2
    pip install diffusers==0.21.2
    pip install timm==0.9.5
    pip install einops==0.7.0
    pip install transformers==4.39.1
    pip install peft==0.10.0
    pip install "numpy<2.0"
    pip install sentencepiece==0.1.99
    pip install setuptools==65.5.1
    pip install protobuf==3.19.0
    pip install wheel
    pip install packaging
  3. Run Inference:

    python sample_t2i.py --prompt "a woman" --no-enhance
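After either install route, a quick sanity check can confirm the NumPy pin took effect before running inference. This is a hypothetical helper, not part of the PR; the name `numpy_pin_ok` is assumed:

```python
from importlib.metadata import version, PackageNotFoundError

def numpy_pin_ok():
    """Return True if the installed NumPy satisfies the <2.0.0 pin."""
    try:
        major = int(version("numpy").split(".")[0])
    except PackageNotFoundError:
        return False  # NumPy missing entirely; run: pip install "numpy<2.0.0"
    return major < 2

if __name__ == "__main__":
    print("numpy<2.0.0 pin satisfied:", numpy_pin_ok())
```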

WHY?

Error Encountered

Solution

  1. Downgrade NumPy:

    pip install "numpy<2.0.0"
  2. Changes in hydit/modules/models.py:

    use_flash_attn = args.infer_mode == 'fa' or getattr(args, 'use_flash_attn', False)
    • Explanation: The getattr call reads args.use_flash_attn; if the attribute does not exist, it falls back to False instead of raising an AttributeError, preserving the intended behavior on configurations that never set the flag.
  3. Changes in hydit/config.py:

    try:
        import deepspeed
    
        # Add DeepSpeed-specific arguments
        parser = deepspeed.add_config_arguments(parser)
        parser.add_argument('--local_rank', type=int, default=None,
                            help='local rank passed from distributed launcher.')
        parser.add_argument('--deepspeed-optimizer', action='store_true',
                            help='Switching to the optimizers in DeepSpeed')
        parser.add_argument('--remote-device', type=str, default='none', choices=['none', 'cpu', 'nvme'],
                            help='Remote device for ZeRO-3 initialized parameters.')
        parser.add_argument('--zero-stage', type=int, default=1)
        parser.add_argument("--async-ema", action="store_true", help="Whether to use multi stream to execute EMA.")
    except ImportError:
        print("DeepSpeed not available. Skipping related arguments...")
    • Explanation: The try-except block imports deepspeed and registers its arguments only when the import succeeds, so the parser works on systems without DeepSpeed (such as Windows) instead of failing with an ImportError.
  4. Changes in sample_t2i.py:

    def get_next_index(save_dir):
        all_files = list(save_dir.glob('*.png'))
        indices = []
        for f in all_files:
            try:
                indices.append(int(f.stem))
            except ValueError:
                logger.warning(f"Skipping file with non-integer name: {f}")
        return max(indices, default=-1) + 1
    
    # Find the first available index
    start = get_next_index(save_dir)
    • Explanation: Introduced the get_next_index function to handle non-integer filenames gracefully during the image-saving process. This ensures that the script can handle filenames that do not conform to an integer pattern without breaking.
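As a quick check, the helper above can be exercised against a throwaway directory; in this self-contained sketch the loguru call is replaced with a plain print:

```python
import tempfile
from pathlib import Path

def get_next_index(save_dir):
    """Return one past the highest integer *.png stem in save_dir."""
    indices = []
    for f in save_dir.glob('*.png'):
        try:
            indices.append(int(f.stem))
        except ValueError:
            # Non-integer names (e.g. 'preview.png') are skipped, not fatal
            print(f"Skipping file with non-integer name: {f}")
    return max(indices, default=-1) + 1

with tempfile.TemporaryDirectory() as d:
    save_dir = Path(d)
    for name in ('0.png', '7.png', 'preview.png'):
        (save_dir / name).touch()
    print(get_next_index(save_dir))  # 8: one past the highest integer stem
```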

Testing

Known Bugs

jroubi commented 5 days ago

Which Python version did you use? In the conda environment it explicitly says 3.9.

But did you use the latest version of Python for the virtual env?

C0nsumption commented 2 days ago

Which Python version did you use? In the conda environment it explicitly says 3.9.

But did you use the latest version of Python for the virtual env?

3.10.9

My bad for the delay bud, no GitHub notification; I think one only shows up when you're quoted. But honestly, if you are just doing inference, my assumption is you should be fine even with 3.11 and maybe 3.12, since all the errors that pop up without these changes are training-related. They have nothing to do with actual inference.