Closed ZibbeZabbe closed 2 years ago
Issue addressed in PR #8
Thanks for letting me know of this issue! I'll try with your fix, and see if all the code is correctly executed, but it looks like everything should still work.
I've found a solution that worked on both my Windows 10 and Ubuntu 18.04 machines: only use the default
channels and specify the channel for each dependency, like so:
name: stylegan3
channels:
  - defaults
dependencies:
  - python >= 3.8
  - pip
  - numpy>=1.20
  - click>=8.0
  - pillow=8.3.1
  - scipy=1.7.1
  - pytorch::pytorch>=1.9.1
  - nvidia::cudatoolkit>=11.1  # PR #116 by @edstoica
  - requests=2.26.0
  - tqdm=4.62.2
  - ninja=1.10.2
  - matplotlib=3.4.2
  - imageio=2.9.0
  - pip:
    - imgui==1.3.0
    - glfw==2.2.0
    - pyopengl==3.1.5
    - imageio-ffmpeg==0.4.3
    - pyspng
    - psutil  # PR #125 by @fastflair / #111 by @siddharthksah
    - tensorboard  # PR #125 by @fastflair
    - moviepy==1.0.3
    - ffmpeg-python==0.2.0
    - scikit-video==1.1.11
    - setuptools==59.5.0
Test it out and let me know if it works for you. If it does, you can change it on your PR and I'll accept it. Thanks again for pointing out the bad environment creation!
Using pytorch::pytorch>=1.9.1 resulted in version 1.11.0 for me, which has the problem described in NVlabs issue #145. Specifying <=1.10.2 should solve that issue.
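To confirm that the resolved version actually falls inside the suggested range, the installed version can be compared against the two bounds. The sketch below is only illustrative: the helper names `parse_version` and `in_supported_range` are hypothetical, and the parsing assumes plain `major.minor.patch` strings with an optional local suffix such as `+cu113`.

```python
def parse_version(version: str) -> tuple:
    """Turn '1.10.2' or '1.10.2+cu113' into (1, 10, 2)."""
    release = version.split("+", 1)[0]  # drop a local suffix like '+cu113'
    return tuple(int(part) for part in release.split(".")[:3])


def in_supported_range(version: str, low: str = "1.9.1", high: str = "1.10.2") -> bool:
    """True if low <= version <= high (the bounds suggested in this thread)."""
    return parse_version(low) <= parse_version(version) <= parse_version(high)


if __name__ == "__main__":
    for candidate in ("1.9.1", "1.10.2", "1.11.0"):
        print(candidate, in_supported_range(candidate))
```

Inside the environment, calling `in_supported_range(torch.__version__)` after `import torch` would apply the same check to whatever conda resolved.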
Unfortunately, with the defaults channel I still do not get the CUDA build of pytorch (this only shows up when attempting to train). As such, specifying pytorch=1.10.2=py3.9_cuda11.3_cudnn8_0 has been the only reliable way I have found to ensure the CUDA-compiled pytorch is grabbed. It's not a perfect solution, as it may not work with other versions of Python, but it is functional.
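Because the CPU-only build only surfaces once training starts, it can help to check the build right after creating the environment. A minimal sketch, assuming it is run inside the new environment: `torch_build_info` is a hypothetical helper name, and it relies only on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
def torch_build_info() -> dict:
    """Report whether the installed PyTorch was built with CUDA support."""
    try:
        import torch
    except ImportError:
        # torch is missing entirely; nothing else to report
        return {"installed": False, "version": None,
                "cuda_build": None, "cuda_available": False}
    return {
        "installed": True,
        "version": str(torch.__version__),
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    info = torch_build_info()
    print(info)
    if info["installed"] and info["cuda_build"] is None:
        print("CPU-only PyTorch build detected; GPU training will not work.")
```

If `cuda_build` comes back as `None`, conda picked the CPU package, and pinning the build string as described above is one way to force the CUDA variant.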
Fixed in: 2d0a7c2
In short: thanks to the last fix in the NVlabs repository for NVlabs#145, we also change cudatoolkit=11.1 in environment.yml, and the environment is correctly created in both Windows and Ubuntu 18.04. I've tested the code and we can generate images/videos, as well as train with it, so let me know if there's anything else to fix!
The environment will not build when starting from a clean slate. Within environment.yml, changing nvidia::cudatoolkit=11.3 to cudatoolkit=11.3 allowed conda to build the environment.
Describe the bug
Creating the environment results in the CPU build of pytorch being downloaded. The addition of CLIP by OpenAI results in torch 1.7.1 being downloaded; I am unsure whether that was the cause of the CPU version being installed.
To Reproduce
I would run "conda clean -a" and "pip cache purge", then attempt to build the environment. Doing so would not allow me to train using "python train.py --outdir=C:\AI\output\stylegan --cfg=stylegan3-r --data=C:\AI\data\data-512x512.zip --gpus=1 --batch=12 --gamma=8.2 --mirror=1" or similar commands.
Expected behavior
Running train.py should not error out.