Closed: arceus-jia closed this issue 1 year ago.
The results look weird, like the model is not trained. It usually takes 300–500 steps to train on an 8-frame video. Can you provide more info (e.g., environment, code snippets) for me to look into this issue?
Well, I'm not sure if it was an xformers version conflict, but after I reinstalled the environment, upgraded torch to 1.13.1 and torchvision to 0.14.1, and installed the latest xformers version, the retraining result is fine. Anyway, thank you!
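For reference, the version combination reported above (torch 1.13.1 + torchvision 0.14.1) can be guarded with a quick runtime check. This is only a sketch: the helper compares numeric dotted prefixes and ignores local suffixes like `+cu117`; it is an illustration, not a packaging tool.

```python
# Minimal sketch: check the torch/torchvision versions that fixed training
# in this thread (torch 1.13.1, torchvision 0.14.1). Pure stdlib.
def version_tuple(v):
    """Turn '1.13.1+cu117' into (1, 13, 1); stop at the first non-numeric piece."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

def at_least(installed, required):
    """True if the installed version string is >= the required one."""
    return version_tuple(installed) >= version_tuple(required)

# Hedged usage (uncomment in an environment where torch is installed):
# import torch, torchvision
# assert at_least(torch.__version__, "1.13.1")
# assert at_least(torchvision.__version__, "0.14.1")
print(at_least("1.12.0+cu117", "1.13.1"))  # False: too old
```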
Glad to hear that. Let me know if you have any other questions. :)
Can you share your results after running `python -m xformers.info`? I constructed a new virtual environment with torch 1.13 (cu117) and torchvision 0.14, but after installing xformers with `pip install -U xformers`, the module `triton` was not installed. I ran `pip install triton` to install it, but the results of this repo still look like yours. Any idea how to fix it?
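A quick way to confirm whether optional dependencies such as `triton` are actually importable is to query their import specs. A stdlib-only sketch (the package names below are just the ones mentioned in this thread):

```python
# Sketch: report which of the expected modules cannot be imported.
import importlib.util

def missing_modules(names):
    """Return the subset of `names` whose import spec cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# In the environment described above, `triton` appeared in this list
# until it was installed explicitly with `pip install triton`.
print(missing_modules(["xformers", "triton"]))
```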
> Can you share your results after running `python -m xformers.info`? I constructed a new virtual environment with torch 1.13 (cu117) and torchvision 0.14, but after installing xformers with `pip install -U xformers`, the module `triton` was not installed. I ran `pip install triton` to install it, but the results of this repo still look like yours. Any idea how to fix it?
Here is my environment; you can refer to it and compare it with yours:
```
absl-py==1.4.0
accelerate==0.16.0
antlr4-python3-runtime==4.9.3
bitsandbytes==0.35.4
cachetools==5.3.0
certifi @ file:///croot/certifi_1671487769961/work/certifi
cffi @ file:///tmp/abs_98z5h56wf8/croots/recipe/cffi_1659598650955/work
charset-normalizer==3.0.1
decord==0.6.0
diffusers==0.11.1
einops==0.6.0
filelock==3.9.0
flit_core @ file:///opt/conda/conda-bld/flit-core_1644941570762/work/source/flit_core
ftfy==6.1.1
future @ file:///home/builder/ci_310/future_1640790123501/work
google-auth==2.16.0
google-auth-oauthlib==0.4.6
grpcio==1.51.1
huggingface-hub==0.12.0
idna==3.4
imageio==2.25.0
importlib-metadata==6.0.0
Jinja2==3.1.2
Markdown==3.4.1
MarkupSafe==2.1.2
mkl-fft==1.3.1
mkl-random @ file:///home/builder/ci_310/mkl_random_1641843545607/work
mkl-service==2.4.0
modelcards==0.1.6
mypy-extensions==1.0.0
numpy @ file:///croot/numpy_and_numpy_base_1672336185480/work
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
oauthlib==3.2.2
omegaconf==2.3.0
packaging==23.0
Pillow==9.4.0
protobuf==3.20.3
psutil==5.9.4
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
pyre-extensions==0.0.23
PyYAML @ file:///croot/pyyaml_1670514731622/work
regex==2022.10.31
requests==2.28.2
requests-oauthlib==1.3.1
rsa==4.9
six @ file:///tmp/build/80754af9/six_1644875935023/work
tensorboard==2.11.2
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tokenizers==0.13.2
torch==1.13.1
torchvision==0.14.1
tqdm==4.64.1
transformers==4.26.0
typing-inspect==0.8.0
typing_extensions @ file:///croot/typing_extensions_1669924550328/work
urllib3==1.26.14
wcwidth==0.2.6
Werkzeug==2.2.2
xformers==0.0.17.dev444
zipp==3.12.1
```
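To compare your own environment against a pin list like the one above, a small diff helper can be sketched with the standard library alone. The three pins in the example are just the ones most relevant to this thread; the helper skips `name @ file://...` entries and reports `None` for packages that are not installed at all.

```python
# Sketch: diff installed package versions against a reference pin list.
from importlib import metadata

def parse_pins(text):
    """Parse 'name==version' lines into a {name: version} dict (other lines skipped)."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if "==" in line and not line.startswith("#"):
            name, ver = line.split("==", 1)
            pins[name.lower()] = ver
    return pins

def mismatches(pins):
    """Return {name: (wanted, installed_or_None)} for every pin that differs."""
    out = {}
    for name, want in pins.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None
        if have != want:
            out[name] = (want, have)
    return out

reference = """\
torch==1.13.1
torchvision==0.14.1
xformers==0.0.17.dev444
"""
print(mismatches(parse_pins(reference)))
```

An empty dict means your versions match the reference exactly for those packages.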
Thank you for your response. I upgraded xformers from 0.0.16 to 0.0.17. The upgraded model generates the following:
This seems better, but many discordances still exist.
> This seems better? But many discordances exist.
Yep, that means the training was successful. In fact, the sample given by the author is similar to this one. The author mainly provides an idea for AI-generated animation with a diffusion model; if you want to productize it, it still needs a lot of improvement.
I tried training for 100,000 steps, but the results still look strange. Is this normal? Can you tell me how many steps are needed to achieve a good result? Thank you!