jy0205 / LaVIT

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Got noisy gif file #22

Closed lochuynh1412 closed 2 months ago

lochuynh1412 commented 2 months ago

Hi, thanks for the awesome work.

I tried to run the Text-to-Video Generation example. The task finished, but the GIF file was extremely noisy. I've attached the logs and hope you can help :) logs.txt

lochuynh1412 commented 2 months ago

Added the GIF file: generated_text_img

jy0205 commented 2 months ago

What's your GPU type? Please make sure your device supports xformers.

lochuynh1412 commented 2 months ago

I'm using an A100. xformers should support the A100, right?

jy0205 commented 2 months ago

Yes, of course. Can you show me your environment info?

lochuynh1412 commented 2 months ago

req.txt

Sure, I've attached the library info. I'm running Python 3.9. I exported that file with "conda list -e > req.txt". If you need any other info, please let me know.

I also added the conda.yaml file: conda.txt

jy0205 commented 2 months ago

The environment seems fine. Can you generate a normal keyframe? You can view the keyframe with "display(keyframes[0][0])".

lochuynh1412 commented 2 months ago

@jy0205 The keyframe looks good. Could it be due to ffmpeg? keyframe

I'm using ffmpeg from conda:

ffmpeg version 4.3 Copyright (c) 2000-2020 the FFmpeg developers
built with gcc 7.3.0 (crosstool-NG 1.23.0.449-a04d0)
configuration: --prefix=/opt/conda/conda-bld/ffmpeg_1597178665428/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeh --cc=/opt/conda/conda-bld/ffmpeg_1597178665428/_build_env/bin/x86_64-conda_cos6-linux-gnu-cc --disable-doc --disable-openssl --enable-avresample --enable-gnutls --enable-hardcoded-tables --enable-libfreetype --enable-libopenh264 --enable-pic --enable-pthreads --enable-shared --disable-static --enable-version3 --enable-zlib --enable-libmp3lame
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
Hyper fast Audio and Video encoder

It does seem to have MPEG-4 Part 2, as mentioned:

V.S... mpeg1video MPEG-1 video
V.S... mpeg2video MPEG-2 video
V.S... mpeg4 MPEG-4 part 2
V..... msmpeg4v2 MPEG-4 part 2 Microsoft variant version 2
V..... msmpeg4 MPEG-4 part 2 Microsoft variant version 3 (codec msmpeg4v3)

jy0205 commented 2 months ago

Generation doesn't use ffmpeg. We have double-checked the released code and checkpoints, and video generation works on my machine. I guess it is an environment problem, maybe the xformers version. My version is xformers 0.0.19.

generated

jy0205 commented 2 months ago

Here are some of my dependencies:

scikit-image 0.21.0
scikit-learn 1.2.2
scipy 1.10.1
semantic-version 2.10.0
Send2Trash 1.8.2
sentencepiece 0.1.99
sentry-sdk 1.26.0
setproctitle 1.3.2
setuptools 67.6.1
shortuuid 1.0.11
six 1.16.0
smart-open 6.3.0
smmap 5.0.0
sniffio 1.3.0
soupsieve 2.4.1
spacy 3.5.4
spacy-legacy 3.0.12
spacy-loggers 1.0.4
SQLAlchemy 2.0.18
srsly 2.4.6
stack-data 0.6.2
starlette 0.27.0
streamlit 1.24.0
svgwrite 1.4.3
tenacity 8.2.2
tensorboard 2.12.2
tensorboard-data-server 0.7.0
tensorboard-plugin-wit 1.8.1
tensorboardX 2.6.1
termcolor 2.3.0
terminado 0.17.1
text-unidecode 1.3
thinc 8.1.10
threadpoolctl 3.1.0
tifffile 2023.4.12
tiktoken 0.3.3
timm 0.4.12
tinycss2 1.2.1
tokenizers 0.13.3
toml 0.10.2
tomli 2.0.1
toolz 0.12.0
torch 1.13.1+cu117
torch-fidelity 0.3.0
torchaudio 0.13.1+cu117
torchmetrics 1.1.2
torchvision 0.14.1+cu117
tornado 6.3.2
tqdm 4.65.0
traitlets 5.9.0
transaction 3.1.0
transformers 4.33.2
translationstring 1.4
triton 2.0.0.post1
typer 0.9.0
typing_extensions 4.9.0
typing-inspect 0.8.0
tzdata 2023.3
tzlocal 4.3.1
uc-micro-py 1.0.2
uri-template 1.3.0
urllib3 1.26.15
uvicorn 0.22.0
validators 0.20.0
vector-quantize-pytorch 1.7.1
velruse 1.1.1
venusian 3.0.0
virtualenv 20.21.0
wandb 0.16.2
wasabi 1.1.2
watchdog 3.0.0
wavedrom 2.0.3.post3
wcwidth 0.2.6
webcolors 1.13
webdataset 0.2.48
webencodings 0.5.1
WebOb 1.8.7
websocket-client 1.6.1
websockets 11.0.3
Werkzeug 2.2.3
wheel 0.40.0
widgetsnbextension 4.0.7
WTForms 3.0.1
wtforms-recaptcha 0.3.2
xformers 0.0.19+b68a5a9.d20230414
yacs 0.1.8
yarl 1.9.2
zipp 3.15.0
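
For anyone comparing their own setup against the list above, here is a minimal sketch that prints the installed versions of a few packages. The helper name `installed_versions` and the choice of which packages to check are assumptions made for illustration; the thread only points at xformers as the likely culprit.

```python
from importlib import metadata

# Packages from the list above that seem most relevant here.
# This selection is an assumption, not something the thread confirms.
PACKAGES = ["torch", "torchvision", "xformers", "triton", "transformers"]


def installed_versions(names):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:<14}{version or 'not installed'}")
```

Running it in the same environment as the demo gives a short report that can be pasted into the issue instead of a full package export.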

lochuynh1412 commented 2 months ago

@jy0205 Can you share the pip command to install that specific version of xformers?

jy0205 commented 2 months ago

pip install xformers==0.0.19

lochuynh1412 commented 2 months ago

Let me try again. Last night I used that command to install xformers, and it somehow forced PyTorch to be upgraded to 2.0, after which the demo failed to run.
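
A rough sketch of how one might detect that kind of silent upgrade: compare the installed versions against the torch 1.13.1 / xformers 0.0.19 pairing reported by the maintainer in this thread. The names `REFERENCE`, `base_version`, and `find_mismatches` are hypothetical, and the thread does not establish that this pairing is the only one that works.

```python
# Reference pairing taken from the maintainer's dependency list in this thread.
# Treating it as the "expected" setup is an assumption.
REFERENCE = {"torch": "1.13.1", "xformers": "0.0.19"}


def base_version(version):
    """Drop a local build suffix such as '+cu117'."""
    return version.split("+", 1)[0]


def find_mismatches(installed, reference=REFERENCE):
    """Return {name: (installed, expected)} for packages that differ from the reference.

    `installed` maps package names to version strings (or None if missing).
    """
    mismatches = {}
    for name, expected in reference.items():
        found = installed.get(name)
        if found is None or base_version(found) != expected:
            mismatches[name] = (found, expected)
    return mismatches


if __name__ == "__main__":
    # Example: torch was upgraded to 2.0.0 while xformers stayed at 0.0.19.
    print(find_mismatches({"torch": "2.0.0", "xformers": "0.0.19"}))
```

pip's `--no-deps` option installs a package without touching its dependencies, which is one way to keep an existing torch build in place; the thread does not say whether that was used here.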

lochuynh1412 commented 2 months ago

@jy0205 Oops, it works now. I'm not sure what is different this time, but thank you anyway. It was an xformers issue.