nateraw / stable-diffusion-videos

Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
Apache License 2.0

Colab Music Video error: tuple index out of range #89

Closed juancopi81 closed 1 year ago

juancopi81 commented 1 year ago

Hi,

I am trying to use Colab to generate a video from an mp3 file, but I keep getting this error:

IndexError                                Traceback (most recent call last)
<ipython-input-16-593e08c3b40c> in <module>
     10     batch_size=1,                          # increase until you go out of memory
     11     output_dir='./dreams',                 # Where images will be saved
---> 12     name=None,                             # Subdir of output dir. will be timestamp by default
     13 )
     14 visualize_video_colab(video_path)

2 frames
/usr/local/lib/python3.7/dist-packages/stable_diffusion_videos/stable_diffusion_pipeline.py in walk(self, prompts, seeds, num_interpolation_steps, output_dir, name, image_file_ext, fps, num_inference_steps, guidance_scale, eta, height, width, upsample, batch_size, resume, audio_filepath, audio_start_sec)
    803                 audio_offset=audio_offset,
    804                 audio_duration=audio_duration,
--> 805                 sr=44100,
    806             )
    807 

/usr/local/lib/python3.7/dist-packages/stable_diffusion_videos/stable_diffusion_pipeline.py in make_video_pyav(frames_or_frame_dir, audio_filepath, fps, audio_offset, audio_duration, sr, output_filepath, glob_pattern)
    145             audio_fps=sr,
    146             audio_codec="aac",
--> 147             options={"crf": "10", "pix_fmt": "yuv420p"},
    148         )
    149     else:

/usr/local/lib/python3.7/dist-packages/torchvision/io/video.py in write_video(filename, video_array, fps, video_codec, options, audio_array, audio_fps, audio_codec, audio_options)
    114             num_channels = audio_array.shape[0]
    115             audio_layout = "stereo" if num_channels > 1 else "mono"
--> 116             audio_sample_fmt = container.streams.audio[0].format.name
    117 
    118             format_dtype = np.dtype(audio_format_dtypes[audio_sample_fmt])

IndexError: tuple index out of range

I am not sure if there is something I should change...
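For anyone reading the traceback: the last frame fails on `container.streams.audio[0]`, and the message is the one Python gives when an empty tuple is indexed. A minimal sketch of that failure mode follows. The `audio_streams` value and the `first_stream_or_none` helper are hypothetical stand-ins, not PyAV or torchvision objects, and the idea that the audio stream collection is empty at that point is an inference from the traceback rather than something confirmed here.

```python
# Hypothetical stand-in for `container.streams.audio` when no audio stream
# is present. This is an assumption for illustration, not PyAV's real object.
audio_streams = ()


def first_stream_or_none(streams):
    """Return the first stream, or None when the collection is empty."""
    if len(streams) == 0:
        return None
    return streams[0]


# Indexing the empty tuple directly reproduces the message from the traceback.
try:
    audio_streams[0]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
print(first_stream_or_none(audio_streams))
```

The guarded helper only shows how the lookup could be made defensive; it does not fix the underlying cause, which sits in the library versions involved.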

Laubs commented 1 year ago

I am also getting the same error running it locally.

nateraw commented 1 year ago

Oh darn...looking into it now. Thanks so much for reporting, folks!!

For now, you can use the PyPI version before the latest (0.5.3), and it should work fine.

nateraw commented 1 year ago

Just looked into it a bit in Colab. Mystified as to why this is happening. I am assuming it's a version issue with PyAV/torchvision...I don't think anything else changed in the codebase that would affect this (but obviously I could be wrong).

Writing videos w/o audio works fine, but audio is not working.

The underlying function is torchvision.io.write_video, if anybody else cares to take a look.

nateraw commented 1 year ago

Seems av had a new release on October 17th (10.0.0). That's probably it.

nateraw commented 1 year ago

Downgrading to the version before that worked fine: pip install av==9.2.0. Will pin this version in the requirements here and release a new version of the library.
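To check which av release an environment has before rendering with audio, something like the sketch below may help. It assumes Python 3.8+ (for `importlib.metadata`), and it treats any major version of 10 or above as suspect, which is an assumption: the thread only establishes that 10.0.0 appeared to break audio and that 9.2.0 worked. The helper names `parse_major`, `av_release_is_suspect`, and `installed_av_version` are hypothetical and not part of any library.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_major(version_string):
    """Return the leading integer of a dotted version string such as '9.2.0'."""
    return int(version_string.split(".")[0])


def av_release_is_suspect(version_string, first_bad_major=10):
    """Flag releases at or above the major version reported as failing.

    Assumption: only 10.0.0 was observed to fail in the thread; treating
    every later release as suspect is a conservative guess.
    """
    return parse_major(version_string) >= first_bad_major


def installed_av_version():
    """Return the installed av version string, or None if av is absent."""
    try:
        return version("av")
    except PackageNotFoundError:
        return None


current = installed_av_version()
if current is None:
    print("av is not installed")
elif av_release_is_suspect(current):
    print(f"av {current} may hit the audio error; the workaround above is av==9.2.0")
else:
    print(f"av {current} predates the 10.0.0 release")
```

Running this in a fresh Colab session before calling `walk` with an audio file shows whether the downgrade is needed there.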

juancopi81 commented 1 year ago

Thanks a lot @nateraw!! :) I'll create a new video today