fudan-generative-vision / hallo

Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
https://fudan-generative-vision.github.io/hallo/
MIT License

AttributeError: 'tuple' object has no attribute 'shape' #145

Open A-2-H opened 2 weeks ago

A-2-H commented 2 weeks ago

I am using WSL (Ubuntu 22.04.4 LTS) and I am running into this error:

The config attributes {'center_input_sample': False, 'out_channels': 4} were passed to UNet2DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.
Some weights of the model checkpoint were not used when initializing UNet2DConditionModel:
 ['conv_norm_out.bias, conv_norm_out.weight, conv_out.bias, conv_out.weight']
INFO:hallo.models.unet_3d:loaded temporal unet's pretrained weights from pretrained_models/stable-diffusion-v1-5/unet ...
The config attributes {'center_input_sample': False} were passed to UNet3DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.
Load motion module params from pretrained_models/motion_module/mm_sd_v15_v2.ckpt
INFO:hallo.models.unet_3d:Loaded 453.20928M-parameter motion module
loaded weight from  ./pretrained_models/hallo/net.pth
Traceback (most recent call last):
  File "/mnt/s/aiprograms/hallo-1.0.0/scripts/inference.py", line 424, in <module>
    inference_process(
  File "/mnt/s/aiprograms/hallo-1.0.0/scripts/inference.py", line 300, in inference_process
    audio_emb = process_audio_emb(audio_emb)
  File "/mnt/s/aiprograms/hallo-1.0.0/scripts/inference.py", line 109, in process_audio_emb
    for i in range(audio_emb.shape[0]):
AttributeError: 'tuple' object has no attribute 'shape'

I also get the same error in an Anaconda environment on Windows.

crystallee-ai commented 1 week ago

Please recheck the usability of the driving audio. If you find no issues with the audio, please upload more runtime logs.
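For anyone collecting the extra logs, it may help to record what `audio_emb` actually is right before `process_audio_emb(audio_emb)` is called in scripts/inference.py. Below is a minimal sketch of such a diagnostic; `describe` is a hypothetical helper, not part of hallo, and it only assumes that tensor-like objects expose a `.shape` attribute.

```python
def describe(value, name="value", depth=0, max_depth=2):
    """Return text lines summarising the type, length and shape of a value.

    Hypothetical debugging helper (not part of hallo). Tuples and lists are
    expanded element by element, down to max_depth levels.
    """
    indent = "  " * depth
    line = f"{indent}{name}: {type(value).__name__}"

    # Tensor-like objects (torch, numpy) expose .shape; plain tuples do not.
    shape = getattr(value, "shape", None)
    if shape is not None:
        line += f", shape={tuple(shape)}"

    is_sequence = isinstance(value, (tuple, list))
    if is_sequence:
        line += f", len={len(value)}"

    lines = [line]
    if is_sequence and depth < max_depth:
        for index, item in enumerate(value):
            lines.extend(describe(item, f"{name}[{index}]", depth + 1, max_depth))
    return lines
```

Adding `print("\n".join(describe(audio_emb, "audio_emb")))` just before the failing call would show whether the tuple wraps a tensor plus some extra value, which is more useful in a report than the traceback alone.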

A-2-H commented 1 week ago

Hi! Thank you for responding. There is no problem with the audio: I generated videos with this same audio before I reinstalled the environment, and I get the same error with different audio. I managed to fix it by downloading the hallo-windows portable version and replacing the site-packages in my environment with the one from the hallo-win-portable zip file. After that I reinstalled PyTorch and onnxruntime, and it worked. So I assume the cause is the packages in my environment: some newer package versions don't work with hallo, but I couldn't find which ones. I should add that I installed all the necessary dependencies from "requirements.txt", yet it still pulled in some package in a version that doesn't work with the hallo script.

Do you need the full log? I can still provide it from WSL, because I haven't fixed it there. I can also post my pip list output if that would help fix the problem.
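One way to narrow down which package drifted is to compare the versions pinned in requirements.txt with what is actually installed in the broken environment. The sketch below uses only the standard library (`importlib.metadata`); `parse_pins` and `find_mismatches` are hypothetical helper names, and it assumes the file pins packages with `==`. Lines without an exact pin are skipped.

```python
import re
from importlib import metadata


def parse_pins(requirements_text):
    """Extract {package_name: version} for lines of the form name==version."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        match = re.fullmatch(
            r"([A-Za-z0-9._-]+)(?:\[[^\]]*\])?\s*==\s*([^\s;]+).*", line
        )
        if match:
            name = match.group(1).lower().replace("_", "-")
            pins[name] = match.group(2)
    return pins


def find_mismatches(pins, installed_version=metadata.version):
    """Return {name: (pinned, installed)} where the two versions differ.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Running `find_mismatches(parse_pins(text))` on the contents of requirements.txt in the failing environment lists every pinned package whose installed version differs. It will not catch unpinned or transitive dependencies, so comparing `pip freeze` output between the working and the broken environment is still worth doing.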