UttaranB127 / speech2affective_gestures

This is the official implementation of the paper "Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning".
https://gamma.umd.edu/s2ag/
MIT License

Error when running main_v2 #21

Open · zhewei-mt opened 1 year ago

zhewei-mt commented 1 year ago

Hello, I am trying to run your great work with the command `python main_v2.py -c config/multimodal_context_v2.yml` but got an error:

```
  File "/home/zhewei.qiu/anaconda3/envs/s2ag-env/lib/python3.7/site-packages/torch/nn/functional.py", line 1237, in leaky_relu
    result = torch._C._nn.leaky_relu(input, negative_slope)
RuntimeError: CUDA error: no kernel image is available for execution on the device
```

I googled it and downgraded my PyTorch version to 1.5.0, but the error still persists. Do you know how to fix this error? Any help would be appreciated!

My torch versions:

```
pytorch      1.5.0  py3.7_cuda10.2.89_cudnn7.6.5_0  pytorch
torchvision  0.6.0  py37_cu102                      pytorch
```
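For context, this error typically means the GPU's compute capability is newer than any architecture the installed PyTorch wheel was built for; cu102 wheels stop at sm_75, so Ampere and newer cards need a CUDA 11.x build. A quick diagnostic sketch (note that `torch.cuda.get_arch_list()` only exists in recent PyTorch releases):

```python
# Diagnostic for "no kernel image is available": compare the GPU's
# compute capability against the CUDA architectures compiled into
# the installed PyTorch build.
import torch

print(torch.__version__, torch.version.cuda)
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
    print(torch.cuda.get_device_capability(0))  # e.g. (8, 6) on an RTX 30xx
    # get_arch_list() is available in recent PyTorch releases
    print(torch.cuda.get_arch_list())           # e.g. ['sm_37', ..., 'sm_75']
```

If the capability reported for the device is missing from the architecture list, the fix is a PyTorch build compiled for a newer CUDA toolkit, which is what resolved it below.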

UttaranB127 commented 1 year ago

Unfortunately, I cannot help much with versioning or cuda backend issues. If it helps, these are the configs I have:

zhewei-mt commented 1 year ago

Thank you for your reply! I upgraded all related versions to be consistent with CUDA 11.7 and that problem is gone. But at the very last step, another error occurs when executing the following code: `ani.save(video_path, fps=15, dpi=80)`. Here is the error message:

```
MovieWriter stderr:
[libopenh264 @ 0x55eec1dec300] Incorrect library version loaded
Error initializing output stream 0:0 -- Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
```

```
Traceback (most recent call last):
  File "/home/zhewei.qiu/anaconda3/envs/s2ag-env/lib/python3.7/site-packages/matplotlib/animation.py", line 234, in saving
    yield self
  File "/home/zhewei.qiu/anaconda3/envs/s2ag-env/lib/python3.7/site-packages/matplotlib/animation.py", line 1093, in save
    writer.grab_frame(**savefig_kwargs)
  File "/home/zhewei.qiu/anaconda3/envs/s2ag-env/lib/python3.7/site-packages/matplotlib/animation.py", line 352, in grab_frame
    dpi=self.dpi, **savefig_kwargs)
  File "/home/zhewei.qiu/anaconda3/envs/s2ag-env/lib/python3.7/site-packages/matplotlib/figure.py", line 3058, in savefig
    self.canvas.print_figure(fname, **kwargs)
  File "/home/zhewei.qiu/anaconda3/envs/s2ag-env/lib/python3.7/site-packages/matplotlib/backend_bases.py", line 2325, in print_figure
    **kwargs)
  File "/home/zhewei.qiu/anaconda3/envs/s2ag-env/lib/python3.7/site-packages/matplotlib/backend_bases.py", line 1648, in wrapper
    return func(*args, **kwargs)
  File "/home/zhewei.qiu/anaconda3/envs/s2ag-env/lib/python3.7/site-packages/matplotlib/_api/deprecation.py", line 415, in wrapper
    return func(*inner_args, **inner_kwargs)
  File "/home/zhewei.qiu/anaconda3/envs/s2ag-env/lib/python3.7/site-packages/matplotlib/backends/backend_agg.py", line 486, in print_raw
    fh.write(renderer.buffer_rgba())
BrokenPipeError: [Errno 32] Broken pipe
```

During handling of the above exception, another exception occurred:

```
Traceback (most recent call last):
  File "main_v2.py", line 149, in <module>
    s2ag_epoch=290, samples=samples, make_video=True, save_pkl=True)
  File "/home/zhewei.qiu/audio2body/3d/speech2affective_gestures/processor_v2.py", line 1564, in generate_gestures_by_dataset
    make_video=make_video, save_pkl=save_pkl)
  File "/home/zhewei.qiu/audio2body/3d/speech2affective_gestures/processor_v2.py", line 1408, in render_clip
    delete_audio_file=False)
  File "/home/zhewei.qiu/audio2body/3d/speech2affective_gestures/utils/gen_utils.py", line 128, in create_video_and_save
    ani.save(video_path, fps=15, dpi=80)  # dpi 150 for a higher resolution
  File "/home/zhewei.qiu/anaconda3/envs/s2ag-env/lib/python3.7/site-packages/matplotlib/animation.py", line 1093, in save
    writer.grab_frame(**savefig_kwargs)
  File "/home/zhewei.qiu/anaconda3/envs/s2ag-env/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/zhewei.qiu/anaconda3/envs/s2ag-env/lib/python3.7/site-packages/matplotlib/animation.py", line 236, in saving
    self.finish()
  File "/home/zhewei.qiu/anaconda3/envs/s2ag-env/lib/python3.7/site-packages/matplotlib/animation.py", line 342, in finish
    self._cleanup()  # Inline _cleanup() once cleanup() is removed.
  File "/home/zhewei.qiu/anaconda3/envs/s2ag-env/lib/python3.7/site-packages/matplotlib/animation.py", line 374, in _cleanup
    self._proc.returncode, self._proc.args, out, err)
subprocess.CalledProcessError: Command '['ffmpeg', '-f', 'rawvideo', '-vcodec', 'rawvideo', '-s', '960x320', '-pix_fmt', 'rgba', '-r', '15', '-loglevel', 'error', '-i', 'pipe:', '-vcodec', 'h264', '-pix_fmt', 'yuv420p', '-y', '/home/zhewei.qiu/audio2body/3d/speech2affective_gestures/outputs/genea_challenge_2020/videos_trimodal_style/temp_TestSeq001_s0_0.00_128.83_290_0.mp4']' returned non-zero exit status 1.
```

Is that because of the matplotlib version? Can you share which version of matplotlib you use?

UttaranB127 commented 1 year ago

Yes, seems like another version issue. I am using matplotlib==3.5.2.
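For reference, the libopenh264 "Incorrect library version loaded" message usually comes from the conda-packaged ffmpeg, and is often resolved by updating the conda ffmpeg/openh264 packages or by pointing matplotlib at a system ffmpeg. A minimal sketch of the latter (the binary path is an assumption; check `which ffmpeg`):

```python
# Sketch: use a system ffmpeg instead of the conda-provided one so
# matplotlib's FFMpegWriter does not load the broken libopenh264.
# '/usr/bin/ffmpeg' is an assumption; verify it with `which ffmpeg`.
import matplotlib
matplotlib.rcParams['animation.ffmpeg_path'] = '/usr/bin/ffmpeg'

# After this, ani.save(video_path, fps=15, dpi=80) in
# utils/gen_utils.py would spawn that binary.
```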

zhewei-mt commented 1 year ago

There is something weird happening. I downgraded matplotlib from 3.5.3 to 3.5.2 and still get the same error. But when I run in debug mode in VS Code, the script seems to run successfully and I get the .wav and .mp4 files. Also, can you provide a pure inference script, e.g., one that takes audio, text, and identity as input and outputs the predicted 3D positions of all joints?

UttaranB127 commented 1 year ago

It's non-trivial to prepare a "pure" inference script because of all the library, path, and data dependencies. But if you run generate_gestures_by_dataset (beginning line 1441) in processor_v2.py with the dataset set to 'ted_db', that's essentially the default inference script.
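For readers wanting to script that call, here is a rough sketch pieced together from the traceback earlier in this thread. Only `s2ag_epoch`, `samples`, `make_video`, and `save_pkl` are visible there; the `dataset` keyword and the processor object are assumptions, so treat this as pseudocode against the real signature in processor_v2.py:

```python
def run_default_inference(processor, samples):
    """Sketch of the default inference path described above.

    `processor` is assumed to be the processor object built in
    main_v2.py; keyword names other than those visible in the
    traceback (s2ag_epoch, samples, make_video, save_pkl) are
    assumptions, not verified API.
    """
    processor.generate_gestures_by_dataset(
        dataset='ted_db',   # assumption: per the comment above
        s2ag_epoch=290,     # checkpoint epoch seen in the traceback
        samples=samples,    # clips to synthesize
        make_video=True,    # render .mp4 via matplotlib + ffmpeg
        save_pkl=True,      # also dump the predicted joint positions
    )
```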

zhewei-mt commented 1 year ago

Got you. And is there a way to get rotation information from the 3D joint positions output by your model, or do you plan to use rotation information as supervision? 3D joint positions are not friendly to existing render engines such as UE and Blender.

UttaranB127 commented 1 year ago

A good question! I would recommend checking out our Blender files inside the blender folder, where we essentially use a form of IK to obtain the rotations and render the motions on 3D meshes. Our approach here is similar to that of https://github.com/ai4r/Gesture-Generation-from-Trimodal-Context, so you can check out their paper and Blender project as well for additional context.
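For anyone exploring this outside the provided Blender files, below is a minimal generic sketch (an illustration of the idea, not the repo's code) of recovering a per-bone rotation by aligning a rest-pose bone direction with the direction implied by the predicted joint positions. The bone and rest direction in the example are hypothetical:

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def bone_rotation(rest_dir, target_dir):
    """Rotation taking the rest-pose bone direction onto the predicted one."""
    a = np.asarray(rest_dir, dtype=float)
    b = np.asarray(target_dir, dtype=float)
    a /= np.linalg.norm(a)
    b /= np.linalg.norm(b)
    axis = np.cross(a, b)
    norm = np.linalg.norm(axis)
    if norm < 1e-8:
        if np.dot(a, b) > 0:
            return R.identity()          # already aligned
        # anti-parallel: rotate pi about any axis perpendicular to a
        perp = np.cross(a, [1.0, 0.0, 0.0])
        if np.linalg.norm(perp) < 1e-8:  # a happens to lie along x
            perp = np.cross(a, [0.0, 1.0, 0.0])
        return R.from_rotvec(np.pi * perp / np.linalg.norm(perp))
    angle = np.arctan2(norm, np.dot(a, b))
    return R.from_rotvec(axis / norm * angle)

# Example: a shoulder->elbow bone pointing straight down in the rest
# pose, with the predicted bone direction taken from joint positions.
rest = [0.0, -1.0, 0.0]
pred = [0.3, -0.9, 0.1]
print(bone_rotation(rest, pred).as_quat())  # quaternion as (x, y, z, w)
```

A full retargeting pass would apply this per bone down the kinematic chain, composing each parent's rotation first, which is roughly what the Blender scripts automate.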

zhewei-mt commented 1 year ago

I tried the repo you mentioned and got the same error when making the result video. It seems your repo and theirs are closely related. I tried different versions of matplotlib, but the error is still there.

UttaranB127 commented 1 year ago

Is it the same situation where things run in debug mode but not otherwise? Are you able to generate the pose outputs? I'm not sure what the exact issue could be (and I can't seem to replicate it), but maybe you can try other animation scripts with the generated pose outputs.
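One such alternative, as a hedged sketch: bypass matplotlib's pipe-based FFMpegWriter (the step that raised BrokenPipeError above) by saving PNG frames to disk and encoding them with ffmpeg in a second step. The toy animation below is self-contained; the actual drawing logic would need to be adapted from utils/gen_utils.py:

```python
# Workaround sketch: dump PNG frames, then encode them with ffmpeg,
# instead of piping raw frames into a long-lived ffmpeg process.
import os
import subprocess

import matplotlib
matplotlib.use('Agg')  # headless backend
import matplotlib.pyplot as plt
import numpy as np

os.makedirs('frames', exist_ok=True)
x = np.linspace(0, 2 * np.pi, 200)
fig, ax = plt.subplots(figsize=(4, 3))
for i in range(30):
    ax.clear()
    ax.plot(x, np.sin(x + i / 5.0))   # placeholder for the pose drawing
    fig.savefig(f'frames/{i:05d}.png', dpi=80)
plt.close(fig)

# Mirrors the ffmpeg flags from the traceback, minus the raw-video pipe.
subprocess.run(['ffmpeg', '-y', '-r', '15', '-i', 'frames/%05d.png',
                '-vcodec', 'h264', '-pix_fmt', 'yuv420p', 'out.mp4'],
               check=True)
```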