b0b6a commented 7 months ago

Hello,

When I run the following command:

    python inference_for_demo_video.py \
        --wav_path data/audio/acknowledgement_chinese.m4a \
        --style_clip_path data/style_clip/3DMM/M030_front_surprised_level3_001.mat \
        --pose_path data/pose/RichardShelby_front_neutral_level1_001.mat \
        --image_path data/src_img/cropped/zp1.png \
        --disable_img_crop \
        --cfg_scale 1.0 \
        --max_gen_len 30 \
        --output_name acknowledgement_chinese@M030_front_surprised_level3_001@zp1

I get the error:

    ffmpeg version 4.2.2 Copyright (c) 2000-2019 the FFmpeg developers
      built with gcc 7.3.0 (crosstool-NG 1.23.0.449-a04d0)
      configuration: --prefix=/tmp/build/80754af9/ffmpeg_1587154242452/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeho --cc=/tmp/build/80754af9/ffmpeg_1587154242452/_build_env/bin/x86_64-conda_cos6-linux-gnu-cc --disable-doc --enable-avresample --enable-gmp --enable-hardcoded-tables --enable-libfreetype --enable-libvpx --enable-pthreads --enable-libopus --enable-postproc --enable-pic --enable-pthreads --enable-shared --enable-static --enable-version3 --enable-zlib --enable-libmp3lame --disable-nonfree --enable-gpl --enable-gnutls --disable-openssl --enable-libopenh264 --enable-libx264
      libavutil      56. 31.100 / 56. 31.100
      libavcodec     58. 54.100 / 58. 54.100
      libavformat    58. 29.100 / 58. 29.100
      libavdevice    58.  8.100 / 58.  8.100
      libavfilter     7. 57.100 /  7. 57.100
      libavresample   4.  0.  0 /  4.  0.  0
      libswscale      5.  5.100 /  5.  5.100
      libswresample   3.  5.100 /  3.  5.100
      libpostproc    55.  5.100 / 55.  5.100
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'data/audio/acknowledgement_english.m4a':
      Metadata:
        major_brand     : M4A
        minor_version   : 0
        compatible_brands: M4A isommp42
        creation_time   : 2023-12-20T14:25:20.000000Z
        iTunSMPB        : 00000000 00000840 00000000 00000000000C23C0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      Duration: 00:00:16.57, start: 0.044000, bitrate: 246 kb/s
        Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, mono, fltp, 244 kb/s (default)
        Metadata:
          creation_time   : 2023-12-20T14:25:20.000000Z
          handler_name    : Core Media Audio
    Stream mapping:
      Stream #0:0 -> #0:0 (aac (native) -> pcm_s16le (native))
    Press [q] to stop, [?] for help
    -async is forwarded to lavfi similarly to -af aresample=async=1:min_hard_comp=0.100000:first_pts=0.
    Output #0, wav, to 'tmp/acknowledgement_english@M030_front_neutral_level1_001@male_face--device=cpu/acknowledgement_english@M030_front_neutral_level1_001@male_face--device=cpu_16K.wav':
      Metadata:
        major_brand     : M4A
        minor_version   : 0
        compatible_brands: M4A isommp42
        iTunSMPB        : 00000000 00000840 00000000 00000000000C23C0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
        ISFT            : Lavf58.29.100
        Stream #0:0(und): Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s (default)
        Metadata:
          creation_time   : 2023-12-20T14:25:20.000000Z
          handler_name    : Core Media Audio
          encoder         : Lavc58.54.100 pcm_s16le
    size=     518kB time=00:00:16.57 bitrate= 256.0kbits/s speed=1.23e+03x
    video:0kB audio:518kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.014706%
    Some weights of the model checkpoint at jonatasgrosman/wav2vec2-large-xlsr-53-english were not used when initializing Wav2Vec2Model: ['lm_head.weight', 'lm_head.bias']

This IS expected if you are initializing Wav2Vec2Model from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing Wav2Vec2Model from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

    Traceback (most recent call last):
      File "inference_for_demo_video.py", line 224, in <module>
        max_audio_len=args.max_gen_len,
      File "/home/ziyang/anaconda3/envs/dt/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "inference_for_demo_video.py", line 105, in inference_one_video
        ddim_num_step=ddim_num_step,
      File "/home/ziyang/dreamtalk-main/core/networks/diffusion_net.py", line 226, in sample
        ready_style_code=ready_style_code,
      File "/home/ziyang/dreamtalk-main/core/networks/diffusion_net.py", line 170, in ddim_sample
        x_t_double, t=t_tensor_double, **context_double
      File "/home/ziyang/anaconda3/envs/dt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/ziyang/dreamtalk-main/core/networks/diffusion_util.py", line 126, in forward
        style_code = self.style_encoder(style_clip, style_pad_mask)
      File "/home/ziyang/anaconda3/envs/dt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/ziyang/dreamtalk-main/core/networks/generator.py", line 193, in forward
        style_code = self.aggregate_method(permute_style, pad_mask)
      File "/home/ziyang/anaconda3/envs/dt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/ziyang/dreamtalk-main/core/networks/self_attention_pooling.py", line 31, in forward
        att_logits = self.W(batch_rep).squeeze(-1)
      File "/home/ziyang/anaconda3/envs/dt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/ziyang/anaconda3/envs/dt/lib/python3.7/site-packages/torch/nn/modules/container.py", line 119, in forward
        input = module(input)
      File "/home/ziyang/anaconda3/envs/dt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/ziyang/dreamtalk-main/core/networks/mish.py", line 51, in forward
        return mish(input)
    RuntimeError: nvrtc: error: invalid value for --gpu-architecture (-arch)

    nvrtc compilation failed:

    #define NAN __int_as_float(0x7fffffff)
    #define POS_INFINITY __int_as_float(0x7f800000)
    #define NEG_INFINITY __int_as_float(0xff800000)

    template<typename T>
    __device__ T maximum(T a, T b) {
      return isnan(a) ? a : (a > b ? a : b);
    }

    template<typename T>
    __device__ T minimum(T a, T b) {
      return isnan(a) ? a : (a < b ? a : b);
    }

    extern "C" __global__
    void fused_tanh_mul(float* t0, float* t1, float* aten_mul) {
    {
      float v = __ldg(t0 + (((512 * blockIdx.x + threadIdx.x) / 65536) * 65536 + 256 * (((512 * blockIdx.x + threadIdx.x) / 256) % 256)) + (512 * blockIdx.x + threadIdx.x) % 256);
      float v_1 = __ldg(t1 + (((512 * blockIdx.x + threadIdx.x) / 65536) * 65536 + 256 * (((512 * blockIdx.x + threadIdx.x) / 256) % 256)) + (512 * blockIdx.x + threadIdx.x) % 256);
      aten_mul[(((512 * blockIdx.x + threadIdx.x) / 65536) * 65536 + 256 * (((512 * blockIdx.x + threadIdx.x) / 256) % 256)) + (512 * blockIdx.x + threadIdx.x) % 256] = v * (tanhf(v_1));
    }
    }

Is there a problem with my GPU? I have not been able to solve this.

Best regards, Ziyang Jiao
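
On the GPU question: this NVRTC error often shows up when the GPU's compute capability is newer than the architectures the installed PyTorch/CUDA build was compiled for, though nothing in this thread confirms that is the cause here. The following is a minimal sketch of a check, not a definitive diagnosis. `arch_listed` and `describe_local_gpu` are hypothetical helper names introduced for illustration; the `torch.cuda` calls are looked up defensively because `get_arch_list` may not exist on older releases.

```python
import re


def arch_listed(capability, arch_list):
    """Return True if a (major, minor) capability has an exact sm_XY entry.

    Hypothetical helper: it only compares numbers from the build's
    architecture list and does not query NVRTC itself.
    """
    major, minor = capability
    wanted = major * 10 + minor
    built = []
    for arch in arch_list:
        match = re.fullmatch(r"sm_(\d+)", arch)
        if match:
            built.append(int(match.group(1)))
    return wanted in built


def describe_local_gpu():
    """Print the GPU capability next to the build's architecture list."""
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment.")
        return None
    if not torch.cuda.is_available():
        print("CUDA is not available to this PyTorch build.")
        return None
    capability = torch.cuda.get_device_capability(0)
    # get_arch_list() may be missing on older releases, hence the getattr guard.
    get_arch_list = getattr(torch.cuda, "get_arch_list", None)
    arch_list = get_arch_list() if get_arch_list else []
    print("torch:", torch.__version__, "| CUDA:", torch.version.cuda)
    print("GPU capability:", capability, "| built for:", arch_list)
    return arch_listed(capability, arch_list)
```

Calling `describe_local_gpu()` inside the `dt` environment prints both values side by side. A `False` result is only a hint that the build predates the GPU; in that case a PyTorch build made against a newer CUDA toolkit is worth trying.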

murphytju commented 7 months ago

I've hit the same problem and haven't been able to solve it yet. If you find a solution, could you post an update in this issue? Thanks.

b0b6a commented 7 months ago

@murphytju
Of course, but I am still working on it.

SuperMaximus1984 commented 5 months ago

Same problem here. I think it might be related to Torch.
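
The traceback ends in `mish.py`, and the kernel that NVRTC rejects (`fused_tanh_mul`) is one that PyTorch's JIT fuser generates at run time, which fits the Torch theory. If `mish` is a TorchScript function (an assumption; the thread does not show `mish.py`), one thing to try is running the script with scripting disabled through the `PYTORCH_JIT=0` environment variable, so the fuser never invokes NVRTC. Below is a minimal launcher sketch under that assumption. `jit_disabled_env` and `run_without_jit` are hypothetical names, and this is untested against dreamtalk itself.

```python
import os
import subprocess
import sys


def jit_disabled_env(base_env=None):
    """Return a copy of the environment with TorchScript disabled."""
    env = dict(os.environ if base_env is None else base_env)
    # PYTORCH_JIT=0 turns torch.jit.script / torch.jit.trace into no-ops.
    env["PYTORCH_JIT"] = "0"
    return env


def run_without_jit(script_args):
    """Run `python <script_args...>` with PYTORCH_JIT=0; return the exit code."""
    completed = subprocess.run(
        [sys.executable, *script_args],
        env=jit_disabled_env(),
    )
    return completed.returncode
```

For example, `run_without_jit(["inference_for_demo_video.py", "--wav_path", "data/audio/acknowledgement_chinese.m4a"])` followed by the remaining flags from the original command. Exporting `PYTORCH_JIT=0` in the shell before running the script has the same effect. If the error persists, the fused kernel is probably being produced some other way and the PyTorch/CUDA build itself is the more likely culprit.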

ali-vilab / dreamtalk

RuntimeError: nvrtc: error: invalid value for --gpu-architecture (-arch) nvrtc compilation failed: #38
