Open liwu96 opened 7 months ago
I looked at the code. When --asr is set to wav2vec, process.py runs python nerf/asr.py, but that path no longer exists. Should it be changed to nerf_triplane/asr.py? If I don't pass --asr, extract_audio_features fails with the error reported in this issue. I'd appreciate a reply.

def extract_audio_features(path, mode='wav2vec'):
    print(f'[INFO] ===== extract audio labels for {path} =====')
    if mode == 'wav2vec':
        cmd = f'python nerf/asr.py --wav {path} --save_feats'
    else: # deepspeech
        cmd = f'python data_utils/deepspeech_features/extract_ds_features.py --input {path}'
    os.system(cmd)
    print(f'[INFO] ===== extracted audio labels =====')
def extract_audio_features(path, mode='wav2vec'):
    print(f'[INFO] ===== extract audio labels for {path} =====')
    if mode == 'wav2vec':
        # cmd = f'python nerf/asr.py --wav {path} --save_feats'
        cmd = f'python data_utils/wav2vec.py --wav {path} --save_feats'
    else: # deepspeech
        cmd = f'python data_utils/deepspeech_features/extract_ds_features.py --input {path}'
    os.system(cmd)
    print(f'[INFO] ===== extracted audio labels =====')
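Since the thread mentions three possible locations for the wav2vec script, one way to avoid a silent failure from os.system is to check that the script exists before running it. The following is only a sketch under assumptions: the helper name build_audio_feature_cmd and the candidate order are made up for illustration, and it is not confirmed here that nerf_triplane/asr.py accepts the same --wav and --save_feats flags.

```python
import os
import sys

# Candidate wav2vec scripts named in this thread, tried in order.
# Which one exists depends on the checkout; the order is an assumption.
WAV2VEC_CANDIDATES = ("data_utils/wav2vec.py", "nerf_triplane/asr.py", "nerf/asr.py")
DEEPSPEECH_SCRIPT = "data_utils/deepspeech_features/extract_ds_features.py"


def build_audio_feature_cmd(path, mode="wav2vec", repo_root="."):
    """Return the argv list for the feature extractor, or raise if the script is missing."""
    if mode == "wav2vec":
        for rel in WAV2VEC_CANDIDATES:
            script = os.path.join(repo_root, rel)
            if os.path.isfile(script):
                return [sys.executable, script, "--wav", path, "--save_feats"]
        raise FileNotFoundError(
            f"no wav2vec script found under {repo_root!r}: {WAV2VEC_CANDIDATES}"
        )
    # Any other mode falls through to DeepSpeech, as in the original function.
    script = os.path.join(repo_root, DEEPSPEECH_SCRIPT)
    if not os.path.isfile(script):
        raise FileNotFoundError(script)
    return [sys.executable, script, "--input", path]
```

The returned list can be passed to subprocess.run(cmd, check=True), which raises if the extractor exits with a non-zero status instead of continuing as os.system does.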
A question: I get an error when running python data_utils/process.py data//.mp4
Traceback (most recent call last):
  File "/home/work/ER-NeRF/data_utils/deepspeech_features/extract_ds_features.py", line 131, in <module>
    main()
  File "/home/work/ER-NeRF/data_utils/deepspeech_features/extract_ds_features.py", line 107, in main
    extract_features(
  File "/home/work/ER-NeRF/data_utils/deepspeech_features/extract_ds_features.py", line 80, in extract_features
    conv_audios_to_deepspeech(
  File "/home/work/ER-NeRF/data_utils/deepspeech_features/deepspeech_features.py", line 53, in conv_audios_to_deepspeech
    ds_features = pure_conv_audio_to_deepspeech(
  File "/home/work/ER-NeRF/data_utils/deepspeech_features/deepspeech_features.py", line 149, in pure_conv_audio_to_deepspeech
    input_vector = conv_audio_to_deepspeech_input_vector(
  File "/home/work/ER-NeRF/data_utils/deepspeech_features/deepspeech_features.py", line 220, in conv_audio_to_deepspeech_input_vector
    features = np.concatenate((empty_context, features, empty_context))
  File "<__array_function__ internals>", line 180, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 26 and the array at index 1 has size 23
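Reading the last frame of the traceback: the concatenated tuple is (empty_context, features, empty_context), so index 0 (the zero padding) has 26 columns and index 1 (the features) has 23. The feature array therefore has fewer columns than the padding expects; the traceback alone does not say why. The sketch below performs the same padding step but checks the shape first, which makes the mismatch easier to spot. The function name pad_with_context and the default num_context=9 are assumptions for illustration, not taken from the repository.

```python
import numpy as np


def pad_with_context(features, num_context=9, num_cepstrum=26):
    """Zero-pad feature frames on both sides, failing early with a readable message.

    num_cepstrum=26 matches the padding width in the error message;
    num_context=9 is an illustrative assumption.
    """
    if features.ndim != 2 or features.shape[1] != num_cepstrum:
        raise ValueError(
            f"expected features of shape (n_frames, {num_cepstrum}), "
            f"got {features.shape}"
        )
    empty_context = np.zeros((num_context, num_cepstrum), dtype=features.dtype)
    # Same call as in the traceback; it only succeeds when the column counts agree.
    return np.concatenate((empty_context, features, empty_context))
```

Printing features.shape just before the np.concatenate call in deepspeech_features.py gives the same information without changing behavior.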
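Is this error caused by a problem with the audio track in the video? My footage comes from the CCTV news program Xinwen Lianbo (新闻联播). I tried converting the video's audio to a 16000 Hz sample rate with ffmpeg, but it made no difference.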
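To rule the audio file in or out, it may help to confirm what the converted WAV actually contains. The sketch below builds an ffmpeg command for 16 kHz mono 16-bit PCM and reads the WAV header back with the standard-library wave module. The helper names are hypothetical, and whether channel count or sample width matters for this pipeline is not established in this thread.

```python
import wave


def ffmpeg_resample_cmd(src, dst, sample_rate=16000):
    """argv for ffmpeg: drop video (-vn), mono (-ac 1), 16-bit PCM, target rate (-ar)."""
    return ["ffmpeg", "-y", "-i", src, "-vn", "-ac", "1",
            "-ar", str(sample_rate), "-acodec", "pcm_s16le", dst]


def describe_wav(path):
    """Read basic format fields from a PCM WAV header."""
    with wave.open(path, "rb") as wf:
        return {
            "sample_rate": wf.getframerate(),
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "n_frames": wf.getnframes(),
        }
```

If describe_wav reports 16000 Hz, one channel, and two bytes per sample, the file format itself is an unlikely cause and the shape of the feature array is the next thing to inspect.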