Open sarah-zhu opened 6 years ago
Hi, we recently found out that the audio part does not run on PyTorch >= 0.4.0. Please make sure your environment matches ours (PyTorch 0.2.0 with Python 2.7).
@sarah-zhu Sorry that there was a mistake in our testing data loading code, we have now fixed it. Thank you for pointing it out.
@Hangz-nju-cuhk Thanks for fixing it! Is the audio part runnable on Pytorch version >=0.4.0 now?
@sarah-zhu there could be a bug on version >= 0.4.0 that needs fixing. We recommend using Pytorch version < 0.4.0 for now. I will be working on releasing the new version for Pytorch 0.4.1 in a month.
Sorry, when I run the testing script to generate videos from audio using the modified code, I still get the following error:
Traceback (most recent call last):
File "test_all.py", line 34, in
When I remove the code
`if require_audio:
    k4 = 0
    for mfcc_num in pair:
        # for s in range(-1,2):
        mfcc_path = os.path.join(path, str(mfcc_num) + '.bin')
        if os.path.exists(mfcc_path):
            mfcc = np.fromfile(mfcc_path)
            mfcc = mfcc.reshape(20, 12)
            mfcc_block[k4, 0, :, :] = mfcc
            k4 += 1
        # else:
        #     raise ("mfccs = 0")`
in "Dataloader/Test_load_audio.py", the code works!
@duantiantian I am a bit confused: I just checked the code and it runs normally on my computer. The error means no audio files can be found, and you just removed the part that loads the audio files, so how can it work? ><
@Hangz-nju-cuhk Thanks. I tested and it works on 0.4.0. Another problem: does this model also work for sentence level audio inputs?
@sarah-zhu I am not sure what you mean by "sentence-level audio inputs"; perhaps our kind of method uses segment-level audio inputs? We crop the MFCC by time shifts to align it with the images, so if this process can be done online in the dataloader, then it should be possible to work with a sentence-level MFCC feature.
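For readers wondering what "cropping by time shifts" online could look like, here is a minimal sketch rather than the authors' code. It assumes the sentence-level MFCC is a list of frames with 12 coefficients each, and that each segment spans 20 frames (read off `reshape(20, 12)` in the loader, which is an assumption about which axis is time). The function name `crop_mfcc_segments` and the hop of 4 frames are hypothetical.

```python
def crop_mfcc_segments(mfcc_frames, window=20, hop=4):
    """Slide a fixed-size window over a sentence-level MFCC sequence.

    mfcc_frames: list of frames, each a list of coefficients.
    window: frames per segment (assumed 20, from reshape(20, 12)).
    hop: frames to advance per segment (hypothetical value).
    """
    segments = []
    # Only full windows are kept; a trailing partial window is dropped.
    for start in range(0, len(mfcc_frames) - window + 1, hop):
        segments.append(mfcc_frames[start:start + window])
    return segments


# Example: 36 dummy frames of 12 coefficients each.
frames = [[float(i)] * 12 for i in range(36)]
segments = crop_mfcc_segments(frames)
```

Each returned segment could then be paired with one video frame; the right hop depends on the MFCC frame rate and the video frame rate, which the thread does not state.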
@Hangz-nju-cuhk Sorry, I only commented out "else: raise ("mfccs = 0")". I thought that after loading all the audio files, the code would raise an error rather than break out of the for loop when the next audio file does not exist.
Thank you very much for your reply!!! ^-^
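The control flow discussed above (stop at the first missing file, and raise only when nothing was found at all) might be sketched as follows. This is an illustration, not the repository's loader: `collect_mfcc_paths` is a hypothetical helper, and breaking at the first gap is an assumed design choice.

```python
import os


def collect_mfcc_paths(path, pair):
    """Return the existing '<n>.bin' paths for the indices in `pair`, in order.

    Stops at the first missing file and raises only if none were found.
    """
    found = []
    for mfcc_num in pair:
        mfcc_path = os.path.join(path, str(mfcc_num) + '.bin')
        if not os.path.exists(mfcc_path):
            break  # no more audio segments: stop instead of raising
        found.append(mfcc_path)
    if not found:
        # Raise a real exception class so the message is actually displayed.
        raise IOError("no MFCC .bin files found in %r" % path)
    return found
```

With this shape, asking for more segments than exist returns the ones that are present, while an empty or wrong `--test_root` still fails loudly.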
@Hangz-nju-cuhk Hi, I tested your model on both the LRW and Voxceleb datasets using the provided MFCC extraction and face alignment code. However, I found that the results are more accurate on Voxceleb than on LRW. Is the provided model trained on the Voxceleb dataset?
@sarah-zhu I am glad that you did further experiments!
If I remember correctly, this model is trained on the training set of LRW only, as were the results provided in our paper and videos. However, we did further experiments to train our model on a small sample of Voxceleb (from the first 100 classes), and we did not find any difference in our testing results on Voxceleb or LRW compared with before.
I think the main reason is that Voxceleb normally provides cleaner audio clips than LRW, and humans are less sensitive to the long-term results in Voxceleb than to the shorter ones in LRW.
I'm facing a similar issue. Using pytorch (0.2.0) and python 2.7. The only difference is that I'm not using CUDA. Can this work without CUDA?
@Hangz-nju-cuhk I am getting this error when using the audio .bin files from Google Drive:
Traceback (most recent call last):
File "test_all.py", line 34, in
@ak9250 @duantiantian Changing raise ("mfccs = 0") to raise Exception("mfccs = 0") can fix the bug.
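For context on why that change matters: Python does not allow raising a plain string, so `raise ("mfccs = 0")` fails with a `TypeError` and the intended message is lost. The small sketch below shows the difference; the function names are made up for the demonstration. The change makes the error readable, but the loader still stops if it reaches that line.

```python
def raise_plain_string():
    raise ("mfccs = 0")  # a str is not an exception class


def raise_real_exception():
    raise Exception("mfccs = 0")


def describe(func):
    """Call func and report the exception type and message it raised."""
    try:
        func()
    except Exception as err:
        return type(err).__name__, str(err)


plain = describe(raise_plain_string)     # a TypeError, message lost
proper = describe(raise_real_exception)  # an Exception carrying the message
```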
Hello, the problem is not the code but the command line.
Change from
python test_all.py --test_root ./0572_0019_0003/audio --test_type audio --test_audio_video_length 99 --test_resume_path ./checkpoints/101_DAVS_checkpoint.pth.tar
to
python test_all.py --test_root ./0572_0019_0003/audio --test_type audio --test_audio_video_length 98 --test_resume_path ./checkpoints/101_DAVS_checkpoint.pth.tar
Because there are only 98 mfcc bin files
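To avoid guessing the value for `--test_audio_video_length`, the number of MFCC files can be counted first. A minimal sketch, assuming the files sit directly in the `--test_root` folder and are named `<integer>.bin`; `count_mfcc_bins` is a hypothetical helper, not part of the repository.

```python
import os


def count_mfcc_bins(audio_dir):
    """Count files named '<integer>.bin' directly inside audio_dir."""
    count = 0
    for name in os.listdir(audio_dir):
        stem, ext = os.path.splitext(name)
        if ext == '.bin' and stem.isdigit():
            count += 1
    return count
```

For the folder in the command above, `count_mfcc_bins('./0572_0019_0003/audio')` would give the number to pass as `--test_audio_video_length`.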
Hi~ @Hangz-nju-cuhk When I use 'audio' as the test_type, I get a static video every time: all the generated images are the same. However, when I use 'video' as the test_type, it works fine. Do you have any idea why? Thank you!