interactiveaudiolab / ppgs

High-Fidelity Neural Phonetic Posteriorgrams
https://maxrmorrison.com/sites/ppgs
MIT License

from_audio and from_files_to_files produce outputs of different lengths. #15

Closed lixuyuan102 closed 2 months ago

lixuyuan102 commented 2 months ago

This is my code:

path = "/home/lixuyuan/vox2_audio/vox2/clean//id01660/PLNtqNHjI7Y+00048.flac"  # root + f
loudness, pitch, periodicity, ppg, speaker = from_file(
    path,
    2,
    features=['loudness', 'pitch', 'periodicity', 'ppg', "speaker"])
print(ppg.size())
print(loudness.shape)
ppgs.from_files_to_files(
    [path],
    ["aahere.pt"],
    num_workers=1,
    max_frames=5000,
    gpu=2,
    checkpoint="./ckpt/ppgs/mel-800k.pt")
data = torch.load("aahere.pt")
print(data.shape)

This is the output:

torch.Size([1, 40, 815])
torch.Size([8, 815])
ppgs: 100%|██████████████████████| 1/1 [00:00<00:00, 73.43it/s]
torch.Size([40, 947])

It seems that from_files_to_files() is receiving audio at the wrong sample rate, or something similar.

lixuyuan102 commented 2 months ago

Sorry, I overlooked the resampling step in from_audio.
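
For anyone who runs into a similar mismatch, here is a minimal sketch of how resampling can change the number of frames for the same audio. The helper `frame_count` and every sample rate and hop size below are hypothetical values chosen for illustration; they are not taken from the ppgs code or from the numbers in this thread.

```python
def frame_count(num_samples, source_rate, target_rate, hopsize):
    """Approximate frame count after resampling and framing.

    Hypothetical helper for illustration: the audio is resampled from
    source_rate to target_rate, then one frame is produced per hopsize
    samples. Real pipelines may pad differently and be off by a frame.
    """
    resampled_samples = round(num_samples * target_rate / source_rate)
    return resampled_samples // hopsize


# Assumed example: 10 seconds of audio stored at 16 kHz
num_samples = 160000

# Pipeline A (assumed): no resampling, hop of 160 samples
frames_a = frame_count(num_samples, 16000, 16000, 160)

# Pipeline B (assumed): resample to 22.05 kHz, hop of 256 samples
frames_b = frame_count(num_samples, 16000, 22050, 256)

print(frames_a, frames_b)  # 1000 861
```

Under these assumptions, the two pipelines disagree on the time dimension even though they start from the same file. If two features need to line up frame by frame, both must be computed at the same sample rate and hop size, or one must be interpolated to the other's length.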