josebeo2016 / biosegment

The supporting project for BTS-E

Skipping Audios #5

Closed HildaNya closed 2 weeks ago

HildaNya commented 1 month ago

Hi, I tried replicating your results and the embedding generator works perfectly on ASV and TIMIT datasets. However, when I tried using it on LJSpeech (https://huggingface.co/datasets/flexthink/ljspeech), the embedding generator only works on part of the dataset and skips over samples. For the skipped (unsuccessful) audios, the failure seems to happen when going through VectorDataSource from GMM_breath. The error message is "negative dimensions are not allowed". These skipped audios don't have any special patterns: in terms of duration, sample rate, and encoder, they don't differ from the successful audios.

I know this might be a very specific bug on my part. But I was wondering if similar situations happened when you were developing the system?
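One way to double-check that the skipped and successful files really share the same format is to read the WAV headers directly. The sketch below uses only Python's standard `wave` module; the helper names `wav_format` and `summarize_formats` are hypothetical and not part of biosegment. Note that `wave` only reads PCM WAV files, so a `wave.Error` on a file is itself a hint that its encoding differs.

```python
import wave
from collections import Counter


def wav_format(path):
    """Return (channels, sample_width_bytes, sample_rate, n_frames) from a WAV header."""
    with wave.open(path, "rb") as wf:
        return (
            wf.getnchannels(),
            wf.getsampwidth(),
            wf.getframerate(),
            wf.getnframes(),
        )


def summarize_formats(paths):
    """Count how many files share each (channels, sample_width, sample_rate) combination."""
    return Counter(wav_format(p)[:3] for p in paths)
```

Running `summarize_formats` separately on the skipped files and on the successful ones, and comparing the two counters, would show whether any header-level property separates the two groups.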

josebeo2016 commented 2 weeks ago

Sorry for the late reply. I think the problem is in the way libsndfile loads the audio. Please use ffmpeg to convert your audio to the same format as the ASVspoof data (in terms of bitrate, sample rate, and PCM encoding) before running your experiment.
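A minimal sketch of such a conversion, driven from Python, might look like the following. The target values (16 kHz, mono, 16-bit PCM WAV) are an assumption, not something stated in this thread; check them against your own copy of the ASVspoof data. The function names `build_ffmpeg_cmd` and `convert` are hypothetical helpers, and `ffmpeg` must be on your PATH for `convert` to work.

```python
import subprocess


def build_ffmpeg_cmd(src, dst, sample_rate=16000, channels=1):
    """Build an ffmpeg command that resamples src into 16-bit PCM at dst.

    The defaults (16 kHz, mono) are assumed targets; adjust them to
    match the files the pipeline already handles correctly.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", str(src),           # input file
        "-ar", str(sample_rate),  # output sample rate in Hz
        "-ac", str(channels),     # number of output channels
        "-c:a", "pcm_s16le",      # signed 16-bit little-endian PCM
        str(dst),
    ]


def convert(src, dst, **kwargs):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_cmd(src, dst, **kwargs), check=True)
```

For example, `convert("LJ001-0001.wav", "converted/LJ001-0001.wav")` would write a resampled copy, assuming the `converted` directory already exists; the file names here are only illustrative.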