Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
MIT License
Pre-training SpeechT5: problems with batch_sampler in multitask_dataset. Should I generate the idx and bin files one wav at a time, or put all of the data into just two files (one idx and one bin)? #53
I found that the error is raised while batching samples from the .idx and .bin data produced by fairseq-preprocess. My batch_sampler contains 455 items, and each item has 6 entries except the last one.
To get it to run, I tried dropping the last row:
batch_sampler = batch_sampler[:-2]
But then I got this:
I think it is caused by the function np.random.choice(). From that I infer that batch_sampler should be a list containing only one array, right?
But I have no idea how that would come about. Should the idx and bin files contain all of the training data, or just one row of it?
What exactly are the objects that batch_sampler samples from?
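For reference, the np.random.choice() behaviour described above can be reproduced outside of SpeechT5. The snippet below is only a minimal sketch under assumptions: the toy `batches` list, the length of its last batch, and the helper `sample_batches` are hypothetical stand-ins and are not taken from the SpeechT5 code. It shows that np.random.choice() rejects a 2-D array of batches, and that sampling batch positions instead works even when the last batch is shorter.

```python
import numpy as np

def sample_batches(batches, num_samples, seed=0):
    """Pick num_samples batches (with replacement) from a list of index lists.

    Hypothetical helper for illustration only, not part of SpeechT5 or fairseq.
    """
    rng = np.random.default_rng(seed)
    # Sample positions rather than the batches themselves,
    # so batches of different lengths cause no trouble.
    picked = rng.choice(len(batches), size=num_samples, replace=True)
    return [batches[i] for i in picked]

# Toy stand-in for the batch_sampler in the post: 454 batches of 6 indices
# plus one shorter final batch (its real length is an assumption here).
batches = [list(range(i * 6, i * 6 + 6)) for i in range(454)] + [[2724, 2725]]

# Once the ragged batch is removed, the rest converts to a 2-D array,
# and np.random.choice() only accepts 1-D input.
full = np.array(batches[:-1])  # shape (454, 6)
try:
    np.random.choice(full)
except ValueError as err:
    print("np.random.choice failed:", err)

sampled = sample_batches(batches, num_samples=10)
print(len(sampled), "batches sampled")
```

Whether sampling positions like this matches what multitask_dataset intends is a separate question; the sketch only isolates the NumPy side of the problem.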
Here is what my directory looks like:
I would really appreciate it if you could explain this. Thank you!
Hi, I want to pre-train a model using the SpeechT5 architecture. I followed the scripts you give here: https://github.com/microsoft/SpeechT5/tree/main/SpeechT5#data-preparation. But I wonder whether fairseq-preprocess imposes any restrictions when preparing the data, because I ran into this error.