Tokenizer Errors On MLS Spanish Dataset

I ran lhotse prepare on the MLS Spanish dataset and created the manifest file. I also added a way to select MLS dataset partitions in bin/tokenizer.py, and that part works fine. However, when I run the tokenizer on CUDA I get the following error: cuda error

If I run the tokenizer without CUDA, I get the following assertion errors, and the script stops after processing only one partition: cpu error

Can you tell me what output I should expect after running the tokenizer? I get a single folder with a bunch of h5 files, each only 800 bytes.
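
For anyone trying to reproduce this, a minimal sketch of how the tiny output files could be listed. The function name, the 1024-byte threshold, and the example directory are assumptions for illustration and are not part of the vall-e repository; files this small probably contain no stored features.

```python
from pathlib import Path


def find_small_h5_files(output_dir, min_bytes=1024):
    """Return (path, size_in_bytes) for every .h5 file smaller than min_bytes.

    min_bytes is an arbitrary, hypothetical threshold: anything below it is
    treated as "probably empty" and worth a closer look.
    """
    small_files = []
    # rglob walks the output folder recursively, so nested partitions are covered.
    for path in sorted(Path(output_dir).rglob("*.h5")):
        size = path.stat().st_size
        if size < min_bytes:
            small_files.append((path, size))
    return small_files


# Example usage (the directory name is a placeholder):
# for path, size in find_small_h5_files("data/tokenized"):
#     print(f"{path}: {size} bytes")
```

If every file in the folder shows up in this list, the tokenizer most likely failed before writing any features, which would match the assertion errors above.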

lifeiteng/vall-e #164