Open ngoel17 opened 1 year ago
These lines are the key:
  File "/mnt/dsk1/22feb/lhotse/lhotse/features/io.py", line 765, in <listcomp>
    decompressed_chunks = [lilcom.decompress(data) for data in chunk_data]
  File "~/anaconda3/envs/k2_feb23/lib/python3.9/site-packages/lilcom/lilcom_interface.py", line 110, in decompress
    raise ValueError("Something went wrong in decompression (likely bad data): "
ValueError: Something went wrong in decompression (likely bad data): decompress_float returned 7
I think you may have corrupted data. Did all the feature extraction jobs / scripts complete successfully?
Yes. Feature extraction scripts ran completely and did not throw any errors. However, we get exactly the same messages on two other datasets also.
(Email reply to Piotr Żelasko's comment of Tue, Feb 21, 2023, 2:10 PM; thread: https://github.com/k2-fsa/icefall/issues/918)
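One way to narrow down which stored chunks are affected is to attempt decompression on each chunk individually and record the failures, rather than stopping at the first exception. The following is a minimal sketch under stated assumptions: `find_bad_chunks` is a hypothetical helper, not part of Lhotse or lilcom, and `zlib` stands in for lilcom so the example runs with the standard library only.

```python
import zlib
from typing import Callable, Iterable, List, Tuple


def find_bad_chunks(
    chunks: Iterable[bytes],
    decompress: Callable[[bytes], object],
    errors: Tuple[type, ...] = (ValueError,),
) -> List[Tuple[int, str]]:
    """Try to decompress every chunk; return (index, message) for each failure."""
    failures = []
    for index, data in enumerate(chunks):
        try:
            decompress(data)
        except errors as exc:
            failures.append((index, str(exc)))
    return failures


# Stand-in data: zlib replaces lilcom here (an assumption made for illustration).
good = zlib.compress(b"feature frame " * 50)
truncated = good[: len(good) // 2]  # simulates a partially written chunk
chunks = [good, truncated, good]

bad = find_bad_chunks(chunks, zlib.decompress, errors=(zlib.error,))
bad_indices = [index for index, _ in bad]
print(bad_indices)
```

With real features, one would presumably pass `lilcom.decompress` and keep the default `errors=(ValueError,)`, since the traceback above shows a `ValueError` being raised. How the raw chunk bytes are read from storage is left out of this sketch.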
Hmmm, I am not sure what happened then. Here are a few long shots; maybe one of them will work:
... fault_tolerant feature loading mode for these cases).
Yeah. Do you have a preference for a decompression method? There is also this environment variable regarding protobuf that helps some people but probably hurt us.
I will try to find more pointers on the three suggestions. As far as I know, no updates were applied to one system but not the other. At the moment we are not 100% sure whether the problem is bad data written at feature extraction time or a problem at load time, and whether it's really in the data or in the code.
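To help separate "the files changed on disk after extraction" from "the files were written or are being read incorrectly", one option is to checksum the feature storage right after extraction and again before training. This is a rough sketch, not something Lhotse provides: `checksum_dir` and `changed_files` are hypothetical helpers, and the temporary directory with `feats_*.bin` files stands in for a real feature directory.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def checksum_dir(root: Path) -> Dict[str, str]:
    """Map each file under root (by relative path) to its SHA-256 hex digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Reads the whole file into memory; fine for a sketch, not for huge archives.
            digests[str(path.relative_to(root))] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_files(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Return files whose digest differs or that exist in only one snapshot."""
    names = before.keys() | after.keys()
    return sorted(name for name in names if before.get(name) != after.get(name))


# Demo on a temporary directory standing in for a feature storage directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "feats_a.bin").write_bytes(b"\x00\x01\x02\x03")
    (root / "feats_b.bin").write_bytes(b"\x04\x05\x06\x07")
    snapshot_after_extraction = checksum_dir(root)

    # Simulate on-disk truncation between extraction and training.
    (root / "feats_b.bin").write_bytes(b"\x04\x05")
    snapshot_before_training = checksum_dir(root)

    modified = changed_files(snapshot_after_extraction, snapshot_before_training)

print(modified)
```

If the two snapshots match but decompression still fails, the storage did not change after extraction, which points toward the writing or reading code (or library versions) rather than later corruption. The check cannot detect data that was already wrong when first written.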
By the way, did you restart the feature extraction at some point because of some error?
We didn't. We ran librispeech/ASR/prepare.sh without any modification, and it completed all stages in one go.
We first noticed this error while running icefall training on another dataset. We then did a fresh install, ran the LibriSpeech recipe, and reproduced the same error, which seems to be triggered by Lhotse's handling of the data. The log file is attached: libri.log