lhotse-speech / lhotse

Tools for handling speech data in machine learning projects.
https://lhotse.readthedocs.io/en/latest/
Apache License 2.0

Error in decompressing LilcomChunkyWriter feature manifest #607

Status: Closed (closed by desh2608 2 years ago)

desh2608 commented 2 years ago

I used kaldifeat to extract some features and stored them with the default storage type, LilcomChunkyWriter, but data loading fails with the following error:

Traceback (most recent call last):
  File "/home/draj/anaconda3/envs/scale/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/draj/anaconda3/envs/scale/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 46, in fetch
    data = self.dataset[possibly_batched_index]
  File "/export/c07/draj/mini_scale_2022/lhotse/lhotse/dataset/speech_recognition.py", line 113, in __getitem__
    input_tpl = self.input_strategy(cuts)
  File "/export/c07/draj/mini_scale_2022/lhotse/lhotse/dataset/input_strategies.py", line 120, in __call__
    return collate_features(
  File "/export/c07/draj/mini_scale_2022/lhotse/lhotse/dataset/collation.py", line 138, in collate_features
    features[idx] = _read_features(cut)
  File "/export/c07/draj/mini_scale_2022/lhotse/lhotse/dataset/collation.py", line 477, in _read_features
    return torch.from_numpy(cut.load_features())
  File "/export/c07/draj/mini_scale_2022/lhotse/lhotse/utils.py", line 632, in wrapper
    raise type(e)(
ValueError: Something went wrong in decompression (likely bad data): decompress_float returned 7
[extra info] When calling: MonoCut.load_features(args=(MonoCut(id='0fc802cd6f15cb7e6324709659edd6e7_109-60-0', start=0, duration=11.58, channel=0, supervisions=[SupervisionSegment(id='0fc802cd6f15cb7e6324709659edd6e7_109', recording_id='0fc802cd6f15cb7e6324709659edd6e7_109', start=0, duration=11.58, channel=0, text='as the tenure of the leadership team has increased we have been able to initiate positive changes throughout our business structure that have directly contributed to our recent successes', language='English', speaker='0fc802cd6f15cb7e6324709659edd6e7', gender=None, custom=None, alignment=None)], features=Features(type=kaldifeat-fbank, num_frames=1158, num_features=80, frame_shift=0.01, sampling_rate=16000, start=0, duration=11.58, storage_type=lilcom_chunky, storage_path=data/fbank/feats_dev.lca, storage_key=5047284,44760,44501,14302, recording_id=None, channels=0), recording=Recording(id='0fc802cd6f15cb7e6324709659edd6e7_109', sources=[AudioSource(type=file, channels=[0], source=/export/c07/draj/mini_scale_2022/icefall/egs/spgispeech/ASR/download/spgispeech/spgispeech/train/0fc802cd6f15cb7e6324709659edd6e7/109.wav)], sampling_rate=16000, num_samples=185280, duration=11.58, transforms=None), custom=None),) kwargs={})

When I switched to using LilcomHdf5Writer, the data loading was successful.

pzelasko commented 2 years ago

Is it possible your job failed during extraction and left the file corrupted? Can you retry and see?

desh2608 commented 2 years ago

Ok, I'll try again.

pzelasko commented 2 years ago

Did it help?

desh2608 commented 2 years ago

Sorry, I didn't have time to get back to this, but it must have been file corruption, as you suggested. You can close this issue for now. If I run into the error again, I'll reopen it.
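
One cheap way to test the "job failed during extraction" hypothesis is to check whether the archive is long enough to contain the bytes that a manifest entry points at. The sketch below is only an illustration: it assumes the storage_key has the form "byte_offset,chunk_len_1,chunk_len_2,..." (which is what keys such as '5047284,44760,44501,14302' in the log above suggest), and the helper names parse_storage_key and is_truncated are hypothetical, not part of lhotse. It detects truncation only; damaged bytes inside a complete file would not be caught.

```python
import os
import tempfile


def parse_storage_key(storage_key):
    """Split a 'lilcom_chunky' style key into (offset, chunk_sizes).

    Assumption: the key is 'byte_offset,chunk_len_1,chunk_len_2,...',
    as suggested by the keys printed in the logs.
    """
    offset, *chunk_sizes = (int(x) for x in storage_key.split(","))
    return offset, chunk_sizes


def is_truncated(storage_path, storage_key):
    """Return True if the file is too short to hold the bytes the key refers to."""
    offset, chunk_sizes = parse_storage_key(storage_key)
    return os.path.getsize(storage_path) < offset + sum(chunk_sizes)


# Demo on a throwaway 100-byte file (not a real feature archive).
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"\0" * 100)
    demo_path = f.name

complete = is_truncated(demo_path, "10,40,50")   # needs exactly 100 bytes
cut_short = is_truncated(demo_path, "10,40,51")  # needs 101 bytes
os.remove(demo_path)
```

If this check passes for every entry and the error still occurs, a partially written file is a less likely explanation.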

luomingshuang commented 2 years ago

I ran into this issue as well.

(k2-python) luomingshuang@de-74279-k2-train-2-0602201035-5fb6d86964-mclm7:~/codes/icefall-pruned-rnnt5-aishell4/egs/aishell4/ASR$ CUDA_VISIBLE_DEVICES='4' python pruned_transducer_stateless5/train.py --max-duration 200
2022-06-07 15:47:16,465 INFO [train.py:880] Training started
2022-06-07 15:47:16.660707: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /ceph-sh1/fangjun/software/cuda-10.2.89/lib:/ceph-sh1/fangjun/software/cuda-10.2.89/lib64:
2022-06-07 15:47:16.660748: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2022-06-07 15:47:18,838 INFO [train.py:890] Device: cuda:0
2022-06-07 15:47:18,901 INFO [lexicon.py:176] Loading pre-compiled data/lang_char/Linv.pt
2022-06-07 15:47:18,911 INFO [train.py:901] {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'model_warm_step': 3000, 'env_info': {'k2-version': '1.15.1', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': 'f8d2dba06c000ffee36aab5b66f24e7c9809f116', 'k2-git-date': 'Thu Apr 21 12:20:34 2022', 'lhotse-version': '1.3.0.dev+git.5dbc5fb.dirty', 'torch-version': '1.11.0', 'torch-cuda-available': True, 'torch-cuda-version': '10.2', 'python-version': '3.8', 'icefall-git-branch': 'icefall-pruned-rnnt5-aishell4', 'icefall-git-sha1': 'b4b3a84-dirty', 'icefall-git-date': 'Tue Jun 7 12:20:12 2022', 'icefall-path': '/ceph-meixu/luomingshuang/icefall', 'k2-path': '/ceph-ms/luomingshuang/k2_latest/k2/python/k2/__init__.py', 'lhotse-path': '/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.3.0.dev0+git.5dbc5fb.dirty-py3.8.egg/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-2-0602201035-5fb6d86964-mclm7', 'IP address': '10.177.74.202'}, 'world_size': 1, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 30, 'start_epoch': 1, 'start_batch': 0, 'exp_dir': PosixPath('pruned_transducer_stateless5/exp'), 'lang_dir': 'data/lang_char', 'initial_lr': 0.003, 'lr_batches': 5000, 'lr_epochs': 6, 'context_size': 2, 'prune_range': 5, 'lm_scale': 0.25, 'am_scale': 0.0, 'simple_loss_scale': 0.5, 'seed': 42, 'print_diagnostics': False, 'save_every_n': 4000, 'keep_last_k': 30, 'average_period': 100, 'use_fp16': False, 'num_encoder_layers': 24, 'dim_feedforward': 1536, 'nhead': 8, 'encoder_dim': 384, 'decoder_dim': 512, 'joiner_dim': 512, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 200, 'bucketing_sampler': True, 'num_buckets': 300, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 
'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'training_subset': 'L', 'blank_id': 0, 'vocab_size': 3284}
2022-06-07 15:47:18,912 INFO [train.py:903] About to create model
2022-06-07 15:47:19,386 INFO [train.py:907] Number of model parameters: 94337552
2022-06-07 15:47:22,629 INFO [asr_datamodule.py:429] About to get train cuts
2022-06-07 15:47:22,631 INFO [asr_datamodule.py:231] About to get Musan cuts
2022-06-07 15:47:22,632 INFO [asr_datamodule.py:238] Enable MUSAN
2022-06-07 15:47:22,725 INFO [asr_datamodule.py:263] Enable SpecAugment
2022-06-07 15:47:22,726 INFO [asr_datamodule.py:264] Time warp factor: 80
2022-06-07 15:47:22,726 INFO [asr_datamodule.py:276] Num frame mask: 10
2022-06-07 15:47:22,726 INFO [asr_datamodule.py:289] About to create train dataset
2022-06-07 15:47:22,726 INFO [asr_datamodule.py:318] Using DynamicBucketingSampler.
2022-06-07 15:47:26,453 INFO [asr_datamodule.py:334] About to create train dataloader
2022-06-07 15:47:26,454 INFO [asr_datamodule.py:437] About to get dev cuts
2022-06-07 15:47:26,456 INFO [asr_datamodule.py:365] About to create dev dataset
2022-06-07 15:47:27,138 INFO [asr_datamodule.py:384] About to create dev dataloader
2022-06-07 15:47:27,139 INFO [train.py:1056] Sanity check -- see if any of the batches in epoch 1 would cause OOM.
2022-06-07 15:51:56,549 INFO [train.py:818] Epoch 1, batch 0, loss[loss=1.001, simple_loss=2.002, pruned_loss=9.132, over 4887.00 frames.], tot_loss[loss=1.001, simple_loss=2.002, pruned_loss=9.132, over 4887.00 frames.], batch size: 21, lr: 3.00e-03
Traceback (most recent call last):
  File "pruned_transducer_stateless5/train.py", line 1108, in <module>
    main()
  File "pruned_transducer_stateless5/train.py", line 1101, in main
    run(rank=0, world_size=1, args=args)
  File "pruned_transducer_stateless5/train.py", line 1010, in run
    train_one_epoch(
  File "pruned_transducer_stateless5/train.py", line 750, in train_one_epoch
    for batch_idx, batch in enumerate(train_dl):
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
    data = self._next_data()
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1204, in _next_data
    return self._process_data(data)
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
    data.reraise()
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/torch/_utils.py", line 457, in reraise
    raise exception
ValueError: Caught ValueError in DataLoader worker process 1.
Original Traceback (most recent call last):
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.3.0.dev0+git.5dbc5fb.dirty-py3.8.egg/lhotse/utils.py", line 668, in wrapper
    return fn(*args, **kwargs)
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.3.0.dev0+git.5dbc5fb.dirty-py3.8.egg/lhotse/cut.py", line 1012, in load_features
    feats = self.features.load(start=self.start, duration=self.duration)
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.3.0.dev0+git.5dbc5fb.dirty-py3.8.egg/lhotse/features/base.py", line 476, in load
    return storage.read(
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.3.0.dev0+git.5dbc5fb.dirty-py3.8.egg/lhotse/caching.py", line 70, in wrapper
    return m(*args, **kwargs)
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.3.0.dev0+git.5dbc5fb.dirty-py3.8.egg/lhotse/features/io.py", line 771, in read
    decompressed_chunks = [lilcom.decompress(data) for data in chunk_data]
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.3.0.dev0+git.5dbc5fb.dirty-py3.8.egg/lhotse/features/io.py", line 771, in <listcomp>
    decompressed_chunks = [lilcom.decompress(data) for data in chunk_data]
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lilcom-1.1.1-py3.8-linux-x86_64.egg/lilcom/lilcom_interface.py", line 110, in decompress
    raise ValueError("Something went wrong in decompression (likely bad data): "
ValueError: Something went wrong in decompression (likely bad data): decompress_float returned 7

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.3.0.dev0+git.5dbc5fb.dirty-py3.8.egg/lhotse/utils.py", line 668, in wrapper
    return fn(*args, **kwargs)
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.3.0.dev0+git.5dbc5fb.dirty-py3.8.egg/lhotse/cut.py", line 2872, in load_features
    base_feats=first_cut.load_features(),
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.3.0.dev0+git.5dbc5fb.dirty-py3.8.egg/lhotse/utils.py", line 670, in wrapper
    raise type(e)(
ValueError: Something went wrong in decompression (likely bad data): decompress_float returned 7
[extra info] When calling: MonoCut.load_features(args=(MonoCut(id='69666566-1e79-46b5-ae86-c73d26eda59d', start=1939.8653125, duration=3.535, channel=0, supervisions=[SupervisionSegment(id='20200707_L_R001S07C01-SPK0177-143', recording_id='20200707_L_R001S07C01', start=0.0, duration=3.535, channel=0, text='对这也可以在咱们设计宣传页中体现出', language='Chinese', speaker='SPK0177', gender=None, custom=None, alignment=None)], features=Features(type='kaldi-fbank', num_frames=223855, num_features=80, frame_shift=0.01, sampling_rate=16000, start=0, duration=2238.554, storage_type='lilcom_chunky', storage_path='data/fbank/aishell4_feats_train_L/feats-5.lca', storage_key='797975384,46595,46375,46457,46408,46563,46683,46719,46616,45869,44638,45440,45598,45592,44962,45612,46175,44952,44938,44666,45372,45229,44912,44773,45222,44842,44967,45484,45312,45762,45720,46269,45907,46186,46074,46283,46580,45918,44998,45119,44809,44815,44615,44822,44484,44668,44980,45307,45602,45636,45669,45597,45419,45595,45568,45688,45555,46001,45410,45187,44985,46054,45766,44680,45160,45437,45731,46178,45777,46002,45430,45774,45508,45607,45505,45738,44946,44732,45281,45890,46158,45194,45484,45316,45576,44806,45299,45430,45505,45932,45477,44793,45140,44698,45305,45468,46065,45868,45849,45683,45864,45407,45634,46099,45804,46225,45586,45589,45845,45658,45585,44877,45860,45552,44743,45230,45123,46095,45979,46122,45948,45807,45357,45773,45241,44594,44507,45265,45872,45457,45102,45390,45317,45873,45779,45660,45487,44590,45481,46056,45564,45164,45000,45162,44766,44628,45742,45932,45514,45767,45822,45545,44907,44240,45270,45092,45141,45922,44126,45229,45343,44974,44018,44664,45718,44832,45345,44689,45270,45688,45906,44212,45336,45651,45752,44795,45129,44917,45324,45434,45298,44926,45122,44794,45252,45854,45871,44660,45290,44580,46171,44381,44333,44655,44384,45193,44942,44984,45365,45147,45831,45381,45753,44592,44494,45098,45170,45721,44971,44686,44869,45455,45011,46233,44870,45506,45455,45066,45717,44937,45364,45233,45239,4
5215,44724,44604,45446,44499,45024,44799,44636,44436,44524,44373,45059,44609,45225,44817,44926,44324,44811,44160,44394,45847,45564,45961,45409,45629,45762,45759,45483,45613,45045,44888,45412,45932,45129,46027,45235,44754,43787,44224,45244,45222,44772,45550,45506,45002,44973,44726,44237,44234,44513,44593,44835,44201,44281,43984,44225,44515,44884,44253,44481,43522,44864,45413,45525,45401,45233,45005,45100,44748,44313,44330,44643,44545,44631,44501,44515,43939,44776,45289,44528,44402,44203,44072,44230,44891,44757,43897,43421,43903,43765,44196,44277,44728,45135,45103,45543,45341,44757,44321,44378,43670,44565,44375,44968,45008,44743,44379,45684,45651,45533,44468,44779,44644,45846,44885,45056,45142,44884,44252,44656,45742,44937,45197,44820,45036,44552,44769,44576,44603,45867,45366,45072,45020,44681,44796,44957,44571,44825,44523,44798,45431,45534,45005,45764,44081,43906,43892,43376,44185,44806,44602,44452,44907,45514,45405,44902,45389,45084,44749,44436,45262,44642,45378,45248,45310,44934,45465,45322,45478,45482,45449,45105,45319,44841,44184,44618,44921,44671,45026,44705,44812,45221,45197,44454,44560,44411,45337,44555,45846,44964,44370,45302,45229,45607,44252,45233,44530,45164,44898,44741,44868,45233,45783,45827,45681,45903,45427,45427,45102,45116,45119,45126,44763,45568,45445,45235,44944,45834,45269,44512,45209,45243,45052,44781,44552,32881', recording_id='None', channels=0), recording=Recording(id='20200707_L_R001S07C01', sources=[AudioSource(type='file', channels=[0, 1, 2, 3, 4, 5, 6, 7], source='/ceph-ms/luomingshuang/codes/icefall-pruned-rnnt5-aishell4/egs/aishell4/ASR/download/aishell4/train_L/wav/20200707_L_R001S07C01.flac')], sampling_rate=16000, num_samples=35816864, duration=2238.554, transforms=None), custom=None),) kwargs={})

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = self.dataset[possibly_batched_index]
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.3.0.dev0+git.5dbc5fb.dirty-py3.8.egg/lhotse/dataset/speech_recognition.py", line 113, in __getitem__
    input_tpl = self.input_strategy(cuts)
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.3.0.dev0+git.5dbc5fb.dirty-py3.8.egg/lhotse/dataset/input_strategies.py", line 120, in __call__
    return collate_features(
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.3.0.dev0+git.5dbc5fb.dirty-py3.8.egg/lhotse/dataset/collation.py", line 138, in collate_features
    features[idx] = _read_features(cut)
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.3.0.dev0+git.5dbc5fb.dirty-py3.8.egg/lhotse/dataset/collation.py", line 514, in _read_features
    return torch.from_numpy(cut.load_features())
  File "/ceph-meixu/luomingshuang/anaconda3/envs/k2-python/lib/python3.8/site-packages/lhotse-1.3.0.dev0+git.5dbc5fb.dirty-py3.8.egg/lhotse/utils.py", line 670, in wrapper
    raise type(e)(
ValueError: Something went wrong in decompression (likely bad data): decompress_float returned 7
[extra info] When calling: MonoCut.load_features(args=(MonoCut(id='69666566-1e79-46b5-ae86-c73d26eda59d', start=1939.8653125, duration=3.535, channel=0, supervisions=[SupervisionSegment(id='20200707_L_R001S07C01-SPK0177-143', recording_id='20200707_L_R001S07C01', start=0.0, duration=3.535, channel=0, text='对这也可以在咱们设计宣传页中体现出', language='Chinese', speaker='SPK0177', gender=None, custom=None, alignment=None)], features=Features(type='kaldi-fbank', num_frames=223855, num_features=80, frame_shift=0.01, sampling_rate=16000, start=0, duration=2238.554, storage_type='lilcom_chunky', storage_path='data/fbank/aishell4_feats_train_L/feats-5.lca', storage_key='797975384,46595,46375,46457,46408,46563,46683,46719,46616,45869,44638,45440,45598,45592,44962,45612,46175,44952,44938,44666,45372,45229,44912,44773,45222,44842,44967,45484,45312,45762,45720,46269,45907,46186,46074,46283,46580,45918,44998,45119,44809,44815,44615,44822,44484,44668,44980,45307,45602,45636,45669,45597,45419,45595,45568,45688,45555,46001,45410,45187,44985,46054,45766,44680,45160,45437,45731,46178,45777,46002,45430,45774,45508,45607,45505,45738,44946,44732,45281,45890,46158,45194,45484,45316,45576,44806,45299,45430,45505,45932,45477,44793,45140,44698,45305,45468,46065,45868,45849,45683,45864,45407,45634,46099,45804,46225,45586,45589,45845,45658,45585,44877,45860,45552,44743,45230,45123,46095,45979,46122,45948,45807,45357,45773,45241,44594,44507,45265,45872,45457,45102,45390,45317,45873,45779,45660,45487,44590,45481,46056,45564,45164,45000,45162,44766,44628,45742,45932,45514,45767,45822,45545,44907,44240,45270,45092,45141,45922,44126,45229,45343,44974,44018,44664,45718,44832,45345,44689,45270,45688,45906,44212,45336,45651,45752,44795,45129,44917,45324,45434,45298,44926,45122,44794,45252,45854,45871,44660,45290,44580,46171,44381,44333,44655,44384,45193,44942,44984,45365,45147,45831,45381,45753,44592,44494,45098,45170,45721,44971,44686,44869,45455,45011,46233,44870,45506,45455,45066,45717,44937,45364,45233,45239,4
5215,44724,44604,45446,44499,45024,44799,44636,44436,44524,44373,45059,44609,45225,44817,44926,44324,44811,44160,44394,45847,45564,45961,45409,45629,45762,45759,45483,45613,45045,44888,45412,45932,45129,46027,45235,44754,43787,44224,45244,45222,44772,45550,45506,45002,44973,44726,44237,44234,44513,44593,44835,44201,44281,43984,44225,44515,44884,44253,44481,43522,44864,45413,45525,45401,45233,45005,45100,44748,44313,44330,44643,44545,44631,44501,44515,43939,44776,45289,44528,44402,44203,44072,44230,44891,44757,43897,43421,43903,43765,44196,44277,44728,45135,45103,45543,45341,44757,44321,44378,43670,44565,44375,44968,45008,44743,44379,45684,45651,45533,44468,44779,44644,45846,44885,45056,45142,44884,44252,44656,45742,44937,45197,44820,45036,44552,44769,44576,44603,45867,45366,45072,45020,44681,44796,44957,44571,44825,44523,44798,45431,45534,45005,45764,44081,43906,43892,43376,44185,44806,44602,44452,44907,45514,45405,44902,45389,45084,44749,44436,45262,44642,45378,45248,45310,44934,45465,45322,45478,45482,45449,45105,45319,44841,44184,44618,44921,44671,45026,44705,44812,45221,45197,44454,44560,44411,45337,44555,45846,44964,44370,45302,45229,45607,44252,45233,44530,45164,44898,44741,44868,45233,45783,45827,45681,45903,45427,45427,45102,45116,45119,45126,44763,45568,45445,45235,44944,45834,45269,44512,45209,45243,45052,44781,44552,32881', recording_id='None', channels=0), recording=Recording(id='20200707_L_R001S07C01', sources=[AudioSource(type='file', channels=[0, 1, 2, 3, 4, 5, 6, 7], source='/ceph-ms/luomingshuang/codes/icefall-pruned-rnnt5-aishell4/egs/aishell4/ASR/download/aishell4/train_L/wav/20200707_L_R001S07C01.flac')], sampling_rate=16000, num_samples=35816864, duration=2238.554, transforms=None), custom=None),) kwargs={})
[extra info] When calling: MixedCut.load_features(args=(MixedCut(id='bb49ef5e-27c2-cb69-f539-a085aaf9755b', tracks=[MixTrack(cut=MonoCut(id='69666566-1e79-46b5-ae86-c73d26eda59d', start=1939.8653125, duration=3.535, channel=0, supervisions=[SupervisionSegment(id='20200707_L_R001S07C01-SPK0177-143', recording_id='20200707_L_R001S07C01', start=0.0, duration=3.535, channel=0, text='对这也可以在咱们设计宣传页中体现出', language='Chinese', speaker='SPK0177', gender=None, custom=None, alignment=None)], features=Features(type='kaldi-fbank', num_frames=223855, num_features=80, frame_shift=0.01, sampling_rate=16000, start=0, duration=2238.554, storage_type='lilcom_chunky', storage_path='data/fbank/aishell4_feats_train_L/feats-5.lca', storage_key='797975384,46595,46375,46457,46408,46563,46683,46719,46616,45869,44638,45440,45598,45592,44962,45612,46175,44952,44938,44666,45372,45229,44912,44773,45222,44842,44967,45484,45312,45762,45720,46269,45907,46186,46074,46283,46580,45918,44998,45119,44809,44815,44615,44822,44484,44668,44980,45307,45602,45636,45669,45597,45419,45595,45568,45688,45555,46001,45410,45187,44985,46054,45766,44680,45160,45437,45731,46178,45777,46002,45430,45774,45508,45607,45505,45738,44946,44732,45281,45890,46158,45194,45484,45316,45576,44806,45299,45430,45505,45932,45477,44793,45140,44698,45305,45468,46065,45868,45849,45683,45864,45407,45634,46099,45804,46225,45586,45589,45845,45658,45585,44877,45860,45552,44743,45230,45123,46095,45979,46122,45948,45807,45357,45773,45241,44594,44507,45265,45872,45457,45102,45390,45317,45873,45779,45660,45487,44590,45481,46056,45564,45164,45000,45162,44766,44628,45742,45932,45514,45767,45822,45545,44907,44240,45270,45092,45141,45922,44126,45229,45343,44974,44018,44664,45718,44832,45345,44689,45270,45688,45906,44212,45336,45651,45752,44795,45129,44917,45324,45434,45298,44926,45122,44794,45252,45854,45871,44660,45290,44580,46171,44381,44333,44655,44384,45193,44942,44984,45365,45147,45831,45381,45753,44592,44494,45098,45170,45721,44971,44686,44869
,45455,45011,46233,44870,45506,45455,45066,45717,44937,45364,45233,45239,45215,44724,44604,45446,44499,45024,44799,44636,44436,44524,44373,45059,44609,45225,44817,44926,44324,44811,44160,44394,45847,45564,45961,45409,45629,45762,45759,45483,45613,45045,44888,45412,45932,45129,46027,45235,44754,43787,44224,45244,45222,44772,45550,45506,45002,44973,44726,44237,44234,44513,44593,44835,44201,44281,43984,44225,44515,44884,44253,44481,43522,44864,45413,45525,45401,45233,45005,45100,44748,44313,44330,44643,44545,44631,44501,44515,43939,44776,45289,44528,44402,44203,44072,44230,44891,44757,43897,43421,43903,43765,44196,44277,44728,45135,45103,45543,45341,44757,44321,44378,43670,44565,44375,44968,45008,44743,44379,45684,45651,45533,44468,44779,44644,45846,44885,45056,45142,44884,44252,44656,45742,44937,45197,44820,45036,44552,44769,44576,44603,45867,45366,45072,45020,44681,44796,44957,44571,44825,44523,44798,45431,45534,45005,45764,44081,43906,43892,43376,44185,44806,44602,44452,44907,45514,45405,44902,45389,45084,44749,44436,45262,44642,45378,45248,45310,44934,45465,45322,45478,45482,45449,45105,45319,44841,44184,44618,44921,44671,45026,44705,44812,45221,45197,44454,44560,44411,45337,44555,45846,44964,44370,45302,45229,45607,44252,45233,44530,45164,44898,44741,44868,45233,45783,45827,45681,45903,45427,45427,45102,45116,45119,45126,44763,45568,45445,45235,44944,45834,45269,44512,45209,45243,45052,44781,44552,32881', recording_id='None', channels=0), recording=Recording(id='20200707_L_R001S07C01', sources=[AudioSource(type='file', channels=[0, 1, 2, 3, 4, 5, 6, 7], source='/ceph-ms/luomingshuang/codes/icefall-pruned-rnnt5-aishell4/egs/aishell4/ASR/download/aishell4/train_L/wav/20200707_L_R001S07C01.flac')], sampling_rate=16000, num_samples=35816864, duration=2238.554, transforms=None), custom=None), offset=0.0, snr=None), MixTrack(cut=MonoCut(id='012955da-b07f-44d9-8cac-c4ce7cbf2a18', start=230.0, duration=3.57, channel=0, supervisions=[], 
features=Features(type='kaldi-fbank', num_frames=1000, num_features=80, frame_shift=0.01, sampling_rate=16000, start=230.0, duration=10.0, storage_type='lilcom_chunky', storage_path='data/fbank/musan_feats/feats-10.lca', storage_key='217603661,43072,44618', recording_id='None', channels=0), recording=Recording(id='speech-us-gov-0085', sources=[AudioSource(type='file', channels=[0], source='/ceph-ms/luomingshuang/codes/icefall-pruned-rnnt5-aishell4/egs/aishell4/ASR/download/musan/speech/us-gov/speech-us-gov-0085.wav')], sampling_rate=16000, num_samples=9599687, duration=599.9804375, transforms=None), custom=None), offset=0.0, snr=12.635423209346753), MixTrack(cut=PaddingCut(id='6aa0246b-72e3-7714-a912-aee6c560e6e4', duration=0.0, sampling_rate=16000, feat_value=-23.025850929940457, num_frames=0, num_features=80, frame_shift=0.01, num_samples=0, custom=None), offset=3.57, snr=None)]),) kwargs={})

csukuangfj commented 2 years ago

I confirm that the last two lines from the above logs can be executed successfully from an interactive terminal.

danpovey commented 2 years ago

@luomingshuang if you want just a quick fix for this, I think setting num_workers=0 in your asr_datamodule.py works. It's some kind of threading bug.

csukuangfj commented 2 years ago

desh2608 wrote: "Sorry I didn't get time to get back to this,"

@desh2608

Did you keep using LilcomChunkyWriter and the issue disappeared, or did you switch to LilcomHdf5Writer and then the issue disappeared?

csukuangfj commented 2 years ago

I suggested that @luomingshuang try the changes in https://github.com/k2-fsa/icefall/discussions/391#discussioncomment-2885803

That is,

diff --git a/lhotse/features/io.py b/lhotse/features/io.py
index 8ddeed1..b139ae7 100644
--- a/lhotse/features/io.py
+++ b/lhotse/features/io.py
@@ -380,7 +380,8 @@ def lookup_cache_or_open_regular_file(storage_path: str):
     The file handles can be freed at any time by calling ``close_cached_file_handles()``.
     """
     f = open(storage_path, "rb")
-    return f
+    lock = threading.Lock()
+    return f, lock

 @lru_cache(maxsize=None)
@@ -737,8 +738,7 @@ class LilcomChunkyReader(FeaturesReader):

     def __init__(self, storage_path: Pathlike, *args, **kwargs):
         super().__init__()
-        self.file = lookup_cache_or_open_regular_file(storage_path)
-        self.lock = threading.Lock()
+        self.file, self.lock = lookup_cache_or_open_regular_file(storage_path)

     @dynamic_lru_cache
     def read(

It still does not help.

luomingshuang commented 2 years ago

I will recompute the fbank features with ChunkedLilcomHdf5Writer and test with those.

pzelasko commented 2 years ago

Hmm, I think the problem only appears with long-recording data. I was trying to reproduce it on mini LibriSpeech, which is probably why I couldn't. Let me take another look at it then.
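
To make the long-recording case concrete, here is a rough sketch of how a short cut maps to byte ranges inside a chunked archive. It rests on two assumptions read off the logs rather than confirmed in this thread: the storage_key is "byte_offset,chunk_len_1,chunk_len_2,...", and each chunk holds 500 frames (1158 frames gave 3 chunk lengths, 1000 frames gave 2). The function chunk_byte_ranges is hypothetical and is not lhotse's reader.

```python
import math

CHUNK_FRAMES = 500  # assumption: frames per compressed chunk, inferred from the logs


def chunk_byte_ranges(storage_key, start_frame, end_frame, chunk_frames=CHUNK_FRAMES):
    """Return (byte_offset, byte_size) for each chunk covering [start_frame, end_frame)."""
    base, *sizes = (int(x) for x in storage_key.split(","))
    first = start_frame // chunk_frames          # first chunk index needed
    last = math.ceil(end_frame / chunk_frames)   # one past the last chunk index needed
    ranges = []
    pos = base
    for idx, size in enumerate(sizes):
        if first <= idx < last:
            ranges.append((pos, size))
        pos += size  # chunks are assumed to be stored back to back
    return ranges


# The MUSAN key from the log above, with a made-up frame window that
# straddles the boundary between its two chunks.
ranges = chunk_byte_ranges("217603661,43072,44618", start_frame=450, end_frame=550)
```

Under the same assumptions, the failing AISHELL-4 cut (start=1939.8653125 at frame_shift=0.01, so roughly frame 193,986) begins in chunk 387 of a key with several hundred chunk lengths. Reading it means seeking far into the file, which a short-utterance corpus would not exercise in the same way.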

desh2608 commented 2 years ago

csukuangfj wrote: "Do you keep using LilcomChunkyWriter and the issue disappears or do you just switch to LilcomHdf5Writer and then the issue disappear?"

I think I just extracted the features again (still using LilcomChunkyWriter) and loading went through successfully, so perhaps it was a corruption issue, as Piotr mentioned earlier.

danpovey commented 2 years ago

I don't think so. We had the problem here and verified that the features were not corrupted. I think it's some kind of threading bug but the fix we tried didn't work; we must have overlooked something.
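
For context on why a shared file handle can be fragile: a seek followed by a read is two separate steps on one file position, so concurrent readers can interleave them. In addition, file descriptors inherited across a fork share that position between processes, which a per-process threading.Lock does not guard. Whether either effect is the cause here is not established in this thread. The sketch below is a generic illustration, not lhotse code and not the fix that was tried: it uses os.pread (Unix-like systems only), which reads at an explicit offset without moving the shared position. The helper read_range is hypothetical.

```python
import os
import tempfile
import threading


def read_range(fd, offset, size):
    """Read `size` bytes at `offset` without moving the descriptor's file position."""
    return os.pread(fd, size, offset)


# Demo: several threads read different ranges through one shared descriptor.
payload = bytes(range(256)) * 64  # 16384 bytes of known content
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(payload)
    demo_path = f.name

fd = os.open(demo_path, os.O_RDONLY)
mismatches = []  # offsets whose read did not match the expected bytes


def worker(offset):
    for _ in range(200):
        if read_range(fd, offset, 128) != payload[offset:offset + 128]:
            mismatches.append(offset)


threads = [threading.Thread(target=worker, args=(i * 512,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

os.close(fd)
os.remove(demo_path)
```

Because each read names its own offset, no lock is needed around it. This is a design alternative to guarding seek and read with a lock, offered only as a direction to investigate.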