k2-fsa / icefall

https://k2-fsa.github.io/icefall/
Apache License 2.0

json.decoder.JSONDecodeError when I run wenetspeech prepare.sh #1572

Closed yuyun2000 closed 1 month ago

yuyun2000 commented 1 month ago

prepare.sh fails at stage 8 with the error below. What should I do?

2024-03-28 10:24:23 (prepare.sh:132:main) Stage 8: Compute features for S
2024-03-28 10:24:25,746 INFO [compute_fbank_wenetspeech_splits.py:208] {'training_subset': 'S', 'num_workers': 20, 'batch_duration': 600.0, 'num_splits': 1000, 'start': 0, 'stop': -1, 'num_mel_bins': 80, 'whisper_fbank': False, 'output_dir_prefix': ''}
2024-03-28 10:24:27,352 INFO [compute_fbank_wenetspeech_splits.py:146] device: cuda:0
2024-03-28 10:24:27,352 INFO [utils.py:87] The user overrided tolerance for audio duration mismatch between the values in the manifest and the actual data. Old threshold: 0.025s. New threshold: 0.01s.
/home/bits/anaconda3/envs/sherpa-torch2-py310/lib/python3.10/site-packages/lhotse/audio/utils.py:94: UserWarning: The audio duration mismatch tolerance has been set to a value lower than default (0.025s). We don't recommend this as it might break some data augmentation transforms.
  warnings.warn(
2024-03-28 10:24:27,353 INFO [compute_fbank_wenetspeech_splits.py:153] Processing 1/1000
2024-03-28 10:24:27,353 INFO [compute_fbank_wenetspeech_splits.py:165] Loading /home/bits/workspace/yun/icefall/egs/wenetspeech/ASR/data/fbank/S_split_1000/cuts_S_raw.0000.jsonl.gz
Traceback (most recent call last):
  File "/home/bits/workspace/yun/icefall/egs/wenetspeech/ASR/./local/compute_fbank_wenetspeech_splits.py", line 214, in <module>
    main()
  File "/home/bits/workspace/yun/icefall/egs/wenetspeech/ASR/./local/compute_fbank_wenetspeech_splits.py", line 210, in main
    compute_fbank_wenetspeech_splits(args)
  File "/home/bits/workspace/yun/icefall/egs/wenetspeech/ASR/./local/compute_fbank_wenetspeech_splits.py", line 166, in compute_fbank_wenetspeech_splits
    cut_set = CutSet.from_file(raw_cuts_path)
  File "/home/bits/anaconda3/envs/sherpa-torch2-py310/lib/python3.10/site-packages/lhotse/serialization.py", line 529, in from_file
    return load_manifest_lazy_or_eager(path, manifest_cls=cls)
  File "/home/bits/anaconda3/envs/sherpa-torch2-py310/lib/python3.10/site-packages/lhotse/serialization.py", line 481, in load_manifest_lazy_or_eager
    return load_manifest_lazy(path)
  File "/home/bits/anaconda3/envs/sherpa-torch2-py310/lib/python3.10/site-packages/lhotse/serialization.py", line 465, in load_manifest_lazy
    first = next(raw_data)
  File "/home/bits/anaconda3/envs/sherpa-torch2-py310/lib/python3.10/site-packages/lhotse/serialization.py", line 134, in load_jsonl
    ret = decode_json_line(line)
  File "/home/bits/anaconda3/envs/sherpa-torch2-py310/lib/python3.10/site-packages/lhotse/serialization.py", line 608, in decode_json_line
    return json.loads(line)
  File "/home/bits/anaconda3/envs/sherpa-torch2-py310/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/home/bits/anaconda3/envs/sherpa-torch2-py310/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/bits/anaconda3/envs/sherpa-torch2-py310/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

yuyun2000 commented 1 month ago

Update your lhotse and run again.
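
If the error persists after updating, it may be worth checking whether the split manifest itself is readable. "Expecting value: line 1 column 1 (char 0)" means the first line handed to `json.loads` did not start with a JSON value. The following is a minimal sketch using only the Python standard library. The helper `first_bad_jsonl_line` is a hypothetical name, not part of lhotse or icefall, and it assumes the manifest is a gzip-compressed file with one JSON object per line.

```python
import gzip
import json
from typing import Optional, Tuple


def first_bad_jsonl_line(path: str) -> Optional[Tuple[int, str]]:
    """Hypothetical helper (not a lhotse/icefall API).

    Returns (line_number, error_message) for the first line of a
    gzip-compressed JSONL file that is not valid JSON, or None if
    every line parses.
    """
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            try:
                json.loads(line)
            except json.JSONDecodeError as e:
                return lineno, str(e)
    return None


# Example call, using the path from the log above:
# print(first_bad_jsonl_line(
#     "data/fbank/S_split_1000/cuts_S_raw.0000.jsonl.gz"))
```

A result of `None` suggests the file content is fine and the problem lies in how it is being read. A result pointing at line 1 suggests the split file was written incorrectly and the stage that produced it may need to be rerun. If the file is not valid gzip at all, `gzip` raises its own error instead of returning a result.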