'NoneType' object is not iterable

zhihao-chen commented 1 year ago

    2023-07-21 15:31:33 (prepare.sh:69:main) Stage 2: Tokenize/Fbank aishell
    ['train', '', 'dev', '', 'test']
    {'train': {'recordings': CutSet(len=0) [underlying data type: <class 'dict'>], 'supervisions': CutSet(len=0) [underlying data type: <class 'dict'>]}, 'dev': {'recordings': CutSet(len=0) [underlying data type: <class 'dict'>], 'supervisions': CutSet(len=0) [underlying data type: <class 'dict'>]}, 'test': {'recordings': CutSet(len=0) [underlying data type: <class 'dict'>], 'supervisions': CutSet(len=0) [underlying data type: <class 'dict'>]}}
    2023-07-21 15:31:39,866 INFO [tokenizer.py:154] dataset_parts: ['train', '', 'dev', '', 'test'] manifests 3
    2023-07-21 15:31:39,870 INFO [tokenizer.py:161] Processing partition: train CUDA: True
    {'recordings': CutSet(len=0) [underlying data type: <class 'dict'>], 'supervisions': CutSet(len=0) [underlying data type: <class 'dict'>]}
    CutSet(len=0) [underlying data type: <class 'dict'>]
    CutSet(len=0) [underlying data type: <class 'dict'>]
    Computing features in batches: 0it [00:00, ?it/s]
    None
    0it [00:00, ?it/s]
    Traceback (most recent call last):
      File "/home/aiteam/work2/chenzhihao/vall-e/egs/aishell1/bin/tokenizer.py", line 266, in <module>
        main()
      File "/home/aiteam/work2/chenzhihao/vall-e/egs/aishell1/bin/tokenizer.py", line 230, in main
        for c in tqdm(cut_set):
      File "/home/aiteam/anaconda3/envs/vall-e/lib/python3.10/site-packages/tqdm/std.py", line 1178, in __iter__
        for obj in iterable:
    TypeError: 'NoneType' object is not iterable

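I printed the intermediate results and found that both the manifests and the cut set are empty. But the manifest directory does contain files, so which step went wrong?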
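
One way to confirm whether the manifest files on disk actually contain entries is to count them directly. The following is a minimal sketch using only the Python standard library. The helper names `count_manifest_entries` and `report_manifests` are hypothetical, and the sketch assumes the manifests are stored as `.jsonl`, `.jsonl.gz`, or `.json` files.

```python
import gzip
import json
from pathlib import Path


def count_manifest_entries(path):
    """Return the number of entries in one manifest file (hypothetical helper)."""
    path = Path(path)
    if path.stat().st_size == 0:
        # A zero-byte file cannot contain any entries.
        return 0
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as f:
        if path.name.endswith((".jsonl", ".jsonl.gz")):
            # JSON Lines: one entry per non-blank line.
            return sum(1 for line in f if line.strip())
        # Plain JSON: assumed to be a list (or dict) of entries.
        text = f.read().strip()
        return len(json.loads(text)) if text else 0


def report_manifests(manifest_dir):
    """Map each manifest file name in manifest_dir to its entry count."""
    return {
        p.name: count_manifest_entries(p)
        for p in sorted(Path(manifest_dir).iterdir())
        if p.is_file() and ".json" in p.name
    }
```

Calling `report_manifests` on the manifest directory and looking for files that map to `0` shows which partitions were written without any recordings or supervisions.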

525qqqqq commented 1 year ago

I ran into the same problem.

lifeiteng commented 1 year ago

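The data was not downloaded and extracted successfully. Make sure that both the lhotse download and prepare steps ran to completion (part of the cause is network conditions in mainland China; see the Lhotse logic at https://github.com/lhotse-speech/lhotse/blob/master/lhotse/recipes/aishell.py#L58).

This question has been asked too many times.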
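
Because an empty manifest only surfaces later as a `TypeError` inside `tqdm`, a fail-fast check before the tokenization loop can make the cause more obvious. This is a minimal sketch, not code from vall-e or Lhotse: the helper `require_non_empty` is hypothetical, and it assumes the cut set object supports `len()`.

```python
def require_non_empty(cut_set, partition):
    """Return cut_set unchanged, or raise a descriptive error if it is unusable.

    Hypothetical helper: works on any object that supports len().
    """
    if cut_set is None:
        raise RuntimeError(
            f"No cut set was produced for partition '{partition}'. "
            "Check that the download and prepare steps finished successfully."
        )
    if len(cut_set) == 0:
        raise RuntimeError(
            f"The cut set for partition '{partition}' is empty. "
            "The manifests probably contain no recordings or supervisions."
        )
    return cut_set
```

Wrapping the value before iterating, for example `for c in tqdm(require_non_empty(cut_set, partition)):`, would turn the opaque `TypeError` into a message that points back at the data preparation stage.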

525qqqqq commented 1 year ago

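I downloaded the data as instructed and, after extracting it, placed the files where required, but the problem still occurs. Also, the JSON files in the manifest directory are empty.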

lifeiteng commented 1 year ago

@525qqqqq https://github.com/lhotse-speech/lhotse/blob/master/lhotse/recipes/aishell.py#L58

    # Download and extract each archive unless it was already completed.
    for tar_name in [dataset_tar_name, resources_tar_name]:
        tar_path = target_dir / tar_name
        # Strip the 4-character archive extension to get the directory name.
        extracted_dir = corpus_dir / tar_name[:-4]
        # Marker file that is written only after extraction has finished.
        completed_detector = extracted_dir / ".completed"
        if completed_detector.is_file():
            logging.info(
                f"Skipping download of {tar_name} because {completed_detector} exists."
            )
            continue
        resumable_download(
            f"{url}/{tar_name}", filename=tar_path, force_download=force_download
        )
        # Remove any partial extraction before extracting again.
        shutil.rmtree(extracted_dir, ignore_errors=True)
        with tarfile.open(tar_path) as tar:
            safe_extract(tar, path=corpus_dir)
        if tar_name == dataset_tar_name:
            # The dataset archive contains nested archives under wav/.
            wav_dir = extracted_dir / "wav"
            for sub_tar_name in os.listdir(wav_dir):
                with tarfile.open(wav_dir / sub_tar_name) as tar:
                    safe_extract(tar, path=wav_dir)
        completed_detector.touch()
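
Reading the excerpt above, the `.completed` marker is what tells Lhotse that an archive has already been downloaded and extracted: when the marker is present the archive is skipped, and when it is absent the extracted directory is removed and the archive is fetched and extracted again. The sketch below inspects that state for a corpus directory. The helper name `extraction_status` and the default archive names `data_aishell.tgz` and `resource_aishell.tgz` are assumptions; the excerpt only shows the variables `dataset_tar_name` and `resources_tar_name`.

```python
from pathlib import Path


def extraction_status(
    corpus_dir,
    tar_names=("data_aishell.tgz", "resource_aishell.tgz"),  # assumed archive names
):
    """Report, per archive, whether the marker exists and how many .wav files are present."""
    status = {}
    for tar_name in tar_names:
        # Same directory naming as the excerpt: drop the 4-character extension.
        extracted_dir = Path(corpus_dir) / tar_name[:-4]
        exists = extracted_dir.is_dir()
        status[tar_name] = {
            "extracted_dir_exists": exists,
            "completed_marker": (extracted_dir / ".completed").is_file(),
            # Count audio files recursively; 0 if the directory is missing.
            "wav_files": sum(1 for _ in extracted_dir.rglob("*.wav")) if exists else 0,
        }
    return status
```

If the dataset archive reports `completed_marker` as `False`, or reports no `.wav` files, the prepare step likely has nothing to index, which would be consistent with the empty manifests described in this thread.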

lifeiteng / vall-e

'NoneType' object is not iterable #153