open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
https://openhlt.github.io/amphion/
MIT License
4.28k stars, 364 forks

[Help]: NS2 data preprocess problem #213

Closed. CreepJoye closed this issue 3 weeks ago

CreepJoye commented 1 month ago
if cfg.preprocess.extract_duration:
    durations, phones, start, end = duration.get_duration(
        utt, wav, cfg.preprocess
    )
    save_feature(dataset_output, cfg.preprocess.duration_dir, uid, durations)
    save_txt(dataset_output, cfg.preprocess.lab_dir, uid, phones)
    wav = wav[start:end].astype(np.float32)
    wav_torch = torch.from_numpy(wav).to(wav_torch.device)

def get_duration(utt, wav, cfg):
    speaker = utt["Singer"]
    basename = utt["Uid"]
    dataset = utt["Dataset"]
    sample_rate = cfg["sample_rate"]

    # print(cfg.processed_dir, dataset, speaker, basename)
    wav_path = os.path.join(
        cfg.processed_dir, dataset, "raw_data", speaker, "{}.wav".format(basename)
    )
    text_path = os.path.join(
        cfg.processed_dir, dataset, "raw_data", speaker, "{}.lab".format(basename)
    )
    tg_path = os.path.join(
        cfg.processed_dir, dataset, "TextGrid", speaker, "{}.TextGrid".format(basename)
    )

It seems that preprocessing expects "{}.lab" and "{}.TextGrid" files for each utterance, but I only have the raw data and no way to obtain them. How can I solve this?
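
For reference, the paths built in get_duration above imply one .wav and one .lab file per utterance under raw_data/<speaker>/, plus one .TextGrid file under TextGrid/<speaker>/. The sketch below only mirrors that path logic to report which of those files are missing for a given utterance. The helper names expected_paths and missing_files are hypothetical and not part of Amphion, and the sketch does not create the .lab or .TextGrid files itself.

```python
import os


def expected_paths(processed_dir, dataset, speaker, basename):
    """Rebuild the three per-utterance paths that get_duration constructs."""
    raw_dir = os.path.join(processed_dir, dataset, "raw_data", speaker)
    tg_dir = os.path.join(processed_dir, dataset, "TextGrid", speaker)
    return {
        "wav": os.path.join(raw_dir, "{}.wav".format(basename)),
        "lab": os.path.join(raw_dir, "{}.lab".format(basename)),
        "TextGrid": os.path.join(tg_dir, "{}.TextGrid".format(basename)),
    }


def missing_files(processed_dir, utt):
    """Return the kinds of files ("wav", "lab", "TextGrid") not found on disk.

    `utt` is assumed to be a dict with the same keys get_duration reads:
    "Dataset", "Singer" and "Uid".
    """
    paths = expected_paths(processed_dir, utt["Dataset"], utt["Singer"], utt["Uid"])
    return [kind for kind, path in paths.items() if not os.path.exists(path)]
```

Calling missing_files(cfg.preprocess.processed_dir, utt) for each utterance before running the duration step would show whether the transcripts (.lab), the alignments (.TextGrid), or both still need to be generated.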