neosapience / mlp-singer

Official implementation of MLP Singer: Towards Rapid Parallel Korean Singing Voice Synthesis (IEEE MLSP 2021)
MIT License

ValueError: '' is not in list #12

Closed: halfmony closed this issue 1 year ago

halfmony commented 1 year ago

Hello. I am a beginner. I tried to test this project with the Children Song Dataset (CSD) and got the following error:

Traceback (most recent call last):
  File "/home/studio-lab-user/.conda/envs/mlp/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/studio-lab-user/.conda/envs/mlp/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/studio-lab-user/mlp-singer/data/serialize.py", line 53, in <module>
    main(args)
  File "/home/studio-lab-user/mlp-singer/data/serialize.py", line 40, in main
    result = preprocessor(midi_path, text_path, wav_path)
  File "/home/studio-lab-user/mlp-singer/data/preprocess.py", line 32, in __call__
    text = get_phonemes(text_path)
  File "/home/studio-lab-user/mlp-singer/data/preprocess.py", line 61, in get_phonemes
    encoded = encode(line)
  File "/home/studio-lab-user/mlp-singer/data/g2p.py", line 173, in encode
    encoded_prono[-2].coda = RCD.index(p) + len(ONS) + len(NUC)
ValueError: '' is not in list

I built the environment in Amazon SageMaker Studio Lab using the following steps, which I arrived at while resolving other errors.

conda create -n mlp python=3.8
conda activate mlp
pip install -r requirements.txt
  requirements.txt was changed to the following:
  librosa==0.7.0
  mido==1.2.9
  numpy==1.20.2
  scipy==1.6.2
  tensorboard==2.4.0
  torch==1.7.0
  torchaudio==0.7.0
pip install tqdm
pip install -U numba==0.48
pip install -U resampy==0.2.2

Can you find a solution?

jaketae commented 1 year ago

Hello @halfmony, thanks for opening this issue. The error seems to come from the grapheme-to-phoneme module, which I borrowed directly from BEGANSing.

I can't say that I have a solid grasp of the code, but from the looks of it, perhaps RCD should be replaced with COD in mlp-singer/data/g2p.py.

def encode(graph):
    # rest omitted    
        elif p in COD:
            # original: encoded_prono[-2].coda = RCD.index(p) + len(ONS) + len(NUC)
            encoded_prono[-2].coda = COD.index(p) + len(ONS) + len(NUC)
    return encoded_prono[:-1]

def decode(encoded_prono):
        # rest omitted
        if p.coda is not None:
            # original: phone += RCD[p.coda - (len(ONS) + len(NUC))]
            phone += COD[p.coda - (len(ONS) + len(NUC))]
        prono.append(phone)
    return prono

If this doesn't work, I suggest that you try opening an issue in the BEGANSing upstream repo. Thanks!
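
For readers following along, here is a minimal sketch of why the lookup in the traceback can fail: the branch is guarded by `p in COD`, but the index is then taken from `RCD`. The list contents below are toy placeholders, not the real tables from data/g2p.py, and `coda_index` is a hypothetical helper written only for this illustration.

```python
# Toy inventories: these are NOT the real tables from data/g2p.py.
ONS = ["g", "n", "d"]  # hypothetical onsets
NUC = ["a", "i", "u"]  # hypothetical nuclei
COD = ["", "k", "n"]   # hypothetical codas, including an empty entry
RCD = ["k", "n"]       # hypothetical second list that lacks the empty entry


def coda_index(p, table):
    """Look up p in `table` and offset it past the onset and nucleus ranges."""
    return table.index(p) + len(ONS) + len(NUC)


p = ""
assert p in COD  # the guard `elif p in COD` passes for this symbol

try:
    coda_index(p, RCD)  # membership was checked in COD, but RCD is searched
except ValueError as err:
    print("lookup in RCD failed:", err)

code = coda_index(p, COD)  # guard and lookup now use the same list
print(code)
print(repr(COD[code - (len(ONS) + len(NUC))]))  # decoding with COD recovers p
```

With these toy lists, any symbol that is in `COD` but not in `RCD` raises `ValueError` in the original line. Whether the real lists differ in this way is an assumption; the sketch only shows the mechanics of the suggested change.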

halfmony commented 1 year ago

I had made an elementary mistake. I should have renamed the lyric directory in the Children Song Dataset to txt and used that; instead, I was using the dataset's original txt directory as is. I am very sorry for the trouble my mistake caused, and thank you for looking into it.
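
The directory fix described above could be scripted roughly as follows. This is only a sketch: the `lyric` and `txt` directory names come from the comment, while the function name `use_lyric_as_txt` and the backup name `txt_original` are hypothetical choices for this illustration. The demo runs on a throwaway temporary directory, not on a real dataset.

```python
import tempfile
from pathlib import Path


def use_lyric_as_txt(dataset_root, backup_name="txt_original"):
    """Rename `lyric` to `txt`, moving any existing `txt` aside first."""
    root = Path(dataset_root)
    lyric, txt, backup = root / "lyric", root / "txt", root / backup_name
    if not lyric.is_dir():
        raise FileNotFoundError(f"no lyric directory under {root}")
    if txt.exists():
        if backup.exists():
            raise FileExistsError(f"backup already exists: {backup}")
        txt.rename(backup)  # keep the original txt directory, do not delete it
    lyric.rename(txt)
    return txt


if __name__ == "__main__":
    # Demo on a temporary directory that mimics the layout from the comment.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "lyric").mkdir()
        (root / "txt").mkdir()
        (root / "lyric" / "sample.txt").write_text("la la la", encoding="utf-8")
        new_txt = use_lyric_as_txt(root)
        print(sorted(p.name for p in root.iterdir()))
        print(sorted(p.name for p in new_txt.iterdir()))
```

Moving the original `txt` directory aside rather than deleting it keeps the change reversible.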

AbdullahMustafa040 commented 1 year ago

@jaketae how did you resolve that?