Aria-K-Alethia / laughter-synthesis

Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" accepted by INTERSPEECH 2023.
MIT License
70 stars 5 forks

Missing files #4

Open lareina-a opened 6 months ago

lareina-a commented 6 months ago

Hello, I am a student. Could you share ./codes/laughter_200.txt? I want to train a model based on the code you shared.

Aria-K-Alethia commented 4 months ago

Hi, you can generate this file by following the data preprocessing instructions in the README.

mediadream commented 3 months ago

Collected 0 files
0it [00:00, ?it/s]
Error executing job with overrides: ['preprocess=laughter', 'preprocess.path.laughter.path=/content/laughter-synthesis/ver1.0/denoised']
Traceback (most recent call last):
  File "/content/laughter-synthesis/preprocess.py", line 58, in preprocess
    write_filelist(kmeans_files, cfg.view.kmeans_filelist)
  File "/content/laughter-synthesis/utils.py", line 22, in write_filelist
    with open(path, 'w', encoding='utf8') as f:
FileNotFoundError: [Errno 2] No such file or directory: './filelists/laughter_kmeans.txt'


I have the same issue. I followed the Setup instructions in the README, but I get the error above when running the script in Google Colab:

!python3 preprocess.py hydra.output_subdir=null hydra.job.chdir=False preprocess=laughter preprocess.path.laughter.path="/content/laughter-synthesis/ver1.0/denoised"

Based on this error, the script appears to write a file named laughter_kmeans.txt into the ./filelists directory, a path that seems to be set in /content/laughter-synthesis/config/preprocess/laughter.yaml. However, no such directory or file exists.

Here are my observations:

  1. Missing "filelists" directory: the filelists directory might not be included in the initial repo.

  2. Missing "ckpt" and "codes" directories: the laughter.yaml configuration file also references directories named "ckpt" and "codes". These could be pre-existing directories containing checkpoint files and code for the laughter synthesis.

Can you explain how to get the "filelists", "ckpt" and "codes"?

Aria-K-Alethia commented 3 months ago

You can initialize them as empty directories yourself.
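
The suggestion above takes only a few lines. A minimal sketch, assuming the scripts are run from the repository root and that the three directory names mentioned in the question are the ones needed (the helper name is hypothetical, not part of the repository):

```python
from pathlib import Path

def make_output_dirs(root=".", names=("filelists", "ckpt", "codes")):
    """Create the empty output directories if they do not exist yet."""
    created = []
    for name in names:
        path = Path(root) / name
        path.mkdir(parents=True, exist_ok=True)  # no error if already present
        created.append(path)
    return created
```

Calling make_output_dirs() from the repository root before running preprocess.py creates the directories and leaves any existing ones untouched.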

mediadream commented 3 months ago

Thank you for your comment! I've just added empty "filelists" and "codes" directories (the latter with an empty file named laughter_200.txt, because of the error FileNotFoundError: [Errno 2] No such file or directory: './codes/laughter_200.txt'), but this time I got the output below, which suggests the dataset was not created properly.

Can you explain the format of laughter_200.txt, or how to create the dataset from the provided corpus? (For now, I also get the error RuntimeError: value cannot be converted to type float without overflow. when executing the command !python3 train.py preprocess=laughter dataset=laughter)


Collected 0 files
0it [00:00, ?it/s]
CUDA_VISIBLE_DEVICES=1 python3 speech2unit.py --train-filelist ./filelists/laughter_kmeans.txt --nclusters 200 --feature-type hubert --model-path facebook/hubert-base-ls960 --layer 12 --test-filelist ./filelists/laughter_kmeans.txt --kmeans-path ./ckpt/laughter_kmeans_200.model --code-path ./codes/laughter_200.txt --pretrained-kmeans ./ckpt/laughter_kmeans_200.model
Dump code and duration
0it [00:00, ?it/s]
Dump speaker dict to ./data/laughter/speaker.pt
total speaker number: 0
Dump acoustic features
Process 0 under ./data/laughter to get mel, pitch, energy and duration
Process the utterances
0it [00:00, ?it/s]
Done, got 0 results, begin to collect statistics
Done, 0 failed, failed list: []
0 fid succeeded
Pitch normalization, mean: {}, std: {}
Pitch range: [179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000, -179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000]
Energy normalization, mean: {}, std: {}
Energy range: [179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000, -179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000]
Save statistics to ./data/laughter/stats.pt
Process done, total time: 0.00 hours
preprocess done, before: 0, after: 0
[]
train: 0, val: 0, test: 0
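
One possible reading of this log, offered as a guess rather than a confirmed diagnosis: the 309-digit values equal the largest finite 64-bit float. That is what a running minimum and maximum look like when they start at plus and minus that sentinel and are never updated because no utterances were processed. Such a value does not fit in a 32-bit float, which would be consistent with the overflow RuntimeError reported for train.py. The sketch below reproduces the printed numbers; the function name is hypothetical and not taken from the repository.

```python
import sys

def empty_range(values):
    """Track min/max starting from sentinel values, as a stats pass might."""
    lo, hi = sys.float_info.max, -sys.float_info.max
    for v in values:
        lo, hi = min(lo, v), max(hi, v)
    return lo, hi

# With zero processed utterances the sentinels are returned unchanged.
lo, hi = empty_range([])
print("Pitch range: [%.3f, %.3f]" % (lo, hi))

# Largest finite 32-bit float; the sentinel is far beyond it.
FLOAT32_MAX = 3.4028234663852886e38
print("fits in float32:", abs(lo) <= FLOAT32_MAX)
```

If this reading is right, the range values are a symptom of "Collected 0 files" rather than a separate problem.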


Aria-K-Alethia commented 3 months ago

I don't fully understand the error message, since I have never encountered it before. In any case, you should not create laughter_200.txt yourself (you only need to create the directories). laughter_200.txt is created by

CUDA_VISIBLE_DEVICES=1 python3 speech2unit.py --train-filelist ./filelists/laughter_kmeans.txt --nclusters 200 --feature-type hubert --model-path facebook/hubert-base-ls960 --layer 12 --test-filelist ./filelists/laughter_kmeans.txt --kmeans-path ./ckpt/laughter_kmeans_200.model --code-path ./codes/laughter_200.txt --pretrained-kmeans

You may also check speech2unit.py to see whether this command ran correctly.

mediadream commented 3 months ago

Thank you for the script! All I needed was to create a list (txt file) of .wav files. Done with preprocess.py!
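
A minimal sketch of building such a list. The thread does not spell out the expected format, so one .wav path per line is an assumption, and the helper name is hypothetical:

```python
from pathlib import Path

def write_wav_filelist(wav_root, out_path):
    """Write every .wav file found under wav_root to out_path, one per line."""
    wav_files = sorted(str(p) for p in Path(wav_root).rglob("*.wav"))
    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)  # create the target directory
    out.write_text("".join(f"{w}\n" for w in wav_files), encoding="utf8")
    return len(wav_files)
```

For example, write_wav_filelist("/content/laughter-synthesis/ver1.0/denoised", "wav_list.txt") returns the number of paths written. The output name wav_list.txt is a placeholder, and a return value of 0 means the corpus path is wrong or contains no .wav files.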

Now I'm facing the following error when executing the train script (python train.py preprocess=laughter dataset=laughter):

Error executing job with overrides: ['preprocess=laughter', 'dataset=laughter']

Traceback (most recent call last):
  File "train.py", line 40, in train
    lightning_module = BaselineLightningModule(cfg)
  File "/content/laughter-synthesis/lightning_module.py", line 37, in __init__
    self.construct_model()
  File "/content/laughter-synthesis/lightning_module.py", line 44, in construct_model
    self.vocoder = utils.get_vocoder_16k(self.cfg.model.vocoder.model, join(self.ocwd, self.cfg.model.vocoder.path))
  File "/content/laughter-synthesis/utils.py", line 167, in get_vocoder_16k
    ckpt = torch.load(join(path, "g_16k_320hop"), map_location=lambda s, l: s)
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 571, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 229, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 210, in __init__
    super(_open_file, self).__init__(open(name, mode))
No such file or directory: '/content/laughter-synthesis/hifigan/g_16k_320hop'

I have downloaded the vocoder (g_16k_320hop.zip) and placed it in the "hifigan" directory.

Can this zip file be loaded with torch.load? (File "/content/laughter-synthesis/utils.py", line 167, in get_vocoder_16k ckpt = torch.load(join(path, "g_16k_320hop"), map_location=lambda s, l: s))
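
For context, torch.save has written checkpoints as zip archives by default since PyTorch 1.6, so a file ending in .zip is not necessarily meant to be unpacked. A standard-library sketch for inspecting the download; that the vocoder file is such a checkpoint is an assumption, and the helper name is hypothetical:

```python
import zipfile

def looks_like_torch_zip_checkpoint(path):
    """Return True if path is a zip archive that contains a data.pkl entry,
    which is how zip-format torch checkpoints are laid out."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as zf:
        return any(name.endswith("data.pkl") for name in zf.namelist())
```

If this returns True for the downloaded file, it can probably be passed to torch.load as it is, provided its name on disk matches the path the loader opens (hifigan/g_16k_320hop in the traceback).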

Even after changing the Python version (3.10 -> 3.6) and the torch version (1.6.0) to be able to load the zip file, I still get the above error message.

Which versions do you use when executing the train script? Can you provide the specific versions of the packages and of Python? (requirements.txt does not give enough information.)
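
When comparing environments, it helps to post the exact versions in use. A small sketch that collects them; the default package list is an assumption based on the modules that appear in the tracebacks, and importlib.metadata requires Python 3.8 or later:

```python
import sys
from importlib import metadata

def report_versions(packages=("torch", "pytorch-lightning", "hydra-core")):
    """Map 'python' and each package name to its installed version, or None."""
    versions = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions
```

Printing report_versions() in the Colab session gives a compact summary to paste into the issue.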

mediadream commented 3 months ago

I needed to change the file name to "g_16k_320hop.zip" and it worked!

After running the sample.py script:

!python sample.py data-bin/ \
  --path=ckpts/checkpoint_best.pt --task=language_modeling --sampling --temperature=0.7 \
  --seed=12345678 --prompts=data/prompt.txt --output=data/sample.txt --max-len-a=0 --max-len-b=500 \
  --batch-size=1 --fp16 --samples-per-prompt=1

I got the output tokens ("data/sample.txt"), but no audio files. How do I convert the output to audio files?

(I put some sample lines in "data/prompt.txt", but I don't know whether this is the correct format:

1 | laughter
2 | laughter2

)