festvox / festival

Festival Speech Synthesis System

Problem in building clustergen indic voice #13

Closed skmalviya closed 4 years ago

skmalviya commented 6 years ago

Following the guide at "http://festvox.org/bsv/x3528.html", I tried to build a Hindi TTS voice from scratch on a sample of 100 'hindi' wav files ['hindi_0001.wav' - 'hindi_0102.wav'] obtained from 'cmu_indic_hin_ab.tar.bz2'.

Every script works fine up to the following one:

./bin/do_clustergen parallel cluster etc/txt.done.data.train

It produces lots of 'file not found' and 'Segmentation fault (core dumped)' errors.

Example, the final lines of output from the above command:

Segmentation fault (core dumped) Dataset of 0 vectors of 64 parameters from: festival/feats/9r_1.feats Attempt to access channel 1 of 0 channel track Dataset of 0 vectors of 64 parameters from: festival/feats/9r_1.feats Segmentation fault (core dumped) Segmentation fault (core dumped) Dataset of 0 vectors of 64 parameters from: festival/feats/9r=_1.feats Attempt to access channel 1 of 0 channel track Dataset of 0 vectors of 64 parameters from: festival/feats/9r=_1.feats Dataset of 0 vectors of 64 parameters from: festival/feats/9r_1.feats Attempt to access channel 1 of 0 channel track Dataset of 0 vectors of 64 parameters from: festival/feats/9r_1.feats Segmentation fault (core dumped) Dataset of 0 vectors of 64 parameters from: festival/feats/9r=_1.feats Attempt to access channel 1 of 0 channel track Dataset of 0 vectors of 64 parameters from: festival/feats/9r=_1.feats Segmentation fault (core dumped) Dataset of 0 vectors of 64 parameters from: festival/feats/9r_1.feats Attempt to access channel 1 of 0 channel track Dataset of 0 vectors of 64 parameters from: festival/feats/9r_1.feats Segmentation fault (core dumped) Segmentation fault (core dumped) Dataset of 0 vectors of 64 parameters from: festival/feats/9r=_1.feats Attempt to access channel 1 of 0 channel track Dataset of 0 vectors of 64 parameters from: festival/feats/9r=_1.feats Dataset of 0 vectors of 64 parameters from: festival/feats/9r_1.feats Attempt to access channel 1 of 0 channel track Dataset of 0 vectors of 64 parameters from: festival/feats/9r_1.feats Segmentation fault (core dumped) Dataset of 0 vectors of 64 parameters from: festival/feats/9r=_1.feats Attempt to access channel 1 of 0 channel track Dataset of 0 vectors of 64 parameters from: festival/feats/9r=_1.feats Segmentation fault (core dumped) Dataset of 0 vectors of 64 parameters from: festival/feats/9r_1.feats Attempt to access channel 1 of 0 channel track Dataset of 0 vectors of 64 parameters from: festival/feats/9r_1.feats 
Segmentation fault (core dumped)
Dataset of 0 vectors of 64 parameters from: festival/feats/9r=_1.feats
Attempt to access channel 1 of 0 channel track
Dataset of 0 vectors of 64 parameters from: festival/feats/9r=_1.feats
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Dataset of 0 vectors of 64 parameters from: festival/feats/9r_1.feats
Attempt to access channel 1 of 0 channel track
Dataset of 0 vectors of 64 parameters from: festival/feats/9r_1.feats
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Collect trees
SIOD ERROR: wrong type of argument to setcar
BACKTRACE:
0: (set-car! (car tree) vector_num)
1: (clustergen::dump_tree_vectors tree rawtrackfd)
2: (set! tree (clustergen::dump_tree_vectors tree rawtrackfd))
3: (f (car l2))
4: (cons (f (car l2)) r)
5: (set! r (cons (f (car l2)) r))
6: (while l2 (set! r (cons (f (car l2)) r)) (set! l2 (cdr l2)))
7: (mapcar (lambda (unit) (...) ...) unittypes)
8: (if (consp cg:multimodel) (mapcar (...) cg:multimodel) ...)
9: (begin (set! cg:parallel_tree_build t) (build_clustergen "etc/txt.done.data.train"))
closing a file left open: festival/trees/cmu_indic_ss_mcep.rawparams
closing a file left open: festival/trees/cmu_indic_ss_mcep.tree

Please tell me what I am doing wrong.

Note: I have built all the required tools as mentioned in the 'fest_build' script.

saikrishnarallabandi commented 6 years ago

Hi,

This looks like an issue with labeling or utts creation.

Do you have a log of the previous steps so that I can better figure this out?

Also the steps you pointed out are a bit old. I can point you at the latest set of steps.

saikrishnarallabandi commented 6 years ago

Here is the series of steps to build a decent voice assuming Festival, Festvox, speechtools and SPTK are installed (let me know if this is an issue):

Setup the directory structure

$FESTVOXDIR/src/clustergen/setup_cg_indic cmu indic hin ab # (assuming the name of the voice is ab)

(Edited 16 July 2020: I originally entered this wrong and wrote setup_cg. The command is setup_cg_indic, NOT setup_cg. Sorry for the inconvenience.)

Copy the wavefiles and prompts

./bin/get_wavs ${LOCATION}/*.wav
cp ${LOCATION}/txt.done.data etc/txt.done.data

Some Text Processing

./bin/do_build build_prompts etc/txt.done.data
./bin/do_build label etc/txt.done.data
./bin/do_clustergen parallel build_utts etc/txt.done.data
./bin/do_clustergen generate_statenames etc/txt.done.data
./bin/do_clustergen generate_filters etc/txt.done.data

Feature Extraction

./bin/do_clustergen parallel f0_v_sptk etc/txt.done.data
./bin/do_clustergen parallel mcep_sptk etc/txt.done.data
./bin/do_clustergen parallel str_sptk etc/txt.done.data # Strengths of excitation

Combining the features for Machine Learning

mv festvox/clustergen.scm festvox/clustergen.scm.xxx
cat festvox/clustergen.scm.xxx | sed 's/mixed_excitation nil/mixed_excitation t/' | cat >festvox/clustergen.scm
./bin/do_clustergen parallel combine_coeffs_me etc/txt.done.data

Separate train and test splits

./bin/traintest etc/txt.done.data

Training

./bin/do_clustergen parallel cluster etc/txt.done.data.train
./bin/do_clustergen dur etc/txt.done.data.train

Testing

./bin/do_clustergen cg_test resynth cgp etc/txt.done.data.test
./bin/do_clustergen cg_test tts tts etc/txt.done.data.test
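Because each step depends on the output of the one before it, it can help to log every step separately and stop at the first failure. The following is only a sketch: `run_step` is a hypothetical helper, not part of festvox, and it assumes a failing step returns a non-zero exit status, which may not hold for every build script.

```shell
#!/bin/sh
# Hypothetical helper (not part of festvox): run one build step, keep its
# stdout and stderr in a numbered log file, and pass on the exit status.
run_step() {
    step_no=$1
    shift
    log="out${step_no}"
    echo "== step ${step_no}: $*"
    # 2>&1 keeps error messages in the same log as the normal output.
    if "$@" > "$log" 2>&1; then
        return 0
    else
        step_rc=$?
        echo "step ${step_no} failed with status ${step_rc}; see ${log}" >&2
        return "$step_rc"
    fi
}

# Example use inside the voice directory (commands taken from the steps above):
#   run_step 1 ./bin/do_build build_prompts etc/txt.done.data || exit 1
#   run_step 2 ./bin/do_build label etc/txt.done.data || exit 1
```

Even when a step exits with status 0, its log is still worth reading before moving on, since errors can be printed without changing the exit status.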

skmalviya commented 6 years ago

Thanks for such detailed help.

I followed the latest set of steps you suggested and got stuck with the following issues on a similar sample of 100 wav files from 'cmu_indic_hin_ab.tar.bz2':

######################## Issue 1 ############################
./bin/do_build build_prompts etc/txt.done.data
SIOD ERROR: could not open file ./festvox/language_variant.scm
closing a file left open: ./festvox/indic_lexicon.scm
closing a file left open: ./festvox/cmu_indic_hin_lexicon.scm
closing a file left open: festvox/cmu_indic_hin_clunits.scm
closing a file left open: festvox/build_clunits.scm

(Note: this error also occurred with the earlier scripts. I had resolved it by putting a file named "language_variant.scm" with content 'hin' in the 'cmu_indic_hin_ab/festvox/' directory.)

######################## Issue 2 ############################
./bin/do_clustergen parallel str_stpk etc/txt.done.data # Strengths of excitation
do_clustergen: unknown options str_stpk tmpdir/dobuild_parallelworker.31140.4
do_clustergen: unknown options str_stpk tmpdir/dobuild_parallelworker.31139.3
do_clustergen: unknown options str_stpk tmpdir/dobuild_parallelworker.31137.1
do_clustergen: unknown options str_stpk tmpdir/dobuild_parallelworker.31141.5
do_clustergen: unknown options str_stpk tmpdir/dobuild_parallelworker.31138.2
do_clustergen: unknown options str_stpk tmpdir/dobuild_parallelworker.31136.0
do_clustergen: unknown options str_stpk tmpdir/dobuild_parallelworker.31142.6
do_clustergen: unknown options str_stpk tmpdir/dobuild_parallelworker.31155.7
do_clustergen: unknown options str_stpk tmpdir/dobuild_parallelworker.31173.10
do_clustergen: unknown options str_stpk tmpdir/dobuild_parallelworker.31171.9
do_clustergen: unknown options str_stpk tmpdir/dobuild_parallelworker.31177.11
do_clustergen: unknown options str_stpk tmpdir/dobuild_parallelworker.31186.13
do_clustergen: unknown options str_stpk tmpdir/dobuild_parallelworker.31168.8
do_clustergen: unknown options str_stpk tmpdir/dobuild_parallelworker.31183.12
do_clustergen: unknown options str_stpk tmpdir/dobuild_parallelworker.31187.14
do_clustergen: unknown options str_stpk tmpdir/dobuild_parallelworker.31204.15

######################## Issue 3 ############################
./bin/traintest etc/txt.done.data

hindi_0003 COMBINE_COEFFS (f0,mcep_deltas,str,v)
hindi_0002 COMBINE_COEFFS (f0,mcep_deltas,str,v)
hindi_0004 COMBINE_COEFFS (f0,mcep_deltas,str,v)
hindi_0016 COMBINE_COEFFS (f0,mcep_deltas,str,v)
hindi_0001 COMBINE_COEFFS (f0,mcep_deltas,str,v)
hindi_0005 COMBINE_COEFFS (f0,mcep_deltas,str,v)
hindi_0006 COMBINE_COEFFS (f0,mcep_deltas,str,v)
hindi_0007 COMBINE_COEFFS (f0,mcep_deltas,str,v)
cat: str/hindi_0001.str: No such file or directory
cat: str/hindi_0002.str: No such file or directory
cat: str/hindi_0004.str: No such file or directory
cat: str/hindi_0006.str: No such file or directory
cat: str/hindi_0005.str: No such file or directory
cat: str/hindi_0007.str: No such file or directory
cat: str/hindi_0016.str: No such file or directory
cat: str/hindi_0003.str: No such file or directory
..............
issue3.txt

######################## Issue 4 ############################
./bin/do_clustergen cg_test resynth cgp etc/txt.done.data.test
Error reading ESPS file /home/shrikant/festival_hindi_tts/indic/cmu_indic_hin_ab//festival/trees/cmu_indic_hin_mcep.params
Cannot load track: /home/shrikant/festival_hindi_tts/indic/cmu_indic_hin_ab//festival/trees/cmu_indic_hin_mcep.params
SIOD ERROR: could not open file /home/shrikant/festival_hindi_tts/indic/cmu_indic_hin_ab//festival/trees/cmu_indic_hin_mcep.tree
awk: cmd. line:2: fatal: division by zero attempted
awk: cmd. line:2: fatal: division by zero attempted
awk: cmd. line:2: fatal: division by zero attempted
awk: cmd. line:2: fatal: division by zero attempted

I did not go any further after this many errors.

saikrishnarallabandi commented 6 years ago

Seems to be an issue with versions.

What version of Festival are you using? And of SPTK?


skmalviya commented 6 years ago

The same as mentioned in the fest_build script. I just ran the script and then sourced "export_various_PATHS.sh" in order to export the paths. export_various_PATHS.txt fest_build.txt

saikrishnarallabandi commented 6 years ago

This seems to be the issue: SIOD ERROR: could not open file ./festvox/language_variant.scm

I see that you have latest versions. Just to be sure, can you create a new directory and run only the prompt building command ( ./bin/do_build build_prompts) that gave this error.

Let me know if this happens again

skmalviya commented 6 years ago

Shall I set the content of the file ./festvox/language_variant.scm to hin? And what about issues 2, 3 and 4?

saikrishnarallabandi commented 6 years ago

Issues 2 through 4 are caused by 1.

The content of /festvox/language_variant.scm should be 'hin' by default.
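A quick way to confirm this before running any build step is to check the file directly. This is a minimal sketch: `check_language_variant` is a hypothetical helper, not a festvox command, and it only checks that the file exists and is non-empty, then prints whatever it contains.

```shell
# Hypothetical check (not part of festvox): confirm that the voice directory
# contains festvox/language_variant.scm and show what is in it.
check_language_variant() {
    voice_dir=${1:-.}
    file="$voice_dir/festvox/language_variant.scm"
    # -s is true only if the file exists and has a size greater than zero.
    if [ ! -s "$file" ]; then
        echo "missing or empty: $file" >&2
        return 1
    fi
    echo "language variant: $(cat "$file")"
}

# Example use from inside the voice directory:
#   check_language_variant . || echo "re-run the setup step in a fresh directory"
```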

saikrishnarallabandi commented 6 years ago

Note that the sample build script in fest_build.txt is for English.

When building an indic voice, the command to setup directory structure is: $FESTVOXDIR/src/clustergen/setup_cg_indic cmu indic hindi ab # 4 arguments instead of 3

skmalviya commented 6 years ago

Hello saikrishna!

I followed the instructions exactly. The commands I executed are as below:

mkdir cmu_indic_hin_ab
cd cmu_indic_hin_ab
$FESTVOXDIR/src/clustergen/setup_cg_indic cmu indic hin ab

I then put the wavs in the wav/ folder and txt.done.data inside etc/ (both contain 100 entries). After this I ran a script containing all the new steps described earlier:

sh script

script.txt

Please see, I attached complete directory in a zip. cmu_indic_hin_ab

skmalviya commented 6 years ago

One more point! During the execution of step 13, a continuous stream of segmentation fault errors appears:

./bin/do_clustergen parallel cluster etc/txt.done.data.train

Errors at step 13.txt

For the other steps (1 --> 12), the out files are in the zipped folder attached above.

I am not able to figure out where the problem actually is, because I don't see any difference between the earlier errors and the current ones.

Thanks for the support and help saikrishna btw.

saikrishnarallabandi commented 6 years ago

For some reason I am unable to download the directory. Can you attach out1 and out2 here?

skmalviya commented 6 years ago

Yes please have a look

out1.txt out2.txt

saikrishnarallabandi commented 6 years ago

Nothing wrong with these. Next, I suspect some issue in feature extraction. Can you attach out6, out7 and out8?

skmalviya commented 6 years ago

OK out6.txt out7.txt out8.txt

saikrishnarallabandi commented 6 years ago

The problem is with out8. That step extracts the "strengths of excitation" used for mixed excitation. The script fails, saying it doesn't recognize the command 'str_sptk'.

There are two things we can do for this:

(1) Ignore this and continue voice building.

In this case, change the next step from

./bin/do_clustergen parallel combine_coeffs_me etc/txt.done.data

to

./bin/do_clustergen parallel combine_coeffs_v etc/txt.done.data

combine_coeffs_me uses strengths of excitation; combine_coeffs_v ignores them.

In this case, we also need to modify the clustergen.scm file to indicate that we are not using mixed excitation. An easy way to do this is the following:

cp festvox/clustergen.scm.xxx festvox/clustergen.scm # (We previously made it explicit in this file, through steps 9 and 10, that we would be using mixed excitation, so we are just reverting.)

Now you can run the clustering step:

./bin/do_clustergen parallel cluster etc/txt.done.data.train

(2) The other (and real) solution is to dig deeper into why str_sptk is failing. Can you paste the file ./bin/do_clustergen here so that I can inspect it. It should support the argument 'str_sptk'
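Whichever option is chosen, it is useful to know which utterances actually lack a strengths file before combining coefficients. A minimal sketch follows; `missing_features` is a hypothetical helper, and it assumes the usual txt.done.data layout with one prompt per line in the form `( hindi_0001 "text" )`, so that the utterance ID is the second whitespace-separated field.

```shell
# Hypothetical helper (not part of festvox): list utterance IDs from a prompt
# file that have no non-empty feature file in the given directory.
# Assumes one prompt per line, in the form: ( hindi_0001 "text" )
missing_features() {
    prompts=$1   # e.g. etc/txt.done.data
    dir=$2       # e.g. str
    ext=$3       # e.g. str
    awk '{ print $2 }' "$prompts" | while read -r id; do
        # Skip blank lines in the prompt file.
        [ -n "$id" ] || continue
        # Report the ID if its feature file is absent or empty.
        [ -s "$dir/$id.$ext" ] || echo "$id"
    done
}

# Example use inside the voice directory:
#   missing_features etc/txt.done.data str str
```

If every ID is listed, the extraction step did not run at all, which points at the command itself rather than at individual wav files.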

skmalviya commented 6 years ago

For (2), please bear with me; the do_clustergen file is here:

do_clustergen.txt

For (1), let me incorporate it.

saikrishnarallabandi commented 6 years ago

Wait. I just noticed that the spelling is incorrect in step 8 of the script you shared.

It should be str_sptk not str_stpk
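A misspelled target like this can be caught before launching a parallel run by checking whether the build script mentions it at all. This is only a sketch: `has_target` is a hypothetical helper, and it assumes that supported targets appear literally in the text of ./bin/do_clustergen.

```shell
# Hypothetical pre-flight check (not part of festvox): does the build script
# mention this target name anywhere? Assumes targets appear literally in it.
has_target() {
    script=$1
    target=$2
    # -F treats the target as a fixed string, -q suppresses output.
    grep -q -F -- "$target" "$script"
}

# Example use inside the voice directory:
#   has_target ./bin/do_clustergen str_sptk || echo "unknown target: str_sptk"
```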

saikrishnarallabandi commented 6 years ago

Once things run smoothly through 'cluster', I'd say run the duration model without 'parallel', that is,

./bin/do_clustergen dur etc/txt.done.data.train

instead of

./bin/do_clustergen parallel dur etc/txt.done.data.train

saikrishnarallabandi commented 6 years ago

I realize that I made that spelling error when I shared the steps. Sorry for that :)

skmalviya commented 6 years ago

Still the same situation. Please have a look again at what I did this time.

  1. I ran $FESTVOXDIR/src/clustergen/setup_cg_indic cmu indic hin ab again, after emptying the directory except for the script.
  2. Copied the wav files and txt.done.data to their respective directories again.
  3. Ran the command: sh script (attached: script.txt)

Got the output files: out1.txt out2.txt out3.txt

out6.txt out7.txt out8.txt out9.txt out11.txt

Again_Errors at step 13.txt

Complete directory, zipped

saikrishnarallabandi commented 6 years ago

Step 11 has an error on the last phone, z_3.

Can you run that step again?

Once that runs successfully, you should be fine.

I also notice that there are seg faults in the step 13 log. The segmentation faults might also be occurring because too little heap space is allocated. There is a parameter called SIODHEAPSIZE in do_clustergen. Increasing that should alleviate this fault.

skmalviya commented 6 years ago

By step 11, do you mean this command?

./bin/do_clustergen parallel cluster etc/txt.done.data.train > out11

I checked: SIODHEAPSIZE=20000000 in the ./bin/do_clustergen file. I added one more zero, giving SIODHEAPSIZE=200000000. Now it's giving the error: "WALLOC: failed to malloc -424509440 bytes"

saikrishnarallabandi commented 6 years ago

@step 11 yes

Just double the heap size and see ( not multiplying by 10). This is usually not necessary tbh
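Doubling the value by hand is easy to get wrong, so here is a small sketch that does it and keeps a backup. `double_heap` is a hypothetical helper, not part of festvox; it assumes the script contains an assignment of the form SIODHEAPSIZE=20000000, as quoted above, and it only changes the first such value it finds.

```shell
# Hypothetical helper (not part of festvox): double the SIODHEAPSIZE=<number>
# assignment in a script, keeping the original as <script>.bak.
double_heap() {
    script=$1
    # Extract the first numeric value assigned to SIODHEAPSIZE.
    old=$(sed -n 's/.*SIODHEAPSIZE=\([0-9][0-9]*\).*/\1/p' "$script" | head -n 1)
    if [ -z "$old" ]; then
        echo "no SIODHEAPSIZE=<number> line in $script" >&2
        return 1
    fi
    new=$((old * 2))
    cp "$script" "$script.bak"
    # Writing back into the existing file keeps its permissions.
    sed "s/SIODHEAPSIZE=$old/SIODHEAPSIZE=$new/" "$script.bak" > "$script"
    echo "SIODHEAPSIZE: $old -> $new"
}

# Example use inside the voice directory:
#   double_heap ./bin/do_clustergen
```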

skmalviya commented 6 years ago

With SIODHEAPSIZE updated to 25000000, I ran the command again:

./bin/do_clustergen parallel cluster etc/txt.done.data.train > out11

The situation is still the same. out11.txt ErrorStep13.txt # These errors appear on the terminal while the command is executing

saikrishnarallabandi commented 6 years ago

Hi,

I was able to download the zip. When I ran the step, it did finish without any issues.

Here are the last lines from log:

RMSE 0.1516 Correlation is 0.9867 Mean (abs) Error 0.0963 (0.1171)
Dataset of 74 vectors of 67 parameters from: festival/feats/9r=_3.feats
Dataset of 74 vectors of 67 parameters from: festival/feats/9r=_3.feats
RMSE 0.1330 Correlation is 0.9899 Mean (abs) Error 0.0837 (0.1035)
Dataset of 73 vectors of 67 parameters from: festival/feats/9r=_2.feats
Dataset of 73 vectors of 67 parameters from: festival/feats/9r=_2.feats
RMSE 0.1317 Correlation is 0.9898 Mean (abs) Error 0.0789 (0.1054)
Dataset of 24 vectors of 67 parameters from: festival/feats/9r=_1.feats
Dataset of 24 vectors of 67 parameters from: festival/feats/9r=_1.feats
RMSE 0.1605 Correlation is 0.9862 Mean (abs) Error 0.0977 (0.1273)
RMSE 0.1527 Correlation is 0.9856 Mean (abs) Error 0.0958 (0.1189)
RMSE 0.1371 Correlation is 0.9911 Mean (abs) Error 0.0832 (0.1089)
RMSE 0.1264 Correlation is 0.9922 Mean (abs) Error 0.0766 (0.1005)
Collect trees
184 unittypes as 1829 subunittypes dumped
Tree models and vector params dumped

I was able to finish the duration model ( next step) and generate test samples too.

This is weird since I am essentially continuing from your folder structure

saikrishnarallabandi commented 6 years ago

@awbcmu Can you look into this

festvox commented 6 years ago

Given the failure from the missing language_variant.scm file, I suspect initialization with the wrong version might be the culprit. Also note that the option should be str_sptk, not str_stpk.

Another suggestion concerns running with the parallel option: if you run out of memory and something dies, that might be hard to detect in the next step.

I would regenerate the templates, and then copy in the waveforms and txt.done.data.
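One way to make a dead worker easier to spot is to capture both stdout and stderr of a step in one log and count the failure messages in it. The sketch below uses a hypothetical helper, `count_failures`; the patterns are simply the error strings quoted in this thread, not an exhaustive list.

```shell
# Hypothetical log scan (not part of festvox): count the failure signatures
# seen in this thread in a single log file.
count_failures() {
    log=$1
    for pattern in 'Segmentation fault' 'No such file or directory' 'SIOD ERROR' 'EST Error'; do
        # grep -c prints 0 but exits non-zero when nothing matches,
        # so "|| true" keeps the count without signalling an error.
        n=$(grep -c -F -- "$pattern" "$log" || true)
        printf '%s: %s\n' "$pattern" "$n"
    done
}

# Example use inside the voice directory (2>&1 also captures messages that
# would otherwise only appear on the terminal):
#   ./bin/do_clustergen cluster etc/txt.done.data.train > out11 2>&1
#   count_failures out11
```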

saikrishnarallabandi commented 6 years ago

@shrikant6153 can you run it without 'parallel' once

skmalviya commented 6 years ago

With or without parallel:

./bin/do_clustergen cluster etc/txt.done.data.train > out11

I am stuck with the same error. A sample of it is given below:

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
paste: 1081.f0: No such file or directory
sort: cannot read: festival/feats/i_1.feats.unsorted: No such file or directory
sort: cannot read: festival/disttabs/i_1.mcep.unsorted: No such file or directory
rm: cannot remove 'festival/disttabs/i_1.mcep.unsorted': No such file or directory
-=-=-=-=-=- EST Error -=-=-=-=-=-
Tried to extract channel number 0 from track with only 0 channels

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
paste: 1081.f0: No such file or directory
sort: cannot read: 'festival/feats/i:_1.feats.unsorted': No such file or directory
sort: cannot read: 'festival/disttabs/i:_1.mcep.unsorted': No such file or directory
rm: cannot remove 'festival/disttabs/i:_1.mcep.unsorted': No such file or directory
-=-=-=-=-=- EST Error -=-=-=-=-=-
Tried to extract channel number 0 from track with only 0 channels

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
paste: 1081.f0: No such file or directory
sort: cannot read: festival/feats/i_1.feats.unsorted: No such file or directory
sort: cannot read: festival/disttabs/i_1.mcep.unsorted: No such file or directory
rm: cannot remove 'festival/disttabs/i_1.mcep.unsorted': No such file or directory
-=-=-=-=-=- EST Error -=-=-=-=-=-
Tried to extract channel number 0 from track with only 0 channels

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
paste: 1081.f0: No such file or directory
sort: cannot read: 'festival/feats/i:_1.feats.unsorted': No such file or directory
sort: cannot read: 'festival/disttabs/i:_1.mcep.unsorted': No such file or directory
rm: cannot remove 'festival/disttabs/i:_1.mcep.unsorted': No such file or directory
-=-=-=-=-=- EST Error -=-=-=-=-=-
Tried to extract channel number 0 from track with only 0 channels

skmalviya commented 6 years ago

And then lots of segmentation faults. as I said earlier....

saikrishnarallabandi commented 6 years ago

The errors suggest an issue with combining coefficients.

The mixed excitation flag is on in your clustergen file, but I saw that the excitation strengths were not being used in the combined coefficients. So, let's do this again. Run this sequence:

Combine coeffs with mixed excitation

./bin/do_clustergen parallel combine_coeffs_me etc/txt.done.data

Ensure that mixed excitation flag is on

1) Open festvox/clustergen.scm
2) Check for the line that says (set! cg:mixed_excitation
3) Make sure that it is terminated by t) and not nil)

Then run cluster

./bin/do_clustergen parallel cluster etc/txt.done.data
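The flag check can also be done from the command line instead of opening the file in an editor. This is a sketch under one assumption, taken from the description above: that the setting sits on a single line of the form `(set! cg:mixed_excitation t)` or `(set! cg:mixed_excitation nil)`. The helper name `mixed_excitation_state` is hypothetical.

```shell
# Hypothetical helper (not part of festvox): print the value (t or nil) that
# cg:mixed_excitation is set to in a clustergen.scm file.
# Assumes a line of the form: (set! cg:mixed_excitation t)
mixed_excitation_state() {
    file=${1:-festvox/clustergen.scm}
    # In basic regular expressions a bare "(" is literal and \( \) captures.
    sed -n 's/.*(set! cg:mixed_excitation[[:space:]]*\([a-z]*\)).*/\1/p' "$file" | head -n 1
}

# Example use inside the voice directory:
#   [ "$(mixed_excitation_state)" = "t" ] || echo "mixed excitation is off"
```

The value printed here should agree with the combine step that was run: t with combine_coeffs_me, nil with combine_coeffs_v.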

festvox commented 6 years ago

Are you sure that SPTK is installed and running properly? At the command prompt,

$SPTKDIR/bin/x2x -h

should print out the help for the x2x command. If this doesn't work, it would explain why F0 extraction and me/str extraction aren't working.
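Before running x2x it is worth confirming that SPTKDIR is set in the current shell and that the binary is where the build scripts expect it. A minimal sketch, using a hypothetical helper `check_sptk`:

```shell
# Hypothetical check (not part of festvox or SPTK): is SPTKDIR set, and is
# the x2x binary present and executable under it?
check_sptk() {
    if [ -z "${SPTKDIR:-}" ]; then
        echo "SPTKDIR is not set" >&2
        return 1
    fi
    if [ ! -x "$SPTKDIR/bin/x2x" ]; then
        echo "not executable: $SPTKDIR/bin/x2x" >&2
        return 1
    fi
    echo "found $SPTKDIR/bin/x2x"
}

# Example use:
#   check_sptk && "$SPTKDIR/bin/x2x" -h
```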

skmalviya commented 6 years ago

This is what I got. Looks still the same... out11.txt image

skmalviya commented 6 years ago

On the execution : $SPTKDIR/bin/x2x -h

image

festvox commented 6 years ago

It's only a small number of phonemes (which are relatively rare) that seem not to have an F0 generated, which makes me think they didn't align properly. Can you try it with 500 utterances, not just 100?
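To see exactly which unit types are affected, one option is to list the per-unit feature files that came out empty. This is a sketch that assumes the "Dataset of 0 vectors" messages correspond to zero-length files under festival/feats, which the thread does not confirm; `empty_feats` is a hypothetical helper.

```shell
# Hypothetical check (not part of festvox): list per-unit feature files that
# are completely empty. Assumes they live under festival/feats as *.feats.
empty_feats() {
    dir=${1:-festival/feats}
    # -size 0 matches only files with no content at all.
    find "$dir" -type f -name '*.feats' -size 0
}

# Example use inside the voice directory:
#   empty_feats festival/feats
```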

skmalviya commented 6 years ago

Even on 500 utterances it's giving the same error... script.txt

out1.txt out2.txt out3.txt

out6.txt out7.txt out8.txt

out9.txt out11.txt

The segmentation faults are the same as before. Is there a problem in the installation or the path setting? I used fest_build and export_various_PATHS.sh for installation and path setting:

fest_build.txt export_various_PATHS.txt

Apart from all this, please suggest what and where the problem could be, because Mr. Sai Krishna has built it (link) successfully on his system, which I am unable to do.

saikrishnarallabandi commented 6 years ago

Can you also put out4 and out5? It might be useful just in case

skmalviya commented 6 years ago

They are empty actually because in those steps no output is being generated.

saikrishnarallabandi commented 6 years ago

makes sense. wanted to confirm that. sorry no idea why this is failing.

saikrishnarallabandi commented 6 years ago

what is the config of your machine?

skmalviya commented 6 years ago

image

saikrishnarallabandi commented 6 years ago

this should be more than enough.

i was thinking of ways to resolve the issue. would it help if i share a docker container?

skmalviya commented 6 years ago

It would be great help to me. I think if you could give me the required scripts (or complete setup) in working condition that could be used to build Indic TTS from scratch, would be more helpful.

saikrishnarallabandi commented 6 years ago

You can pull this image: docker pull srallaba/festival_demo:24Aug2018

Run it as : docker run -i -t srallaba/festival_demo:24Aug2018 bash

You should see two scripts: fest_build.sh and buildvoice.sh

1) Run fest_build.sh -> sets up the environment and installs things
2) Export the paths:
cd build
export ESTDIR=$(pwd)/speech_tools
export FESTVOXDIR=$(pwd)/festvox
export SPTKDIR=$(pwd)/SPTK
cd ../
3) Run buildvoice.sh
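After exporting, it is worth checking that each variable really points at a directory, because a path built from the literal word pwd rather than from the output of the pwd command would be exported without any error. The helper below, `require_dirs`, is hypothetical and only a sketch.

```shell
# Hypothetical check (not part of festvox): every named environment variable
# must be set and must point to an existing directory.
require_dirs() {
    rc=0
    for name in "$@"; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ] || [ ! -d "$value" ]; then
            echo "$name is unset or not a directory: '$value'" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Example use after step 2:
#   require_dirs ESTDIR FESTVOXDIR SPTKDIR || exit 1
```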

Let me know if there are any issues.

I have included the Dockerfile and stuff in the image so that you can debug if something goes haywire.

skmalviya commented 6 years ago

Eventually, I built the same set of 100 utterances with a similar script on another OS (Arch Linux). It worked like a charm. But I don't understand what causes this error in Ubuntu 16.04.

BTW, thanks for supporting me patiently. Thanks a lot, Saikrishna and Festvox, for the help.

ddavout commented 6 years ago

Hi, what I can tell is that I have never succeeded in building a clustergen voice with utf8 files under Ubuntu, but as I was more interested in HTS, I gave up. There I noticed that dumpfeats gave rise to segmentation faults until I used the dumpfeats from the Festival installed on the system (2.4:release). Now I have a beautiful Clunits and I am starting to play with HTK.

Do I have to mention that I am not an expert ...? :)


shashankrnr32 commented 5 years ago


I am getting the following error when I run the line ./bin/do_clustergen parallel cluster etc/txt.done.data.train

Setting clustergen params
Setting up numbered_files
Feature dump
SIOD ERROR: could not open file festival/disttabs/unittypes
BACKTRACE:
0: (load "festival/disttabs/unittypes" t)
1: (mapcar (lambda (u) (list (string-append u))) (load "festival/disttabs/unittypes" t))
2: (set! cg::unittypes (mapcar (lambda (u) (list (string-append u))) (load "festival/disttabs/unittypes" t)))
3: (begin (set! cg:parallel_tree_build t) (build_clustergen "etc/txt.done.data.train"))

Can somebody take a look and let me know where I went wrong?

ddavout commented 5 years ago

Just a suggestion: first run ./bin/do_clustergen cluster etc/txt.done.data, that is to say without parallel. You may be able to trace the error better.


shashankrnr32 commented 5 years ago


It's the same. I traced the error to these scripts: bin/do_clustergen and festvox/clustergen_build.scm. (screenshot from 2019-02-01 21-34-00)

The script is loading from festival/disttabs/unittypes, but that directory is empty. I have followed every step given on the website. I am also attaching the terminal image for reference.