gillesdegottex / pulsemodel

Pulse Model vocoder
Apache License 2.0

Error encountered when running synthesis.py (File "synthesis.py", line 178, in synthesize) #21

Closed: dengliqun closed this issue 6 years ago

dengliqun commented 6 years ago

I got the following error when running copy-synthesis with the "awb_arctic_a0040.wav" example:

Traceback (most recent call last):
  File "synthesis.py", line 374, in <module>
    main(sys.argv[1:])
  File "synthesis.py", line 371, in main
    synthesizef(args.fs, shift=args.shift, dftlen=args.dftlen, ff0=args.f0, flf0=args.logf0, fspec=args.spec, ffwlspec=args.fwlspec, ffwcep=args.fwcep, fmcep=args.mcep, fnm=args.nm, ffwnm=args.fwnm, nm_cont=args.nm_cont, fpdd=args.pdd, fmpdd=args.mpdd, fsyn=args.synth, verbose=args.verbose)
  File "synthesis.py", line 340, in synthesizef
    syn = synthesize(fs, f0s, SPEC, NM=NM, nm_cont=nm_cont, verbose=verbose)
  File "synthesis.py", line 178, in synthesize
    if winlen>dftlen: raise ValueError('winlen({})>dftlen({})'.format(winlen, dftlen)) # pragma: no cover
ValueError: winlen(801)>dftlen(118)

I extracted the f0, spec and nm features with analysis.py (I'm sure pyworld is used):

analysisf("awb_arctic_a0040.wav", shift=args.shift, dftlen=args.dftlen, finf0txt=args.inf0txt, f0_min=args.f0_min, f0_max=args.f0_max, ff0="awb_arctic_a0040.f0", f0_log=args.f0_log, finf0bin=args.inf0bin, fspec="awb_arctic_a0040.spec", spec_mceporder=59, spec_fwceporder=args.spec_fwceporder, spec_nbfwbnds=args.spec_nbfwbnds, fpdd="pdd/"+fileid+".pdd", pdd_mceporder=59, fnm="awb_arctic_a0040.nm", nm_nbfwbnds=None, preproc_fs=args.preproc_fs, preproc_hp=args.preproc_hp, verbose=args.verbose)

and ran the synthesis command as:

python synthesis.py awb_arctic_a0040.resyn.wav --f0 awb_arctic_a0040.f0 --spec awb_arctic_a0040.spec --nm awb_arctic_a0040.nm

Please help. Thx.
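The dftlen(118) in the error is consistent with the DFT length being inferred from the width of the feature file: a half spectrum with N bins corresponds to a DFT length of (N-1)*2, and a 59th-order cepstrum has 60 coefficients. A minimal sketch of that arithmetic, assuming this is how the value is derived (the helper name is hypothetical and not part of pulsemodel):

```python
def dftlen_from_bins(nbins):
    """DFT length implied by a half spectrum with `nbins` frequency bins."""
    return (nbins - 1) * 2


# A full half spectrum of 2049 bins corresponds to a DFT length of 4096.
print(dftlen_from_bins(2049))

# spec_mceporder=59 gives 60 values per frame; if those are read as a
# half spectrum, the implied DFT length is only 118.
print(dftlen_from_bins(59 + 1))
```

If that assumption holds, the 801-sample window cannot fit because the compressed feature file is being interpreted as a raw spectrum.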

gillesdegottex commented 6 years ago

Did you replace the f0 estimator? Or maybe change the minimum value for f0?

I've updated master with a fix for this case and some warnings. Please pull the latest master.

dengliqun commented 6 years ago

@gillesdegottex Thanks very much for your reply. I used the REAPER estimator, as recommended by default. I have updated to the latest version, but it seems that the same error still exists.

gillesdegottex commented 6 years ago

Can you copy/paste the whole log of the process here?

dengliqun commented 6 years ago

@gillesdegottex here is the full log:

1. It runs fine with the default settings, in which the spec order and nbfwbnds would be 2049.

2. But it fails if these two parameters are set to the recommended values.

1). Run "analysis.py":

python analysis.py test/awb_arctic_a0040.wav --f0 awb_arctic_a0040.f0 --spec awb_arctic_a0040.spec --nm awb_arctic_a0040.nm --spec_mceporder 59 --nm_nbfwbnds 33

PML Analysis (dur=4.000s, fs=16000Hz, f0 in [60,600]Hz, shift=0.005s, dftlen=4096)
Output F0 (797,) in: awb_arctic_a0040.f0
Output Spectrogram size=(797, 60) in: awb_arctic_a0040.spec
Output Noise Mask size=(797, 33) in: awb_arctic_a0040.nm

2). Then run "synthesis.py":

python synthesis.py awb_arctic_a0040.resyn.wav --f0 awb_arctic_a0040.f0 --spec awb_arctic_a0040.spec --nm awb_arctic_a0040.nm

Traceback (most recent call last):
  File "synthesis.py", line 376, in <module>
    main(sys.argv[1:])
  File "synthesis.py", line 373, in main
    synthesizef(args.fs, shift=args.shift, dftlen=args.dftlen, ff0=args.f0, flf0=args.logf0, fspec=args.spec, ffwlspec=args.fwlspec, ffwcep=args.fwcep, fmcep=args.mcep, fnm=args.nm, ffwnm=args.fwnm, nm_cont=args.nm_cont, fpdd=args.pdd, fmpdd=args.mpdd, fsyn=args.synth, verbose=args.verbose)
  File "synthesis.py", line 342, in synthesizef
    syn = synthesize(fs, f0s, SPEC, NM=NM, nm_cont=nm_cont, verbose=verbose)
  File "synthesis.py", line 79, in synthesize
    raise ValueError('spectrogram size {} and NM size {} do not match.'.format(SPEC.shape, NM.shape)) # pragma: no cover
ValueError: spectrogram size (797, 60) and NM size (797, 33) do not match.

3. The above error goes away when I run analysis.py with "--nm_nbfwbnds 60":

PML Analysis (dur=4.000s, fs=16000Hz, f0 in [60,600]Hz, shift=0.005s, dftlen=4096)
Output F0 (797,) in: awb_arctic_a0040.f0
Output Spectrogram size=(797, 60) in: awb_arctic_a0040.spec
Output Noise Mask size=(797, 60) in: awb_arctic_a0040.nm

but a new error then occurs during synthesis:

PML Synthesis (dur=3.98s, fs=16000Hz, f0 in [73,168]Hz, shift=0.005s, dftlen=118)
PML Synthesis (dur=3.98s, fs=16000Hz, f0 in [73,168]Hz, shift=0.005s, dftlen=118)
synthesis.py:108: UserWarning:

WARNING: The maximum window length (883) is bigger than the DFT length (118). Please, increase the DFT length of your spectral features (the second dimension) or check if the f0 curve has extremly low values and try to clip them to higher values (at least higher than 50Hz). The f0 curve has been clipped to 542.372881356Hz.

  warnings.warn('\n\nWARNING: The maximum window length ({}) is bigger than the DFT length ({}). Please, increase the DFT length of your spectral features (the second dimension) or check if the f0 curve has extremly low values and try to clip them to higher values (at least higher than 50Hz). The f0 curve has been clipped to {}Hz.\n\n'.format(winlenmax, dftlen, winnbper*fs/float(dftlen))) # pragma: no cover
Forcing binary noise mask
Traceback (most recent call last):
  File "synthesis.py", line 375, in <module>
    main(sys.argv[1:])
  File "synthesis.py", line 372, in main
    synthesizef(args.fs, shift=args.shift, dftlen=args.dftlen, ff0=args.f0, flf0=args.logf0, fspec=args.spec, ffwlspec=args.fwlspec, ffwcep=args.fwcep, fmcep=args.mcep, fnm=args.nm, ffwnm=args.fwnm, nm_cont=args.nm_cont, fpdd=args.pdd, fmpdd=args.mpdd, fsyn=args.synth, verbose=args.verbose)
  File "synthesis.py", line 341, in synthesizef
    syn = synthesize(fs, f0s, SPEC, NM=NM, nm_cont=nm_cont, verbose=verbose)
  File "synthesis.py", line 179, in synthesize
    if winlen>dftlen: raise ValueError('The window length ({}) is bigger than the DFT length ({}). Please, increase the dftlen of your spectral features or check if the f0 curve has extremly low values and try to clip them to higher values (at least higher than 50[Hz])'.format(winlen, dftlen)) # pragma: no cover
ValueError: The window length (801) is bigger than the DFT length (118). Please, increase the dftlen of your spectral features or check if the f0 curve has extremly low values and try to clip them to higher values (at least higher than 50[Hz])
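The clipping value in the warning follows from the expression in the quoted warnings.warn call, winnbper*fs/float(dftlen). The sketch below reproduces that number. Note that winnbper=4 is an assumption inferred from the logged values (542.37 * 118 / 16000 = 4), not read from the source, and the function name is hypothetical:

```python
def min_f0_for_dftlen(fs, dftlen, winnbper=4):
    """Lowest f0 (Hz) for which a window of `winnbper` periods fits in `dftlen` samples.

    winnbper=4 is an assumed value inferred from the log in this thread.
    """
    return winnbper * fs / float(dftlen)


# With the inferred dftlen of 118, f0 would have to stay above about 542 Hz.
print(min_f0_for_dftlen(16000, 118))

# With the default dftlen of 4096, the floor is far below the 60 Hz analysis minimum.
print(min_f0_for_dftlen(16000, 4096))
```

Under that assumption, a speech f0 range of [73,168] Hz cannot satisfy the floor for dftlen=118, which matches the failure even after clipping.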

gillesdegottex commented 6 years ago

If you analyse with --nm_nbfwbnds, you then have to use --fwnm during synthesis. The same goes for --spec_mceporder, which needs --mcep at synthesis. Try this first and tell me if there are any more errors.
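The pairing described in this reply can be sketched as a small lookup that picks the synthesis flags from the analysis flags that were used. This is only an illustration: the helper and its argument names are hypothetical, and only flags named in this thread appear. Whether synthesis needs further options in this mode is not stated here.

```python
# Analysis flag -> synthesis flag that must replace the plain one.
PAIRED_FLAG = {
    "--spec_mceporder": "--mcep",  # used instead of --spec
    "--nm_nbfwbnds": "--fwnm",     # used instead of --nm
}


def synthesis_args(out_wav, f0_file, spec_file, nm_file, analysis_flags):
    """Build a synthesis.py argument list consistent with the analysis flags used."""
    spec_flag = PAIRED_FLAG["--spec_mceporder"] if "--spec_mceporder" in analysis_flags else "--spec"
    nm_flag = PAIRED_FLAG["--nm_nbfwbnds"] if "--nm_nbfwbnds" in analysis_flags else "--nm"
    return ["python", "synthesis.py", out_wav,
            "--f0", f0_file, spec_flag, spec_file, nm_flag, nm_file]


args = synthesis_args(
    "awb_arctic_a0040.resyn.wav",
    "awb_arctic_a0040.f0",
    "awb_arctic_a0040.spec",
    "awb_arctic_a0040.nm",
    analysis_flags=["--spec_mceporder", "--nm_nbfwbnds"],
)
print(" ".join(args))
```

For the analysis command from the log, this prints a synthesis command that passes the .spec file via --mcep and the .nm file via --fwnm.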

gillesdegottex commented 6 years ago

Seems to be solved.