neuro-team-femto / cleese

Combinatorial Expressive Speech Engine
MIT License

Configuration parameters #11

Closed seunggookim closed 2 years ago

seunggookim commented 2 years ago

I am trying to replicate some 'pitch' transformations done with the MATLAB version and am having difficulty getting matching results. I matched all other parameters, but I've noticed that some parameters do not exist in the Python version:

st_pars.svp.b_preserveTransients = 1;
st_pars.svp.b_normalizeOutput    = 1;
st_pars.pitch.b_preserveEnvelope = 1; 

Output normalization can easily be done via RMS equalization, but the envelopes still look different: [image] Also, the phases look different: [image]

Are there any options to reduce this discrepancy?
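
For reference, the RMS equalization mentioned above could look roughly like the sketch below. This is not part of cleese: the helper names `rms` and `match_rms` are hypothetical, and the sample values are toy data standing in for real audio. It only rescales the overall level, so it would not be expected to change the envelope or phase differences.

```python
import math


def rms(samples):
    """Root-mean-square level of a non-empty sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def match_rms(signal, reference):
    """Scale `signal` so that its RMS level equals that of `reference`.

    Hypothetical helper for illustration; not a cleese function.
    """
    level = rms(signal)
    if level == 0.0:
        # Silent input: there is nothing to scale.
        return list(signal)
    gain = rms(reference) / level
    return [x * gain for x in signal]


# Toy data standing in for audio samples (assumed values).
reference = [0.5, -0.5, 0.5, -0.5]      # e.g. the MATLAB output
transformed = [0.1, -0.2, 0.1, -0.2]    # e.g. the Python output
equalized = match_rms(transformed, reference)
```

After this step, `equalized` has the same RMS level as `reference`, which is what the MATLAB `b_normalizeOutput` comparison needs for loudness only.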

jjau commented 2 years ago

Hello. These parameters (transient preservation, spectral envelope preservation) belong to an old, deprecated version of cleese, which used the closed-source SuperVP audio engine in MATLAB. They are not implemented in the current Python version, which uses a simpler but open-source implementation of the phase vocoder. Future versions could add these two features if needed, but this is not currently a priority.