neuro-team-femto / cleese

Combinatorial Expressive Speech Engine
MIT License

BPFs when passed to cleese.process_data ignore BPFtype in the config file and are automatically interpolated as ramps #33

Open jjau opened 2 months ago

jjau commented 2 months ago

Up until at least v2.3.2, CLEESE assumes a slightly different format for BPFs passed directly with the bpf=BPF keyword than for BPFs it generates itself. When generated by the system, BPFs can be square or ramp, depending on the BPFtype parameter in the config file. When custom BPFs are passed directly, however, that parameter is ignored (which is the issue reported here), and the values passed are always interpolated linearly between time points, as if the BPF were a ramp. So, if a square BPF is expected, one has to prepare the custom BPF before passing it to cleese.process_data, so that its linear interpolation produces the square BPF that the function should eventually use. This basically means copying the operation that creates square BPFs in engines/phase_vocoder/bpf.py:

duration = (src_wav.shape[0])/sr

# we expect the time points to start at zero
# (note: this assignment is overwritten by the next statement)
stretch_bpf_times2 = np.insert(stretch_bpf_times, 0,0)

# duplicate each interior time point (0.01/2 s before and after it)
stretch_bpf_times2 = np.sort(np.concatenate((
            np.array(stretch_bpf_times[1:-1])-0.01/2,
            np.array(stretch_bpf_times[1:-1])+0.01/2)))

# add the start and end points
if stretch_bpf_times[-1] > duration:
    stretch_bpf_times = np.delete(stretch_bpf_times, -1)
stretch_bpf_times2 = np.append(stretch_bpf_times2, duration)
stretch_bpf_times2 = np.insert(stretch_bpf_times2, 0, 0.)

# duplicate the values
stretch_bpf_val2 = np.repeat(stretch_bpf_val,2)

This should probably be fixed, so that the behaviour is consistent with that used in random generation.
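For anyone needing the workaround in the meantime, here is a minimal, dependency-free sketch of the same idea. The helper names (square_to_ramp_points, interpolate) and the transition parameter are hypothetical and not part of CLEESE; the sketch assumes one value per segment and a list of segment boundaries that starts at zero and ends at the file duration. The interpolate function is only there to check that linear interpolation of the result gives plateaus.

```python
def square_to_ramp_points(boundaries, values, transition=0.01):
    """Return (times, vals) whose linear interpolation approximates a square BPF.

    boundaries: increasing segment edges, e.g. [0, t1, ..., duration]
    values:     one value per segment, i.e. len(boundaries) - 1 entries
    transition: width in seconds of the ramp between two plateaus (assumed > 0)
    """
    if len(values) != len(boundaries) - 1:
        raise ValueError("need exactly one value per segment")
    if transition <= 0:
        raise ValueError("transition must be positive")
    half = transition / 2
    times = [boundaries[0]]
    # each interior edge becomes two points, just before and just after it
    for edge in boundaries[1:-1]:
        times.extend([edge - half, edge + half])
    times.append(boundaries[-1])
    # each segment value is used at both ends of its plateau
    vals = [v for v in values for _ in range(2)]
    return times, vals


def interpolate(times, vals, t):
    """Linear interpolation between breakpoints, as a ramp BPF is read at time t."""
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        if t0 <= t <= t1:
            return vals[i] + (vals[i + 1] - vals[i]) * (t - t0) / (t1 - t0)
    raise ValueError("t is outside the BPF range")


times, vals = square_to_ramp_points([0.0, 1.0, 2.0], [1.0, 2.0])
print(times, vals)
print(interpolate(times, vals, 0.5), interpolate(times, vals, 1.5))
```

Here times and vals play the role of stretch_bpf_times2 and stretch_bpf_val2 in the snippet above; they would still need to be converted to whatever array layout cleese.process_data expects for its bpf argument.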

jjau commented 2 months ago

Note that another possibility is to build the BPF with PhaseVocoder.create_BPF, forcing the time points:

time_points = np.array([0.027, 0.634, 1.137, 1.647, 2.185, 2.649, 3.181])  # values found in Audacity
num_points = len(time_points)
bpf = PhaseVocoder.create_BPF(
    'stretch', config_file, time_points, num_points, 0)

However, that call also generates random values at those time points, rather than using custom ones.