Closed rabitt closed 4 years ago
I guess the main question is whether this happens in `chord.encode_many` or `sonify.chroma`:
https://github.com/craffel/mir_eval/blob/master/mir_eval/sonify.py#L324
Any thoughts?
The problem seems to be in `sonify.time_frequency`, particularly in this section for linearly transitioning between two chords:
```python
# Pre-allocate output signal
output = np.zeros(length)
time_centers = np.mean(times, axis=1) * float(fs)
for n, frequency in enumerate(frequencies):
    # Get a waveform of length samples at this frequency
    wave = _fast_synthesize(frequency)
    # Interpolate the values in gram over the time grid
    if len(time_centers) > 1:
        gram_interpolator = interp1d(
            time_centers, gram[n, :],
            kind='linear', bounds_error=False,
            fill_value=0.0)
    # If only one time point, create constant interpolator
    else:
        gram_interpolator = _const_interpolator(gram[n, 0])
    # Scale each time interval by the piano roll magnitude
    for m, (start, end) in enumerate((times * fs).astype(int)):
        # Clip the timings to make sure the indices are valid
        start, end = max(start, 0), min(end, length)
        # add to waveform
        output[start:end] += (
            wave[start:end] * gram_interpolator(np.arange(start, end)))
```
Because the `x` values for `scipy.interpolate.interp1d` are the centers of each interval, the period before the center of the first chord interval and the period after the center of the last chord interval fall outside the range the function will interpolate within. Since `fill_value` is set to 0, points outside that range get set to 0.
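To make the symptom concrete, here's a minimal toy sketch (not mir_eval's actual data; the two one-second intervals and unit magnitudes are made up for illustration) showing how an interpolator built this way zeroes everything outside the interval centers:

```python
import numpy as np
from scipy.interpolate import interp1d

# Hypothetical toy gram row: two intervals, [0.0, 1.0] and [1.0, 2.0] seconds,
# so the interval centers sit at 0.5 and 1.5.
time_centers = np.array([0.5, 1.5])
magnitudes = np.array([1.0, 1.0])  # both chords fully active

f = interp1d(time_centers, magnitudes,
             kind='linear', bounds_error=False, fill_value=0.0)

# Inside the center-to-center span we get the expected magnitude...
print(f(1.0))   # -> 1.0
# ...but anything before the first center or after the last center is
# zeroed, silencing the first and last half-intervals.
print(f(0.25))  # -> 0.0
print(f(1.75))  # -> 0.0
```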
Scipy has a nice option for `interp1d`'s `fill_value` that can deal with this:

> If a two-element tuple, then the first element is used as a fill value for x_new < x[0] and the second element is used for x_new > x[-1]. Anything that is not a 2-element tuple (e.g., list or ndarray, regardless of shape) is taken to be a single array-like argument meant to be used for both bounds as below, above = fill_value, fill_value.
Taking advantage of that, we can just tell the interpolator to treat anything left of the first center as `gram[n, 0]` and anything right of the last center as `gram[n, -1]`:
```python
# Interpolate the values in gram over the time grid
if len(time_centers) > 1:
    gram_interpolator = interp1d(
        time_centers, gram[n, :],
        kind='linear', bounds_error=False,
        fill_value=(gram[n, 0], gram[n, -1]))
```
This maintains the current linear chord-transition behavior (which is anchored at the centers of adjacent intervals) but no longer silences the first half of the first interval or the second half of the last interval.
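A small sketch of the fixed behavior, again using made-up interval centers and magnitudes rather than mir_eval's data: with a two-element `fill_value` tuple, the edges hold the first and last magnitudes while the region between centers still transitions linearly:

```python
import numpy as np
from scipy.interpolate import interp1d

# Hypothetical centers and magnitudes for one frequency row
time_centers = np.array([0.5, 1.5])
magnitudes = np.array([0.8, 0.2])

f = interp1d(time_centers, magnitudes,
             kind='linear', bounds_error=False,
             fill_value=(magnitudes[0], magnitudes[-1]))

print(f(0.0))   # -> 0.8, held at the first value before the first center
print(f(1.0))   # -> 0.5, linear transition between centers
print(f(2.0))   # -> 0.2, held at the last value after the last center
```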
I made the one-line change in a fork and can create a PR if this is a satisfactory solution.
When running the chord sonification code, I'm noticing that the first chord always starts late and the last chord ends early. For example:
Since the first chord starts at 0 seconds and the last one ends at 2 seconds, I'd expect to see audio for the entire duration of the clip, but I don't:
cc @bmcfee