Parisson / TimeSide

scalable audio processing framework and server written in Python
https://timeside.ircam.fr/docs/
GNU Affero General Public License v3.0

[waveform] On small segments, min = max #184

Closed gnuletik closed 4 years ago

gnuletik commented 4 years ago

When getting a waveform from the API on a small segment, a lot of points have min = max.

>> console.log(waveformApi.min.length)
1024
>> console.log(waveformApi.min.filter((min, idx) => min === waveformApi.max[idx]))
Array(681) [ -0.6583428382873535, -0.7883555889129639, -0.8257352709770203, -0.9179143309593201, -0.9416095614433289, -0.9901597499847412, -0.9986107349395752, -1.000450849533081, -0.9931865334510803, -0.9481220245361328, … ]

In this example, 681 of the 1024 points have identical min and max values.
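The same check can be sketched in Python. This is only an illustration: it assumes the /waveform/ response has been parsed into a dict with `min` and `max` lists of equal length, and `count_flat_points` is a hypothetical helper name, not part of TimeSide.

```python
def count_flat_points(waveform):
    """Return how many pixels have identical min and max values.

    `waveform` is assumed to be the parsed JSON body of the /waveform/
    response, holding "min" and "max" lists of equal length.
    """
    return sum(1 for lo, hi in zip(waveform["min"], waveform["max"]) if lo == hi)


# Tiny hand-made payload for illustration only, not real API output.
sample = {"min": [-0.5, -0.2, 0.1], "max": [-0.5, 0.3, 0.1]}
print(count_flat_points(sample))
```

Against the response described above, this count would be expected to match the 681 reported by the console check.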

API URL: https://sandbox.wasabi.telemeta.org/timeside/api/items/edca699d-270e-447e-9f90-66272cc741e5/waveform/?start=0.231&stop=0.262&nb_pixels=1024

Player URL: https://ircam-web.github.io/timeside-player/#/item/edca699d-270e-447e-9f90-66272cc741e5?start=231&stop=262

Note: I added a check in the player. If the waveform contains duplicated values, an error message is printed in the browser's console.

This leads to an invalid or empty waveform, as we can see in the following screenshot of the waveform's points.

Screenshot from 2020-04-24 16-34-13

Tointoin commented 4 years ago

On this small segment: 0.262 - 0.231 = 0.031 sec

In [39]: p = (d | w)
In [40]: from timeside.server.models import Item
In [41]: from timeside.plugins.decoder.aubio import AubioDecoder
In [42]: item = Item.objects.get(uuid='edca699d-270e-447e-9f90-66272cc741e5')
In [43]: d = AubioDecoder(item.source_file.path)
In [44]: p = (d | w)
In [45]: p.run()
In [46]: w.input_samplerate
Out[46]: 44100

Given the audio file's sample rate of 44100 Hz (as seen above), the number of samples is 0.031 * 44100 = 1367.1.

Well, according to the way WaveformSerializer is implemented, with a nb_pixel of 1024, these 1367 samples are distributed among 1024 frames, delimited by the 1025 values of time in the /waveform/ response.

As a result, some of these frames contain only one sample. For those frames, min and max are both exactly that sample's value.
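The arithmetic can be checked with a short sketch. It assumes the samples are split as evenly as possible between consecutive frame boundaries. That is an assumption made for illustration, not necessarily how WaveformSerializer slices the signal, and `frame_sizes` is a hypothetical helper.

```python
def frame_sizes(nb_samples, nb_pixels):
    """Split nb_samples into nb_pixels contiguous frames, as evenly as possible.

    The frame boundaries are the nb_pixels + 1 integer sample positions
    i * nb_samples // nb_pixels (an assumed partition scheme).
    """
    bounds = [i * nb_samples // nb_pixels for i in range(nb_pixels + 1)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


sizes = frame_sizes(1367, 1024)
single_sample_frames = sum(1 for size in sizes if size == 1)
print(single_sample_frames)  # 681
```

With 1367 samples over 1024 frames, every frame holds either one or two samples, so 343 frames get two samples and 681 get one. That matches the 681 points with min = max reported in the first comment.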

Am I right @yomguy???

I see 2 ways to correct this bug:

I think it is anyway the right time to discuss each and every one of these side cases.

Tointoin commented 4 years ago

The second way has been chosen (and pulled on the sandbox):

@gnuletik, beware that a request with a nb_pixel higher than the cap value for the given start and stop will get a response with nb_pixel set to that cap.

For instance, the API URL mentioned above with nb_pixel=1024 returns only nb_pixel=683 in the response.

Note that WaveformSerializer still handles reversed waveforms (switching start and stop) thanks to the use of abs().
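A minimal sketch of such a cap follows. The thread does not state the exact formula: requiring at least two samples per pixel is an assumption, chosen because it reproduces the 683 reported above (1367 // 2 = 683). `cap_nb_pixels` and `min_samples_per_pixel` are hypothetical names, not TimeSide's API.

```python
def cap_nb_pixels(start, stop, samplerate, nb_pixels, min_samples_per_pixel=2):
    """Clamp nb_pixels so each pixel covers at least min_samples_per_pixel samples.

    abs() keeps the result identical when start and stop are swapped
    (reversed waveform). The two-samples-per-pixel rule is an assumption.
    """
    nb_samples = int(abs(stop - start) * samplerate)
    cap = max(1, nb_samples // min_samples_per_pixel)
    return min(nb_pixels, cap)


print(cap_nb_pixels(0.231, 0.262, 44100, 1024))  # 683
print(cap_nb_pixels(0.262, 0.231, 44100, 1024))  # 683, reversed segment
```

Requests on longer segments are unaffected: with ten seconds of audio at 44100 Hz the cap is far above 1024, so the requested nb_pixel is returned unchanged.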

gnuletik commented 4 years ago

Awesome :) It works like a charm !

Thanks @Tointoin !