vixalien / decibels

Play audio files
https://gitlab.gnome.org/vixalien/decibels
GNU General Public License v3.0
52 stars, 17 forks

Normalise the waveform #26

Closed RedAuburn closed 10 months ago

RedAuburn commented 11 months ago

This is technically an issue with the music, I think, but normalising the waveform would be a good fix.

image

vixalien commented 11 months ago

Thanks for bringing this up. However, my music skills are limited, to say the least. I'm not sure how to adjust the waveform. I'll try asking the GStreamer experts.

On Mon, 16 Oct 2023, 21:05 RedAuburn wrote:

This is technically an issue with the music, but normalising the waveform would be a good fix

[image: image] https://user-images.githubusercontent.com/26939824/275623836-3e074e6c-3c0d-4bbe-a829-a73a8c020882.png


aadilayub commented 11 months ago

yeah same here

image

prometoys commented 11 months ago

I think this is a problem with overproduced songs (compression etc.). How does it look in Ardour, Audacity etc.?

I don't think you can do much about it.

RedAuburn commented 11 months ago

Looks like there may already be a plugin for it in GStreamer: https://coaxion.net/blog/2020/07/live-loudness-normalization-in-gstreamer-experiences-with-porting-a-c-audio-filter-to-rust/

vixalien commented 10 months ago

I have this working on a local branch. However, it introduces several problems: building Rust modules adds complexity, and the normalisation is painfully slow.

I'm evaluating this solution to see if it's worth it, and how I can make the experience better, maybe by caching the waveform and generating it progressively.

prometoys commented 10 months ago

Honestly, I'm not sure if it's worth it. This is mostly a problem of "overproduced" music files. What is the goal of Decibels? To cover all audio files or mainly voice recordings?

Maybe it could be a hack to tinker a bit with the values, like in src/waveform.ts#L309.

As far as I understand the GStreamer plugin, it is intended to normalize the audio. But wouldn't it be enough to get the peak values (as numbers) and just normalize the view instead of the audio? Maybe I missed something important.
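
A minimal sketch of the "normalize the view" idea, assuming the peaks are already available as linear amplitudes between 0 and 1. The name `scalePeaksForView` is hypothetical and not taken from the Decibels source. Only the drawn values change, the audio is left untouched.

```typescript
// Hypothetical helper (not from the Decibels source): rescale peaks for
// drawing so that the loudest peak fills the full height of the view.
function scalePeaksForView(peaks: number[]): number[] {
  // Largest absolute amplitude in the data (0 for an empty array).
  const max = peaks.reduce((m, p) => Math.max(m, Math.abs(p)), 0);

  // Silence or no data: nothing to scale, and avoid dividing by zero.
  if (max === 0) return peaks.map(() => 0);

  return peaks.map((p) => p / max);
}

console.log(scalePeaksForView([0.125, 0.25, 0.5])); // [0.25, 0.5, 1]
```

This kind of scaling only helps quiet recordings. If every peak is already at 1, dividing by the maximum changes nothing.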

vixalien commented 10 months ago

This is mostly a problem of "overproduced" music files. What is the goal of Decibels? To cover all audio files or mainly voice recordings?

This is true, but I feel like most audio tracks are overproduced. By handling those, we make the app useful for more audio files. I'd like to know how apps like Audacity handle it.

Maybe it could be a hack to tinker a bit with the values, like in src/waveform.ts#L309.

I've already tried that, but AFAIK the code just "translates" data from the logarithmic scale to the linear scale. If anyone can tune the numbers to get an acceptable result, I could integrate it.
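
For reference, the standard conversion from a decibel value to a linear amplitude is amplitude = 10^(dB / 20). A minimal sketch, assuming the peaks arrive in decibels relative to full scale. `dbToLinear` is a hypothetical name, and this is not claimed to be what src/waveform.ts does at that line.

```typescript
// Hypothetical helper: convert a level in decibels (0 dB = full scale)
// to a linear amplitude, where 0 dB maps to 1.
function dbToLinear(db: number): number {
  return Math.pow(10, db / 20);
}

// Illustrative values only, not measured from any real file.
const examplePeaksDb = [0, -1, -2, -6, -20];
const exampleLinear = examplePeaksDb.map(dbToLinear);
console.log(exampleLinear.map((v) => v.toFixed(3)));
```

With this mapping, 0 dB is 1, -6 dB is about 0.5 and -20 dB is 0.1. Peaks that sit within a decibel or two of full scale all land between roughly 0.8 and 1, which is why tuning the constants alone may not separate them much.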

But wouldn't it be enough to get the peak values (as numbers) and just normalize the view instead of the audio?

I'm not sure how to do this. In the case of overproduced music, the peaks just show up like [1,1,1,1,1,1,...], i.e. the max peak. When the audio is normalized, the peaks are still high but not [1,1,1,1,...].

With the normalization module, I'm just concerned that Decibels will eat more RAM and CPU, especially on mobile devices with limited processing power like the PinePhone and Librem 5.

vixalien commented 10 months ago

I invite you to download the latest builds from #29 and check if the pull request works for you.

I would also like to see if the feature is as fast on mobile.

prometoys commented 10 months ago

I tested it on my laptop. With a 3-minute music file everything is OK. But while Decibels from Flathub needs less than 10 s to load the waveform for a 40-minute interview recording, the UI from #29 is unresponsive for more than 3 minutes. It plays the audio file and reacts to play/pause, but the UI doesn't update: no window resize, the scrollbar is useless, etc.

CPU: i5, 12th Gen, energy save mode.

Is overproduced music exactly [1,1,1,1,1,1,...] or more like [0.99,0.97,0.98,0.99,0.96,...]?
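
The distinction matters for drawing. If the peaks vary slightly below 1, one display-only option is to stretch the observed range to the full height of the view. A minimal sketch of that min-max rescaling, assuming linear peaks. `stretchPeaks` is a hypothetical helper, not something from Decibels or from PR #29.

```typescript
// Hypothetical helper: map the smallest peak to 0 and the largest to 1,
// so that small differences between peaks become visible.
function stretchPeaks(peaks: number[]): number[] {
  if (peaks.length === 0) return [];

  // reduce() instead of Math.min(...peaks) so very long arrays are safe.
  const min = peaks.reduce((m, p) => Math.min(m, p), Infinity);
  const max = peaks.reduce((m, p) => Math.max(m, p), -Infinity);
  const range = max - min;

  // All peaks identical (for example exactly [1, 1, 1, ...]):
  // there is no variation to reveal, so return a copy unchanged.
  if (range === 0) return peaks.slice();

  return peaks.map((p) => (p - min) / range);
}

console.log(stretchPeaks([0.5, 0.75, 1])); // [0, 0.5, 1]
```

The trade-off is that the drawn height no longer reflects absolute loudness, and a track whose peaks are all exactly 1 still renders as a flat block.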

With overproduced music you would still have the timecode as a hint. I would vote against normalization, if it eats up so much CPU and memory, because then the app is less useful for long recordings and weak devices.

Questions:

Which solution do other tools use?

vixalien commented 10 months ago

I agree that PR #29 is overkill. In PR #30, I've switched to an alternative that is similar to what Audacity uses. Testing is welcome.

prometoys commented 10 months ago

Hi,

On my laptop it works well with a music file, but I have problems testing it with recordings.

It works with a very short recording of a few seconds, but with longer files, even just 1:00 minute, the app freezes. I waited up to two minutes, but nothing changed.

test-recordings.zip

(I recorded them all with Sound Recorder from the Fedora Flatpak repo, but it seems to be the same version as on Flathub, 43.beta.)

There seems to be a bug with some files, which is present in stable/main too (tested with Decibels from Flathub). I got a warning like this:

(com.vixalien.decibelsDevel:2): Gjs-WARNING **: 12:01:12.187: Value 18446744073709551615 cannot be safely stored in a JS Number and may be rounded

Instead of the correct length the label on the left side shows something like 5124095:34:33.

image
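
The number in the warning is 2^64 - 1, the largest unsigned 64-bit value, which GStreamer defines as `GST_CLOCK_TIME_NONE` to mean "no valid time". Read as nanoseconds, it works out to exactly the label quoted above. The sketch below shows the arithmetic plus one possible guard. `formatClock` and `formatDuration` are hypothetical helpers, and the assumption that the label is derived this way is not confirmed in this thread.

```typescript
const NS_PER_SECOND = 1e9;
const MAX_SAFE = 9007199254740991; // 2^53 - 1, largest exactly stored integer

// Hypothetical formatter: nanoseconds to [h:]mm:ss, with no validation.
function formatClock(ns: number): string {
  const totalSeconds = Math.floor(ns / NS_PER_SECOND);
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  const pad = (n: number): string => (n < 10 ? "0" + n : "" + n);
  return hours > 0
    ? hours + ":" + pad(minutes) + ":" + pad(seconds)
    : pad(minutes) + ":" + pad(seconds);
}

// Hypothetical guard: treat values a JS Number cannot hold exactly
// (such as the 2^64 - 1 sentinel) as "duration unknown".
function formatDuration(ns: number): string {
  if (!isFinite(ns) || ns < 0 || ns > MAX_SAFE) return "--:--";
  return formatClock(ns);
}

// 2^64 - 1 is rounded to 2^64 when stored in a JS Number.
const sentinel = Math.pow(2, 64);
console.log(formatClock(sentinel)); // 5124095:34:33
console.log(formatDuration(sentinel)); // --:--
```

The placeholder string `--:--` is only an illustration. What the label should show for an unknown duration is a design choice for the app.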

vixalien commented 10 months ago

There seems to be a bug with some files, which is present in stable/main too (tested with Decibels from Flathub). I got a warning like this:

This seems to be a different bug. I recommend you create a new issue. Thanks