sergree / matchering

🎚️ Open Source Audio Matching and Mastering
https://pypi.org/project/matchering/
GNU General Public License v3.0

python version #4

Closed yoyolicoris closed 4 years ago

yoyolicoris commented 4 years ago

Created a Python rewrite of Matchering, because MATLAB is not that easy for a normal user to acquire.

Differences

I have run this script on my laptop very smoothly, but I can't compare the differences because I currently don't have a device with MATLAB installed (the last time I used MATLAB was in college, lol).

sergree commented 4 years ago

Wow! It's so impressive, thank you! I need to test and review it, and compare its results to the original results.

🙏🙏🙏

sergree commented 4 years ago

Tested on a fresh Ubuntu 18.04 install: you need to `sudo apt install libsndfile-dev` first.

I continue to review...

sergree commented 4 years ago

Example of usage: `python3 matchering.py "/vagrant/target32.wav" "/vagrant/reference.wav" --output_dir="/vagrant"`

I continue to review...

sergree commented 4 years ago

@yoyololicon, the result is impressive! You've done so much. You are the man! I merged it, of course.

What do you recommend for sound limiting? The original Matchering used Voxengo Elephant with its default preset. We need a similarly transparent brickwall limiter here. I messaged Elephant's author about our work, but he doesn't want to participate in open source stuff.


What I would like to do by myself next:

1) ~~Edit your `_to_db()` function because it gives me wrong results. It recommends adjusting the amplitude by about 3.953035391119278 dB when `normalize_scale = 2.484869234834199`, but the correct value needs to be 7.906070782238555. See this site to test.~~
2) Refactor the code a bit, split the code into modules.
3) Django-based containerized web app for it using Docker. What do you think? @yoyololicon
4) Make a README, etc.


If I comment out this line `final_output /= normalize_scale`, the MATLAB and Python results are almost the same. It's so cool. 👍 💯

sergree commented 4 years ago

@yoyololicon About (1): Done.

https://github.com/wokashi-rg/Matchering/commit/d33cea52d578fe307d9b35b14cef7cafbc69b3ea

yoyolicoris commented 4 years ago

> @yoyololicon, the result is impressive! You've done so much. You are the man! I merged it, of course.

Thanks man!

> What do you recommend for sound limiting? The original Matchering used Voxengo Elephant with its default preset. We need a similarly transparent brickwall limiter here. I messaged Elephant's author about our work, but he doesn't want to participate in open source stuff.

In my opinion, letting users decide to use their favorite limiter plugin would be the best solution, but I agree a limiter is definitely needed to make the final result more complete. I remember that ReaPlugs has a built-in limiter effect, and it's open source, but it's written in their own JSFX language, so it would need some effort to rewrite.

> Edit your `_to_db()` function because it gives me wrong results. It recommends adjusting the amplitude by about 3.953035391119278 dB when `normalize_scale = 2.484869234834199`, but the correct value needs to be 7.906070782238555. See this site to test.

I forgot to take the power of magnitude, so dumb Orz
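
A quick numeric check of the fix (a minimal sketch using the value from your list; the point is just the 20-vs-10 factor):

```python
import numpy as np

normalize_scale = 2.484869234834199

# dB of a power ratio vs. dB of an amplitude ratio
power_db = 10 * np.log10(normalize_scale)      # ~3.953 dB, the wrong value that was reported
amplitude_db = 20 * np.log10(normalize_scale)  # ~7.906 dB, the correct value

print(power_db, amplitude_db)
```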

> Django-based containerized web app for it using Docker. What do you think? @yoyololicon

I don't know much about web programming, but it sounds interesting.

> If I comment out this line `final_output /= normalize_scale`, the MATLAB and Python results are almost the same. It's so cool.

I added this line because soundfile would clip values outside [-1, 1], and it sounded horrible when I tested it on my mixes. So I normalize the signal before writing it to the file, and display the compensation value to the user. If the results are the same on your computer, I think removing this line is OK.
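
In other words (a tiny illustration with made-up numbers, not the actual project code):

```python
import numpy as np

final_output = np.array([0.5, 1.7, -2.1])        # hypothetical over-range samples
normalize_scale = np.abs(final_output).max()     # 2.1
final_output = final_output / normalize_scale    # now within [-1, 1], nothing gets clipped
print(f"recommend adjusting amplitude by about {20 * np.log10(normalize_scale):.3f} dB")
```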

yoyolicoris commented 4 years ago

> Tested on a fresh Ubuntu 18.04 install: you need to `sudo apt install libsndfile-dev` first.

We can add this command to the README quick start section. Also, maybe we could just use the Python standard library module `wave` instead, if only .wav files are acceptable. This would remove the dependency on libsndfile.
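
Something along these lines (a rough sketch that assumes 16-bit PCM input; other sample widths and float WAVs would need extra handling):

```python
import wave
import numpy as np

def read_wav(path):
    # read a 16-bit PCM WAV with the standard library only
    with wave.open(path, 'rb') as w:
        sr = w.getframerate()
        n_channels = w.getnchannels()
        frames = w.readframes(w.getnframes())
    audio = np.frombuffer(frames, dtype=np.int16).astype(np.float64) / 32768.0
    return audio.reshape(-1, n_channels), sr
```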

sergree commented 4 years ago

@yoyololicon

> We can add this command to the README quick start section.

Done

> I don't know much about web programming, but it sounds interesting.

Yes, I would be happy to do that, hopefully this month

> In my opinion, letting users decide to use their favorite limiter plugin would be the best solution, but I agree a limiter is definitely needed to make the final result more complete. I remember that ReaPlugs has a built-in limiter effect, and it's open source, but it's written in their own JSFX language, so it would need some effort to rewrite.

Yes, I agree too. We could give the user the choice:

• Use our brickwall limiter
• Don't use our brickwall limiter, and use some external tool instead

If you have free time to add a good brickwall limiter to your code, I would do the rest (Django web app + Docker container + description, etc.).

yoyolicoris commented 4 years ago

@sergree

> Yes, I agree too. We could give the user the choice:
>
> • Use our brickwall limiter
> • Don't use our brickwall limiter, and use some external tool instead
>
> If you have free time to add a good brickwall limiter to your code, I would do the rest (Django web app + Docker container + description, etc.).

Sounds like a good idea. Although it might need some time to study, I can take the limiter part.

sergree commented 4 years ago

@yoyololicon

https://stackoverflow.com/questions/34833846/how-to-amplify-sounds-without-distortion-in-python

I could check it out. Looks interesting.

sergree commented 4 years ago

Yes, arctan_compressor is what we need here. I need to play with the factor param to get better results. I'll report the results later.

sergree commented 4 years ago

Or not. :)

https://en.wikipedia.org/wiki/Dynamic_range_compression#/media/File:Clipping_compared_to_limiting.svg

arctan_compressor is soft clipping. But what we need here is a brickwall limiter.
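
To illustrate the difference (my own toy comparison, not code from the linked answer): a soft clipper bends the whole waveform, while a brickwall limiter leaves everything below the ceiling untouched and shapes the gain with attack/release instead of clipping.

```python
import numpy as np

def arctan_soft_clip(x, factor=2.0):
    # soft clipping: every sample is bent, even quiet ones
    return np.arctan(factor * x) / np.arctan(factor)

def hard_ceiling(x, ceiling=1.0):
    # crude "brickwall" behavior: samples below the ceiling pass through unchanged
    return np.clip(x, -ceiling, ceiling)

x = np.linspace(-2.0, 2.0, 9)
print(arctan_soft_clip(x))  # distorted everywhere
print(hard_ceiling(x))      # |x| < 1 is untouched
```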

romansavrulin commented 4 years ago

@yoyololicon @sergree Wow, guys! You've done a great job! It's awesome news that we have a Python version now! The only thing that kept me from evaluating the original algorithm was MATLAB. Now we can bring this up even as a Docker service!

sergree commented 4 years ago

Moved this to:

https://github.com/wokashi-rg/Matchering/projects/1

yoyolicoris commented 4 years ago

@sergree I have tried implementing a basic peak limiter, but due to its sequential nature, the speed is not that good (half a minute to run on a 4-minute song), probably because of the attack/release envelope follower. To get the peak envelope without the for loop, I have an idea:

  1. Apply a moving maxout filter (aka max-pooling) to the rectified signal. This acts like a hold stage in the attack phase.
  2. Apply two different first-order IIR filters to the rough envelope from (1) to simulate the attack/release envelope follower.

In this way, the runtime should be a lot faster. I'll take some time to examine this concept this weekend.
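
To make the idea concrete, a compact sketch (parameter names and values are just examples, not final):

```python
import numpy as np
from scipy import signal

def rough_peak_envelope(rect_y, sr, attack_ms=5, release_ms=100):
    hold = int(sr * attack_ms * 1e-3)
    # 1) moving maximum ("max-pooling") over a short causal window holds each peak
    padded = np.pad(rect_y, (hold - 1, 0), 'constant')
    held = np.lib.stride_tricks.sliding_window_view(padded, hold).max(1)
    # 2) two different one-pole IIR filters approximate the attack/release follower
    at = np.exp(-2.2 / (attack_ms * 1e-3 * sr))
    rt = np.exp(-2.2 / (release_ms * 1e-3 * sr))
    env_attack = signal.lfilter([1 - at], [1, -at], held)
    env_release = signal.lfilter([1 - rt], [1, -rt], held)
    return np.maximum(env_attack, env_release)
```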

sergree commented 4 years ago

Thank you, @yoyololicon, your idea is great.

I've had some similar thoughts about an offline peak limiter these days:

1) Find the maximum of the left & right channels with `np.maximum()`, so that the limiter works the same on both channels

2) Get numpy.abs() of this vector (rectify)

3) Create another vector whose elements are:

4) Transfer (3) samples from [0..1] to [1..0]

Next we can do something with (4) to make it smoother, to prevent hard clipping, but so that its values DO NOT decrease. For example:

5) ??? Convolve (4) with some sampled attack/release curve ??? Like this: https://pasteboard.co/IKJlchC.png

6) Transfer (5) samples from [0..1] to [1..0] back

7) Element-wise multiply (6) by the original audio track, and we've got the result!

I'M REALLY NOT SURE ABOUT (5)! I think the effect will be too strong, depending on how many overloaded samples are nearby.

I hope my ideas will be useful to you too.

P.S. I will provide some industry-standard attack/release values (in ms or number of samples) for the peak limiter this week. I need to research it.
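
Putting the steps above into a very rough sketch (a stereo array is assumed; step (3) is not worked out yet, so the gain vector below is only a placeholder, and `curve` stands for some sampled attack/release shape):

```python
import numpy as np

def sketch_limiter(y, curve):
    peak = np.maximum(y[:, 0], y[:, 1])                    # (1) max of left & right
    peak = np.abs(peak)                                    # (2) rectify
    gain = np.minimum(1.0, 1.0 / np.maximum(peak, 1e-12))  # (3) placeholder gain vector
    reduction = 1.0 - gain                                 # (4) map [0..1] -> [1..0]
    smoothed = np.convolve(reduction, curve, mode='same')  # (5) attack/release curve
    smoothed = np.maximum(reduction, np.clip(smoothed, 0.0, 1.0))  # values must not decrease
    gain = 1.0 - smoothed                                  # (6) map back
    return y * gain[:, None]                               # (7) apply to the original track
```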

sergree commented 4 years ago

Tested on a pure square signal.

Attack: 1.24-1.36 ms, or 55-60 samples at 44100 Hz: https://pasteboard.co/IKSpOI7.png

Release: 3-6 sec: https://pasteboard.co/IKSqmkZ.png

yoyolicoris commented 4 years ago

I was working on the max filter and IIR filter idea these days, and here's the result, simulated on a 200 Hz sine wave:

  1. Peak envelope: [limiter_env plot]

  2. Output after limiter: [limiter_out plot]

I did some sound tests and it sounds similar to a regular peak limiter.

To recreate the result, here's the code:

import math

import numpy as np
from scipy import signal


def offline_limiter(y, sr, attack=5, release=100, hold_time=8, threshold=-6, ceil=-0.1):
    M = int(sr * attack * 1e-3)
    K = int(sr * hold_time * 1e-3)
    if not M & 1:
        M += 1
    at = math.exp(-2.2 / (attack * 0.001 * sr))
    rt = math.exp(-2.2 / (release * 0.001 * sr))
    thresh = 10 ** (threshold / 20)
    ceil = 10 ** (ceil / 20)
    vol = ceil / thresh

    output_len = y.shape[0]
    rect_y = np.abs(y).max(1)

    # hold maximum value to make sure envelope can reach its maximum in attack stage
    unfold_rect_y = np.pad(rect_y, (2 * M - 1, 0), 'constant', constant_values=0)
    unfold_rect_y = np.lib.stride_tricks.as_strided(unfold_rect_y, (output_len, 2 * M), unfold_rect_y.strides * 2)
    raw_env = unfold_rect_y.max(1)

    # simulate attack curve using forward and backward IIR
    env_a = signal.filtfilt([1 - at], [1, -at], raw_env)

    # hold maximum peak longer to avoid ripple effect
    unfold_env_a = np.pad(env_a, (K - 1, 0), 'constant', constant_values=0)
    unfold_env_a = np.lib.stride_tricks.as_strided(unfold_env_a, (output_len, K), unfold_env_a.strides * 2)
    env_a = unfold_env_a.max(1)

    # add release decay curve to the envelope
    env_r = signal.lfilter([1 - rt], [1., -rt], env_a)
    final_env = np.maximum(env_a, env_r)

    gain = np.minimum(1, thresh / final_env)
    output = y * gain[:, None] * vol

    return output
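
And a minimal way to run it on a file (file names are placeholders; soundfile is assumed, and the input is expected to be stereo):

```python
import soundfile as sf

y, sr = sf.read("mix.wav")                        # y has shape (samples, channels)
limited = offline_limiter(y, sr, threshold=-6, ceil=-0.1)
sf.write("mix_limited.wav", limited, sr, subtype="FLOAT")
```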

sergree commented 4 years ago

Hello, @yoyololicon!

I tested your script, and found some issues.

My test file: a pure square signal at -3 dB, +3 dB, -6 dB, +6 dB, -3 dB. This file can contain positive dB values like +3 or +6 because it's a 32-bit float WAV file.

The result I got after your offline_limiter(): [output screenshot]

1st issue: We lost all information about the dynamics of the signal (no -3 dB or -6 dB values, just ~-0.001). We don't need leveling here, because the leveling is done very accurately in other Matchering stages. Usual mastering peak limiters don't touch these parts (with the default preset). Maybe I need to change threshold / ceil in some way here, but I haven't tried it yet.

2nd issue: We got peaks in the limited output, as you can see. Usual digital realtime limiters use look-ahead to prevent them, as far as I know.

But it already looks impressive, thanks! 🙏

yoyolicoris commented 4 years ago

> 1st issue: We lost all information about the dynamics of the signal (no -3 dB or -6 dB values, just ~-0.001). We don't need leveling here, because the leveling is done very accurately in other Matchering stages. Usual mastering peak limiters don't touch these parts (with the default preset). Maybe I need to change threshold / ceil in some way here, but I haven't tried it yet.

> 2nd issue: We got peaks in the limited output, as you can see. Usual digital realtime limiters use look-ahead to prevent them, as far as I know.

Updated code based on the above issues:

import math

import numpy as np
from scipy import signal


def offline_limiter(y, sr, attack=5, release=100, hold_time=8):
    M = int(sr * attack * 1e-3)
    K = int(sr * hold_time * 1e-3)
    if not M & 1:
        M += 1
    at = math.exp(-2.2 / (attack * 0.001 * sr))
    rt = math.exp(-2.2 / (release * 0.001 * sr))
    thresh = 10 ** (-0.1 / 20)     # fix the threshold to -0.1 dB

    output_len = y.shape[0]
    rect_y = np.abs(y).max(1)

    # hold maximum value to make sure envelope can reach its maximum in attack stage
    # the moving window is shifted forward by one attack time to simulate look-ahead behavior
    unfold_rect_y = np.pad(rect_y, (M - 1, M - 1), 'constant', constant_values=0)
    unfold_rect_y = np.lib.stride_tricks.as_strided(unfold_rect_y, (output_len, 2 * M - 1), unfold_rect_y.strides * 2)
    raw_env = unfold_rect_y.max(1)

    # simulate attack curve using forward and backward IIR
    env_a = signal.filtfilt([1 - at], [1, -at], raw_env)

    # hold maximum peak longer to avoid ripple effect
    unfold_env_a = np.pad(env_a, (K - 1, 0), 'constant', constant_values=0)
    unfold_env_a = np.lib.stride_tricks.as_strided(unfold_env_a, (output_len, K), unfold_env_a.strides * 2)
    env_a = unfold_env_a.max(1)

    # add release decay curve to the envelope
    env_r = signal.lfilter([1 - rt], [1., -rt], env_a)
    final_env = np.maximum(env_a, env_r)

    gain = np.minimum(1, thresh / final_env)
    output = y * gain[:, None]

    return output

This will retain the original volume and dynamics below -0.1 dB, and should have no significant peaks. @sergree, could you help me test it out?

sergree commented 4 years ago

Sure, thanks, @yoyololicon.

The leveling issue is fixed, but I still get peaks. Also, if we compare our limiter with a state-of-the-art peak limiter, ours has too long an attack.

Ours: attack = ~237 samples

State-of-the-art peak limiter (Voxengo Elephant): attack = ~58 samples

sergree commented 4 years ago

@yoyololicon I made some draft adjustments to params.

    attack = 5 / 237 * 58
    release = 100 / 1700 * 132476
    thresh = 0.998138427734375
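
For reference, those numbers work out to roughly (annotations are mine):

```python
attack = 5 / 237 * 58            # ~1.22 ms: scales the 5 ms default so ~237 samples become ~58
release = 100 / 1700 * 132476    # ~7792.7 ms, about 7.8 seconds
thresh = 0.998138427734375       # ~ -0.016 dB relative to full scale
print(attack, release, thresh)
```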

Attack is good now, but 2 problems remain:

1) Peaks
2) Incorrect release envelope


yoyolicoris commented 4 years ago

@sergree

Looks like Voxengo's attack time is around 1 ms (mine is 5 ms). We can change that. I guess Voxengo applies their release curve to the gain function, not to the peak envelope, so the output can decay exponentially. About the peaks problem, the only fix I can come up with is to hard clip the signal before output. Because the current peaks are not that obvious, it should be fine.
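
If it helps, this is how I picture the "release on the gain function" idea (just a sketch; `thresh`, `rt` and `raw_env` follow the naming from my earlier snippets, and this is only a guess at the behavior, not Voxengo's actual algorithm):

```python
import numpy as np
from scipy import signal

def release_on_gain(raw_env, thresh, rt):
    # instantaneous gain needed to keep the peak envelope under the threshold
    raw_gain = np.minimum(1.0, thresh / np.maximum(raw_env, 1e-12))
    # smooth the gain *reduction* with a one-pole release filter,
    # so the gain recovers exponentially after each peak
    reduction = 1.0 - raw_gain
    smoothed = signal.lfilter([1 - rt], [1.0, -rt], reduction)
    return 1.0 - np.maximum(reduction, smoothed)
```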

sergree commented 4 years ago

@yoyololicon

Thank you again. I already edited the attack time and got a good result.

I hope I will have time in the coming months to dive into this offline limiter topic and finish the peaks and release curve parts.

sergree commented 4 years ago

@yoyololicon

I just wanted to let you know that my idea with a novel max-convolution worked. I use some of your limiter ideas too. The resulting limiter will be very accurate, I hope. I need some time to tune the parameters and add features such as true peak limiting and transient detection. After I finish everything in Anaconda, I will create a separate branch here and start refactoring and developing the single-page web application for it.

I just wanted to ask you for your donation link (like buymeacoffee.com); I'll add it to the final version. Thank you very much again for your contribution!

🤝 🙏 🥳

yoyolicoris commented 4 years ago

That's awesome! I'm looking forward to it. Also, thank you for sharing your code and ideas with us.

sergree commented 4 years ago

@yoyololicon Hello!

My idea with maximum convolution failed. The calculation of max convolution took too long.

But I have greatly modified your limiter and added it to the chain. It works like Voxengo Elephant now, but it is a little more aggressive. Perfect for EDM stuff.

https://github.com/sergree/Matchering/blob/master/python/limiter.py

Next: refactoring + web app from me.

sergree commented 4 years ago

@yoyololicon

https://github.com/sergree/matchering/tree/master/python#our-mastering-limiter-quality-test

yoyolicoris commented 4 years ago

@sergree Wow, you really did a great job! The result looks very promising. I would like to try it on my mixes and see how it goes.

sergree commented 4 years ago

@yoyololicon

Thanks! I have another question. Perhaps you know how to speed up the sliding window function?

I mean this:

print('ATTACK: Sliding window started...')
t = time()
# hold maximum value to make sure envelope can reach its maximum in attack stage
unfold_rect_y = np.pad(rect_y, (2 * M - 1, 0), 'constant', constant_values=0)
unfold_rect_y = np.lib.stride_tricks.as_strided(unfold_rect_y, (output_len, 2 * M), unfold_rect_y.strides * 2)
raw_env = unfold_rect_y.max(1)
print('ATTACK: Done in ', time() - t, ' sec.')

It takes 15 seconds to process an 8-minute mix.


UPD: It's because of `raw_env = unfold_rect_y.max(1)`, not `np.lib.stride_tricks.as_strided()`.

UPD 2: Nevermind :) I found the scipy.ndimage.filters.maximum_filter1d solution. It works ~10x faster at this scale. I will update the GitHub code soon.
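
For anyone reading along, the replacement looks roughly like this (my own illustration; the pad-then-slice keeps the same causal 2 * M window as the as_strided version above):

```python
import numpy as np
from scipy.ndimage import maximum_filter1d

def causal_sliding_max(rect_y, M):
    n = len(rect_y)
    padded = np.pad(rect_y, (2 * M - 1, 0), 'constant', constant_values=0)
    # centered max filter on the padded signal, shifted by M via slicing,
    # so each output sample only sees the current and previous 2 * M - 1 samples
    return maximum_filter1d(padded, size=2 * M, mode='constant')[M:M + n]
```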

sergree commented 4 years ago


https://github.com/sergree/matchering/commit/3009e9c93d747dcce79ebee7df044e47866d32ea

sergree commented 4 years ago

Dear @yoyololicon, thank you for resurrecting this project.

I did a complete refactoring, trying to follow the DRY, KISS, and SOLID principles as best I can. I also fixed a few inaccuracies in your version, brought back the lowess algorithm (I checked for a long time, it worked best), added MP3 support via FFmpeg. I also prepared a convenient API and built a package for PyPI. Now it can be installed via pip.

https://pypi.org/project/matchering/ https://github.com/sergree/matchering
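
Basic usage looks something like this (file names are placeholders):

```python
import matchering as mg

# match "my_song.wav" to the sound of "reference.wav" and save a 16-bit master
mg.process(
    target="my_song.wav",
    reference="reference.wav",
    results=[mg.pcm16("my_song_master_16bit.wav")],
)
```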

I also made a handy command-line application for working with the Matchering library that supports writing logs to a file.

https://github.com/sergree/matchering-cli

I'll take a break for a few days, and then start implementing a containerized web application for Matchering.

🥳 🎉 🙏 With greetings from Russia! 🐻🍾🌲

yoyolicoris commented 4 years ago

> Dear @yoyololicon, thank you for resurrecting this project.
>
> I did a complete refactoring, trying to follow the DRY, KISS, and SOLID principles as best I can. I also fixed a few inaccuracies in your version, brought back the lowess algorithm (I checked for a long time, it worked best), added MP3 support via FFmpeg. I also prepared a convenient API and built a package for PyPI. Now it can be installed via pip.
>
> https://pypi.org/project/matchering/ https://github.com/sergree/matchering
>
> I also made a handy command-line application for working with the Matchering library that supports writing logs to a file.
>
> https://github.com/sergree/matchering-cli
>
> I'll take a break for a few days, and then start implementing a containerized web application for Matchering.
>
> 🥳 With greetings from Russia!

@sergree Man, that's A LOT of hard work! I didn't expect my small contribution would end up going this far, haha. I have tried the command-line app and it works smoothly, pretty amazing.

Because the next goal is to make it a web service, which is not something I'm familiar with, I'll leave that to you and keep checking for any improvements or issues I can work on related to audio processing / music information retrieval.

In summary, I want to thank you for your active support and quick response on this thread. It's my pleasure to work with you and participate in the project. Greetings from Taiwan~ :)

sergree commented 4 years ago

@yoyololicon

We did The Thing.

https://www.youtube.com/watch?v=8Su5STDYfcA

🙏

sergree commented 1 year ago

Hey, my friend @yoyololicon, I hope you are doing well!

I would like to share WhatBPM with you. It's my new project, the ideological continuation of Matchering, only from a different angle. It allows you to adopt the best practices of reference tracks even before you start writing music. 😱

Simply put, it is a web service that automatically analyzes EDM trends on a daily basis and outputs recommended values for use in music: BPM, key, root note, track duration, and so on.

I hope you find this useful.

Sincerely from your Russian colleague 💓