mixxxdj / mixxx

Mixxx is Free DJ software that gives you everything you need to perform live mixes.
http://mixxx.org

Epic: STEM mixing #13116

Open acolombier opened 5 months ago

acolombier commented 5 months ago

This is an epic issue to track all the work on STEM mixing.

This issue will also be used to keep track of all ideas for STEM mixing. This may include features from competitors as well as original ones.

Note that this epic only plans to build on top of the open standard created by Native Instruments. The implementation will aim to support extended cases (e.g. MP3 or OGG format for the channels), but guidelines will be taken from the open standard.

Tasklist

This is the list of planned COs

| COs | Description | Status |
|---|---|---|
| `[ChannelX],stem_count` | The number of available stem on the deck (read-only, 2..4) | Implemented in #13086 |
| `[ChannelXStemY],volume` | Adjust the gain of stem #Y (0..1) | Implemented in #13086 |
| `[ChannelXStemY],color` | The colour of stem #Y (read-only, 32-bit RGB) | Implemented in #13086 |
| `[ChannelXStemY],mute` | Mute the stem #Y (0 or 1) | Implemented in #13086 |
| `[QuickEffectRack1_[ChannelXStemY]],loaded_chain_preset` | Load the chain preset to the stem #Y | Implemented in #13123 |
| `[QuickEffectRack1_[ChannelXStemY]],enabled` | Enable the chain preset to the stem #Y | Implemented in #13123 |
| `[QuickEffectRack1_[ChannelXStemY]],super1` | Adjust the 1st super parameter on the chain preset for the stem #Y | Implemented in #13123 |
| `[ChannelXStemY],split` | Split the stem #Y into the sibling deck if empty. | Not implemented |
| `[ChannelXStemY],split_into_Z` | Split the stem #Y into the specified deck #Z if empty. | Not implemented |
| `[ChannelXStemY],vu_meter` | Outputs the current instantaneous stem #Y volume | Not implemented |
| `[ChannelXStemY],vu_meter_l` | Outputs the current instantaneous stem #Y volume for the left channel | Not implemented |
| `[ChannelXStemY],vu_meter_r` | Outputs the current instantaneous stem #Y volume for the right channel | Not implemented |
| `[SamplerX],load_selected_track_stem_Y` | Load the stem #Y from the selected file as stereo track into the sampler #X | Implemented in #13268 |
| `[PreviewDeck1],load_selected_track_stem_Y` | Load the stem #Y from the selected file as stereo track into the preview | Implemented in #13268 |
| `[Library],selected_stem` | 4-bit mask holding selected by the user for load operation | Discussed in #13268 and #13573 |
| `[Library],stem_Y_selected` | Indicate whether or not the stem Y is selected by user for load operation | Discussed in #13268 and #13573 |

(Currently, Y goes from 1 to 4.)

Limitations

Rubberband audio scaling is currently the bottleneck and does not seem to allow more than 2 stem decks at a time. When using linear or SoundTouch scaling, all four decks can be used as stem decks, although CPU usage is significantly higher than with stereo decks.

Usecase log

Split a stem deck

When splitting a stem from a deck, the deck will be cloned to the sibling deck (A and C, or B and D), and the given stem will be muted on the current deck, and played solo on the sibling deck.
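The behaviour described above can be modelled as a small state change. The Python sketch below is only an illustration, not Mixxx code: the `Deck` class, the `SIBLING` table and `split_stem` are hypothetical names, and the "clone" here copies nothing but the track name.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional

# Sibling pairs as described above: A <-> C and B <-> D (decks numbered 1..4).
SIBLING = {1: 3, 3: 1, 2: 4, 4: 2}


@dataclass
class Deck:
    """Hypothetical stand-in for a deck; not a Mixxx class."""
    track: Optional[str] = None
    # muted[i] is True when stem i+1 is muted.
    muted: list = field(default_factory=lambda: [False] * 4)


def split_stem(decks: dict, source: int, stem: int) -> bool:
    """Split `stem` (1..4) from deck `source` into its sibling, if that deck is empty."""
    target = decks[SIBLING[source]]
    if target.track is not None:
        return False  # sibling deck is not empty: leave everything untouched
    src = decks[source]
    target.track = src.track  # "clone" the deck (track only, in this sketch)
    target.muted = [i != stem - 1 for i in range(4)]  # solo the stem on the sibling
    src.muted[stem - 1] = True  # mute the stem on the current deck
    return True


decks = {n: Deck() for n in range(1, 5)}
decks[1].track = "song.stem.mp4"
split_stem(decks, source=1, stem=1)
print(decks[1].muted)  # [True, False, False, False]
print(decks[3].muted)  # [False, True, True, True]
```

A second call targeting the same sibling returns `False` in this model, because the sibling deck is no longer empty.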

Deck copying and cloning

Here is the expected behaviour of the two existing COs when targeting a deck with a stem loaded:

Pre-mixed stereo load

selected_stem can be used as follows:

Each individual bit of the mask is reflected in the corresponding stem_Y_selected CO.
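As a rough illustration of the relationship between the two COs, the Python sketch below decodes a 4-bit `selected_stem` value into per-stem flags and back. It assumes that bit Y-1 (least significant bit first) corresponds to stem Y; the bit order is not stated in this thread, and the helper names are hypothetical.

```python
STEM_COUNT = 4  # Y currently goes from 1 to 4


def stems_selected(mask: int) -> dict:
    """Return {Y: selected} for a 4-bit mask, assuming bit Y-1 maps to stem Y."""
    return {y: bool((mask >> (y - 1)) & 1) for y in range(1, STEM_COUNT + 1)}


def mask_from_stems(selected: dict) -> int:
    """Inverse operation: rebuild the mask from the per-stem flags."""
    return sum(1 << (y - 1) for y, on in selected.items() if on)


flags = stems_selected(0b0101)
print(flags)  # {1: True, 2: False, 3: True, 4: False}
print(mask_from_stems(flags))  # 5
```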


Relates to #7935

Dependency of #11391

JoergAtGithub commented 5 months ago

I'm thinking about the proposed stem_X_split CO. There are so many use cases, and it's difficult, maybe impossible, to cover them with COs. Some considerations about stem split mode and this CO:

  1. I wonder if split mode is only ever needed with a single stem split to the other deck. I could also imagine the use case where a DJ wants to have Vocal and Melody on one deck and Bass and Drum on the other. An approach could be binary encoding of the stem number, something like [Channel1], "SplitStemsToDeck3", 0b0101
  2. The current CloneToDeck CO copies the play state, position, rate, and key. I wonder if the use case for stem deck splitting would also require cloning the mixer channel settings. I guess that if a DJ clones stems while playing a track, they want no audible change at the moment of splitting.
  3. Is there a need for a Merge-Decks CO as well? For the case that the DJ wants to use the deck in normal mode again.
  4. Should a deck with only one stem have stem controls?
acolombier commented 5 months ago
  1. The issue with this is how easy it would be to map, especially on MIDI devices or with the least amount of boilerplate code. I think what we could do is make the stem_X_split additive, so say you trigger [Channel1],stem_1_split, Deck C now plays the first stem, while A continues playing stems 2 to 4. Now you trigger [Channel1],stem_4_split, and, assuming decks A and C are still aligned, C now plays the first and last stems, while A plays the second and third. This should make it pretty easy to map. It's worth noting that stems will only be muted on the other deck, so you may also manually mute/unmute other stems, say once you've stopped the playback on A or C in the previous example.
  2. Good point - I'd say probably yes. Potentially something to make configurable? I can imagine a case where you want to split onto a deck where you've already pre-adjusted the volume or EQ.
  3. Probably not, as merging would mean unmuting the previously split stem.
  4. The controls will be created but inactive. I realise I forgot one CO, which is stem_count. This can be used to programmatically detect whether the deck contains stems (0 means no stem or only a main mix, 2 or more is a stem count). I think it makes more sense to have them created all the time, up to the maximum supported stem count (4), so this prevents cases where COs are nonexistent or not yet active. Like for the EQ, we might want to have an auto reset feature though, so volume and effects get back to default on track load.
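The additive behaviour proposed in point 1 can be sketched as follows. This Python model is purely illustrative: it tracks only the mute states of two aligned decks (A and C), and `split_additive` and `playing` are hypothetical helpers rather than Mixxx functions.

```python
def split_additive(muted_a: list, muted_c: list, stem: int) -> None:
    """Move `stem` (1..4) from deck A to deck C by changing mute states only."""
    muted_a[stem - 1] = True   # the stem is no longer audible on A
    muted_c[stem - 1] = False  # the stem is now audible on C


def playing(muted: list) -> list:
    """Return the stem numbers that are currently audible."""
    return [i + 1 for i, is_muted in enumerate(muted) if not is_muted]


# Deck A plays all four stems; deck C starts as a clone with every stem muted.
deck_a = [False, False, False, False]
deck_c = [True, True, True, True]

split_additive(deck_a, deck_c, 1)  # C plays stem 1, A plays stems 2 to 4
split_additive(deck_a, deck_c, 4)  # C plays stems 1 and 4, A plays 2 and 3

print(playing(deck_a))  # [2, 3]
print(playing(deck_c))  # [1, 4]
```

Because only mute flags change in this model, undoing a split is just unmuting, which matches the reasoning in point 3.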
JoergAtGithub commented 5 months ago

Should these COs only work for main decks, or also for preview decks and samplers? The latter would be difficult to visualize in the UI. But I could imagine the use case where a DJ wants to clone a single stem into a sampler.

Regarding 1. :

acolombier commented 5 months ago

Should these COs only work for main decks, or also for preview decks and samplers?

I don't think so. I don't really see the use case for having stem controls in these components.

I could imagine the use case where a DJ wants to clone a single stem into a sampler

I didn't think of this, but that sounds like a great use case indeed. Currently, loading a stem file into a preview deck or sampler loads the main mix. I'm going to add COs to the plan for loading a specific stem as a stereo track. Question: is there a naming convention for COs? I went with snake case, but I can see that LoadSelectedTrack doesn't follow this convention.

Regarding 1. :

JoergAtGithub commented 5 months ago

Please use snake case for new COs, LoadSelectedTrack is legacy naming.

Is the extra CO [ChannelX],stem_Y_split_into_Z really needed? The value of [ChannelX],stem_Y_split is currently unused, why not define it as follows:

acolombier commented 5 months ago

Not particularly needed, but, seeing some of the latest COs that were added, I was under the impression that for MIDI device mapping, having an atomic CO which can be triggered is easier for users.

JoergAtGithub commented 4 months ago

Besides the new COs, it is also important to consider which of the existing COs will behave differently with stem files, e.g.:

[SamplerN]LoadTrackFromDeck Should it reject the operation or load the master mix?

[ChannelN]CloneFromDeck Should the stem parameter be cloned?

acolombier commented 4 months ago

[SamplerN]LoadTrackFromDeck Should it reject the operation or load the master mix

Currently, it loads the master mix, which I think makes sense.

[ChannelN]CloneFromDeck Should the stem parameter be cloned?

I think it would make sense to clone the stem gain and mute state, but probably not the effects. I'm going to add this to #13086.

I'm going to add this to the Usecase log above.

JoergAtGithub commented 4 months ago
I profiled the CPU consumption of the engine thread with a stem deck playing and Rubberband R3 running. It seems all CPU is burned in the Rubberband code itself and neither in Mixxx nor in the dependencies of Rubberband:

| Function Name | Total CPU [unit, %] | Self CPU [unit, %] | Module |
|---|---|---|---|
| `RubberBand::GuidedPhaseAdvance::advance` | 1158 (20,52 %) | 952 (16,87 %) | rubberband-2 |
| `RubberBand::MovingMedian<double>::dropAndPut` | 681 (12,07 %) | 674 (11,94 %) | rubberband-2 |
| `RubberBand::MovingMedian<double>::push` | 1214 (21,51 %) | 502 (8,90 %) | rubberband-2 |
| `calc_output_multi` | 367 (6,50 %) | 364 (6,45 %) | samplerate |
| `[External Call]vcruntime140.dll!0x00007fffc79310bb` | 173 (3,07 %) | 173 (3,07 %) | vcruntime140 |
| `RubberBand::MovingMedian<double>::get` | 114 (2,02 %) | 114 (2,02 %) | rubberband-2 |
| `RubberBand::BinClassifier::classify` | 1453 (25,75 %) | 102 (1,81 %) | rubberband-2 |
| `RubberBand::v_multiply_and_add<double>` | 93 (1,65 %) | 93 (1,65 %) | rubberband-2 |
| `RubberBand::R3Stretcher::synthesiseChannel` | 877 (15,54 %) | 90 (1,59 %) | rubberband-2 |
| `but8b_0_avx2dp` | 84 (1,49 %) | 84 (1,49 %) | sleefdft |
| `RubberBand::R3Stretcher::analyseChannel` | 2711 (48,04 %) | 83 (1,47 %) | rubberband-2 |
| `but8f_0_avx2dp` | 83 (1,47 %) | 81 (1,44 %) | sleefdft |
| `RubberBand::v_multiply<double>` | 75 (1,33 %) | 75 (1,33 %) | rubberband-2 |
| `RubberBand::HistogramFilter::filter` | 75 (1,33 %) | 74 (1,31 %) | rubberband-2 |
| `[External Call]ucrtbase.dll!0x00007fffe0d81aff` | 74 (1,31 %) | 74 (1,31 %) | ucrtbase |
| `[External Call]ucrtbase.dll!0x00007fffe0d81b14` | 71 (1,26 %) | 71 (1,26 %) | ucrtbase |
| `RubberBand::FFTs::D_SLEEF::forward` | 289 (5,12 %) | 70 (1,24 %) | rubberband-2 |
| `[External Call]ucrtbase.dll!0x00007fffe0d190fe` | 69 (1,22 %) | 69 (1,22 %) | ucrtbase |
| `RubberBand::v_convert<float,double>` | 65 (1,15 %) | 64 (1,13 %) | rubberband-2 |
| `realSub1_avx2dp` | 50 (0,89 %) | 50 (0,89 %) | sleefdft |
| `RubberBand::FFTs::D_SLEEF::inverse` | 318 (5,64 %) | 48 (0,85 %) | rubberband-2 |
| `tbut8b_0_avx2dp` | 45 (0,80 %) | 45 (0,80 %) | sleefdft |
| `[External Call]ucrtbase.dll!0x00007fffe0d1a36d` | 44 (0,78 %) | 44 (0,78 %) | ucrtbase |
| `[External Call]ucrtbase.dll!0x00007fffe0d1a2de` | 37 (0,66 %) | 37 (0,66 %) | ucrtbase |
| `RubberBand::BinSegmenter::segment` | 110 (1,95 %) | 35 (0,62 %) | rubberband-2 |
| `tbut4b_0_avx2dp` | 35 (0,62 %) | 34 (0,60 %) | sleefdft |
| `realSub0_avx2dp` | 34 (0,60 %) | 34 (0,60 %) | sleefdft |
| `tbut4f_0_avx2dp` | 33 (0,58 %) | 33 (0,58 %) | sleefdft |
| `EngineDeck::processStem` | 5399 (95,68 %) | 32 (0,57 %) | mixxx |
| `EngineEffectsDelay::process` | 29 (0,51 %) | 29 (0,51 %) | mixxx |
| `dft8b_0_avx2dp` | 29 (0,51 %) | 29 (0,51 %) | sleefdft |
| `[External Call]ucrtbase.dll!0x00007fffe0d18efa` | 29 (0,51 %) | 29 (0,51 %) | ucrtbase |
| `RubberBand::MovingMedian<double>::getSize` | 26 (0,46 %) | 25 (0,44 %) | rubberband-2 |
| `dft8f_0_avx2dp` | 21 (0,37 %) | 21 (0,37 %) | sleefdft |
| `[External Call]ucrtbase.dll!0x00007fffe0d18fa9` | 21 (0,37 %) | 21 (0,37 %) | ucrtbase |
| `RubberBand::R3Stretcher::convertToPolar` | 365 (6,47 %) | 20 (0,35 %) | rubberband-2 |
| `tbut8f_0_avx2dp` | 20 (0,35 %) | 20 (0,35 %) | sleefdft |

These functions contain large loops with break conditions. Break conditions make the loop length unpredictable, so the compiler can't translate these loops into SIMD instructions.

acolombier commented 4 months ago

Thanks for running these tests! I have put together a PoC on multithreaded rubberband and the results are extremely promising on my end. Could you please have a go at it and tell me if you see a clear improvement?

napaalm commented 4 months ago

Hello, is this implementation independent from the stem file format? For example, in the future it would be useful to use multiple FLAC / WAV files, or multiple channels in a single FLAC file.

JoergAtGithub commented 4 months ago

The scope of this project is limited to STEM files, but other sound sources might be added in the future. You can use converters, like the free NI Stem Creator tool (https://www.stems-music.com/stem-creator-tool/), to convert your files to the STEM format. Or, more importantly, you can use Stemgen (https://stemgen.dev/) to split normal tracks into stems.

acolombier commented 4 months ago

is this implementation independent from the stem file format?

Yes and no. The implementation is made so that we do not restrict STEM to only AAC and ALAC, as the NI spec specifies, but instead support any format for the stems, as long as they are consistent within the same file.

it would be useful to use multiple FLAC / WAV files

That's not that simple, due to synchronisation across tracks. The NI stem format is made so that it guarantees the same bitrate and ensures synchronisation across tracks (remember that stems have no transport control and are expected to provide a synchronised sample rate naturally).

multiple channels in a single FLAC file

That would work. Only issue would be to deal with stem definition.

Note that the implementation aims to be as agnostic as possible in the engine; as long as the SoundSource provides multiple channels, and the MetadataSource provides a stem definition (a label/color pair per stem), this will work fine. The issue is around standardising the stem definition, which might be hard. Providing a tool to build or convert a stem file in the "extended" NI spec that we support (that is, not restricted to AAC and ALAC) might be easier than creating yet another standard for STEM. For example, it could take the form of an option in the context menu of a track with multiple channels, allowing the user to create a stem.m4a version without NI constraints (or stemgen).
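To make that requirement concrete, here is a minimal Python sketch of what a stem definition amounts to: one label/colour pair per stem, checked against the number of channels the source provides. All names are hypothetical and do not mirror the actual SoundSource or MetadataSource interfaces. The sketch also assumes that each stem is stereo and that `channel_count` covers the stem channels only, excluding any pre-mixed main mix.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class StemInfo:
    """One entry of a stem definition: a label and a colour."""
    label: str
    color: int  # RGB value packed as 0xRRGGBB


def validate_definition(channel_count: int, stems: list) -> bool:
    """Check that a stem definition matches the audio layout.

    Assumes stereo stems and a supported range of 2 to 4 stems,
    as listed for stem_count in the CO table of this issue.
    """
    if not 2 <= len(stems) <= 4:
        return False
    return channel_count == 2 * len(stems)


definition = [
    StemInfo("Drums", 0xFF0000),
    StemInfo("Bass", 0x00FF00),
    StemInfo("Melody", 0x0000FF),
    StemInfo("Vocals", 0xFFFF00),
]
print(validate_definition(8, definition))  # True
print(validate_definition(2, definition))  # False
```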

mxmilkiib commented 4 months ago

Just a note that Mixxx supports tracker module playback, which have multiple channels that get rendered to memory as a stereo pair on load, and it would be cool to level/mute/solo/effect these individually, so this is another consideration for a further future soundsource.

acolombier commented 4 months ago

That would be able to make the most of the stem feature developed here, but the same problem remains: a way to provide the stem definition will have to be designed and implemented. From the engine's perspective, that has already been taken into account.

MrPatben8 commented 2 months ago

Hi guys, I just wanted to show some support for this feature. It's one that I find to be incredibly important since it's one of the few features that has me locked into using Traktor. I'd love to transition to using Mixxx but I simply can't live without this feature. I own a Traktor Kontrol S8 and a pair of D2s as well as an S4 Mk2, so if there's anything I can do to help in terms of hardware, just say the word. I've also created a program to split and convert regular mp3s into STEM compatible files using machine learning, which you can find here: https://github.com/MrPatben8/AutoStemReloaded It might come in handy if anyone needs more STEM files for testing.

Good luck!

JoergAtGithub commented 2 months ago

You could try the builds from the PRs mentioned above, like #13123. They are already fully usable, but not yet reviewed and polished. Testing with files generated by another tool will of course be helpful. What we do not support are the compressor and limiter settings inside the STEM files. Another area where help is needed is cross-platform hardware acceleration for stem separation inside Mixxx. You might have a look at: https://github.com/dszakallas/stemtools/tree/main An idea is to use onnxruntime to run the stem separation models with GPU / NPU acceleration on all Mixxx platforms. But this seems to be non-trivial.

acolombier commented 2 months ago

Thanks for the support @MrPatben8! This is hopefully on track to make it into Mixxx 2.6, which goes to beta sometime around Q4. As @JoergAtGithub suggested, your help with testing the PR would be greatly appreciated too. Currently, only the S4 Mk3 has some basic support for STEM (#13126). For anyone looking for stemgen tools or a library, I have also published my own, which aims to support any format and allows fairly large flexibility with stem adjustment. It also uses the latest DEMUCSv4, so I guess it would lead to similar results, but it might be simpler to use for anyone using docker/unix systems, or looking for a library.

MrPatben8 commented 2 months ago

Hey guys, I cloned and compiled a version from https://github.com/mixxxdj/mixxx/pull/13123 but unfortunately loading a stem file into a deck does not seem to activate the on-screen STEM controls. Perhaps I'm missing something?

JoergAtGithub commented 2 months ago

The CMake settings FFMPEG and STEM need to be enabled for the build.

MrPatben8 commented 2 months ago

The CMake settings FFMPEG and STEM need to be enabled for the build.

I found and enabled the FFMPEG setting but couldn't find STEM. I tried adding it manually but that hasn't worked. Compiled builds still do not show STEM controls. I can see that in the CMakeLists.txt there is a section dedicated to STEM file support so I'm fairly confident that I'm on the correct branch.

JoergAtGithub commented 2 months ago

I found and enabled the FFMPEG setting but couldn't find STEM.

https://github.com/mixxxdj/mixxx/blob/a0a8d258a5b74c80c7135d396c1eba3503793108/CMakeLists.txt#L3442

JoergAtGithub commented 1 month ago

For reference, the new Traktor 4 contains some minor changes compared to Traktor 3's stem controls. One is loading the original non-stem track when a track is loaded while the shift button is pressed: https://www.youtube.com/watch?v=WRcLyFmZLP0 The main addition is that you can now call a (slow) stem separator process for a single file from the library menu. This creates a stem file in a cache folder.

axeldelafosse commented 1 month ago

Hey guys! Just found this. This is awesome, thank you so much for your hard work!

Let me know if you need help with beta testing and/or code reviews. Feel free to send me an email :)

Loading the original non-stem track when loading a track while shift button is pressed

Yeah it's pretty useful to sync the original file and the stem file to keep the metadata in sync (grid, cue points, etc...) -- is this something you already thought about? Not a big fan of Traktor's implementation because you cannot link your own stem files easily, only those created by Traktor. But it's great to be able to do the stem separation from the explorer (even if they don't support lossless stems yet). It would also be nice to co-locate the original file and the stem file instead of putting the stem files in another folder with a weird architecture.

JoergAtGithub commented 1 month ago

We might also need [ChannelXStemY]VuMeter to support controller mappings like this: https://www.youtube.com/watch?v=puyfYjgspI4&t=126s

Eve00000 commented 1 month ago

When I watched the video (and played with T yesterday to see the functionality of the F1) I thought about that too. Do you want some F1s as well? :-) They are great toys in combination with Mixxx.

acolombier commented 1 month ago

Added *_vu_meter COs and derivative.

Eve00000 commented 2 weeks ago

Since you were able to use the files that caused import problems in Mixxx for the screenshots in the news post: what was the problem? How did you solve it? (I am glad you could solve it.)

acolombier commented 2 weeks ago

The problem is still here, but this is a debug assert, so you may disable asserts in your build. The issue is related to a race condition between the library display and the analyser, and the assert should protect you against a real crash.

Eve00000 commented 2 weeks ago

Ah, OK. I thought you had found a solution.