acolombier commented 7 months ago

This is an epic issue to track all the work on STEM mixing.

This issue will also be used to keep track of all the possibilities for STEM mixing, which may include features from competitors as well as original ones.

Note that this epic only plans to build on top of the open standard created by Native Instruments. The implementation will aim to support extended cases (e.g. MP3 or OGG formats for the channels), but guidelines will be taken from the open standard.

Tasklist

[x] #13044
[x] #13070
[x] #13106
[x] #13086 (Manual)
[x] #13123
[ ] #13653 (contains the STEM implementation as well)
[x] #13143
[ ] #13268

This is the list of planned COs

COs	Description	Status
`[ChannelX],stem_count`	The number of available stems on the deck (read-only, 2..4)	Implemented in #13086
`[ChannelXStemY],volume`	Adjust the gain of stem #Y (0..1)	Implemented in #13086
`[ChannelXStemY],color`	The colour of stem #Y (read-only, 32-bit RGB)	Implemented in #13086
`[ChannelXStemY],mute`	Mute the stem #Y (0 or 1)	Implemented in #13086
`[QuickEffectRack1_[ChannelXStemY]],loaded_chain_preset`	Load the chain preset to the stem #Y	Implemented in #13123
`[QuickEffectRack1_[ChannelXStemY]],enabled`	Enable the chain preset to the stem #Y	Implemented in #13123
`[QuickEffectRack1_[ChannelXStemY]],super1`	Adjust the 1st super parameter on the chain preset for the stem #Y	Implemented in #13123
`[ChannelXStemY],split`	Split the stem #Y into the sibling deck if empty.	Not implemented
`[ChannelXStemY],split_into_Z`	Split the stem #Y into the specified deck #Z if empty.	Not implemented
`[ChannelXStemY],vu_meter`	Outputs the current instantaneous stem #Y volume	Not implemented
`[ChannelXStemY],vu_meter_l`	Outputs the current instantaneous stem #Y volume for the left channel	Not implemented
`[ChannelXStemY],vu_meter_r`	Outputs the current instantaneous stem #Y volume for the right channel	Not implemented
`[SamplerX],load_selected_track`	Load a selection of stems from the selected file as a stereo track into the sampler #X	Blocked in #13268
`[PreviewDeck1],load_selected_track_stem_Y`	Load a selection of stems from the selected file as a stereo track into the preview deck	Blocked in #13268
`[ChannelX],load_selected_track_stems`	Load a selection of stems from the selected file as a stereo track into the channel #X	Blocked in #13268
`[LibraryStemX],selected`	Indicate whether or not the stem #X is selected by the user for the load operation	Blocked in #13268

(Currently Y goes from 1 to 4)

Limitations

Rubberband audio scaling is currently the bottleneck and does not seem to allow more than 2 stem decks at a time. When using linear or SoundTouch scaling, all four decks can be used as stem decks, although CPU usage increases significantly compared to stereo decks.

Usecase log

Split a stem deck

When splitting a stem from a deck, the deck will be cloned to the sibling deck (A and C, or B and D), and the given stem will be muted on the current deck, and played solo on the sibling deck.

Deck copying and cloning

Here is the expected behaviour of the two existing COs when targeting a deck with a stem file loaded:

[SamplerN]LoadTrackFromDeck: Will load the main mix, that is the pre-mixed track included in the stem file
[ChannelN]CloneFromDeck: Will clone the stem parameters values from the original deck, that is the gain and mute status
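
As a rough illustration of the cloning rule above, here is a minimal Python sketch. It is not Mixxx code: `StemState`, `clone_stem_state` and the `chain_preset` field are hypothetical names used only to show that gain and mute are copied while anything else on the target stem is left alone (later in this thread, effects are described as probably not being cloned).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StemState:
    """Hypothetical per-stem state; not a real Mixxx type."""
    volume: float = 1.0      # mirrors [ChannelXStemY],volume (0..1)
    mute: bool = False       # mirrors [ChannelXStemY],mute
    chain_preset: str = ""   # illustrative stand-in for the stem's quick effect


def clone_stem_state(source: StemState, target: StemState) -> StemState:
    """Copy gain and mute from source, keeping the target's effect setting."""
    return replace(target, volume=source.volume, mute=source.mute)


original = StemState(volume=0.5, mute=True, chain_preset="Filter")
destination = StemState(volume=1.0, mute=False, chain_preset="Echo")
cloned = clone_stem_state(original, destination)
```

In this sketch `cloned` carries the source's volume and mute state but still has the destination's own effect setting.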

Pre-mixed stereo load

Design 1

load_selected_track_stems can be used as follows:

0 means no individual stem is selected. The file will be loaded as a stem deck if on a primary deck, or as the mixed stems on a secondary deck (sampler or preview, same as 0b1111/15).
0b0001 means only the first stem is selected for stereo pre-mixing.
0b1101 means all stems but the second are selected for stereo pre-mixing.
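
The bitmask values listed above can be decoded with a few lines of Python. This is only a sketch of the proposed encoding, assuming bit 0 maps to the first stem (which matches the 0b0001 and 0b1101 examples); `selected_stems` and `stems_to_premix` are hypothetical helper names, not part of Mixxx.

```python
from typing import List, Optional

MAX_STEMS = 4  # the thread notes that Y currently goes from 1 to 4


def selected_stems(mask: int) -> List[int]:
    """Return the 1-based stem numbers whose bit is set (bit 0 is stem 1)."""
    if not 0 <= mask < (1 << MAX_STEMS):
        raise ValueError(f"mask must fit in {MAX_STEMS} bits, got {mask}")
    return [y for y in range(1, MAX_STEMS + 1) if mask & (1 << (y - 1))]


def stems_to_premix(mask: int, primary_deck: bool) -> Optional[List[int]]:
    """Return the stems to pre-mix to stereo, or None to load as a stem deck."""
    if mask == 0:
        # No individual stem selected: a primary deck loads the file as stems,
        # a sampler or preview deck gets the full mix (same as 0b1111).
        return None if primary_deck else selected_stems(0b1111)
    return selected_stems(mask)
```

A controller mapping could build the mask by OR-ing `1 << (y - 1)` for each stem button the user holds, then write the result to the CO.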

Design 2

[LibraryStemX] can be used to pre-select the stems to load, before calling <group>,LoadSelectedTrack. Note that the selection state will be persisted after loading, meaning that multiple calls to LoadSelectedTrack will lead to the same selection. No selection will default to loading the track as a stem deck if supported, or the entire track as stereo if a stem deck is not supported.

Relates to #7935

Dependency of #11391

JoergAtGithub commented 7 months ago

I'm thinking about the proposed stem_X_split CO. There are so many use cases, and it's difficult, maybe impossible, to cover them with COs. Some considerations about stem split mode and this CO:

1. I wonder whether split mode is only ever needed with a single stem split to the other deck. I could also imagine the use case where a DJ wants to have Vocal and Melody on one deck and Bass and Drum on the other. An approach could be a binary encoding of the stem numbers, something like [Channel1], "SplitStemsToDeck3", 0b0101.
2. The current CloneToDeck CO copies the play state, position, rate, and key. I wonder if the use case for stem deck splitting would also require cloning the mixer channel settings. I guess that if a DJ clones stems while playing a track, he wants no audible change at the moment of splitting.
3. Is there a need for a Merge-Decks CO as well? For the case where the DJ wants to use the deck in normal mode again.
4. Should a deck with only one stem have stem controls?

acolombier commented 7 months ago

1. The issue with this is how easy it would be to map, especially on MIDI devices or with the least amount of boilerplate code. I think what we could do is make stem_X_split additive. Say you trigger [Channel1],stem_1_split: deck C now plays the first stem, while A continues playing stems 2 to 4. Now you trigger [Channel1],stem_4_split: assuming decks A and C are still aligned, C now plays the first and last stems, while A plays the second and third. This should make it pretty easy to map. It's worth saying that stems will only be muted on the other deck, so you may also manually mute/unmute other stems, say once you've stopped the playback on A or C in the previous example.
2. Good point - I'd say probably yes. Potentially something to make configurable? I can imagine a case where you want to split onto a deck where you've already pre-adjusted the volume or EQ.
3. Probably not, as merging would just mean unmuting the previously split stem.
4. The controls will be created but inactive. I realise I forgot one CO, which is stem_count. This can be used to programmatically detect whether the deck contains stems (0 means no stems or only a main mix, 2 or more is a stem count). I think it makes more sense to have them created all the time, up to the maximum supported stem count (4), so this prevents cases where COs are nonexistent or not yet active. Like for the EQ, we might want to have an auto-reset feature tho, so volume and effects get back to default on track load.
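
To make the additive split idea concrete, here is a small Python simulation of the example given above (stem 1, then stem 4, split from deck A to deck C). It only models which stems are audible on each deck; `SplitPair` is a hypothetical name and none of this reflects the actual engine implementation.

```python
from dataclasses import dataclass, field
from typing import Set

STEM_COUNT = 4  # maximum supported stem count mentioned in the thread


@dataclass
class SplitPair:
    """Which stems are audible on a deck and on its sibling (illustrative)."""
    source_playing: Set[int] = field(
        default_factory=lambda: set(range(1, STEM_COUNT + 1)))
    sibling_playing: Set[int] = field(default_factory=set)

    def split(self, stem: int) -> None:
        """Mute the stem on the source deck and unmute it on the sibling."""
        if not 1 <= stem <= STEM_COUNT:
            raise ValueError(f"stem must be in 1..{STEM_COUNT}, got {stem}")
        self.source_playing.discard(stem)
        self.sibling_playing.add(stem)


pair = SplitPair()
pair.split(1)  # deck C plays stem 1, deck A keeps stems 2 to 4
pair.split(4)  # deck C plays stems 1 and 4, deck A plays stems 2 and 3
```

Because each trigger only moves one stem, a mapping needs nothing more than one button per stem, which is the ease-of-mapping argument made above.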

JoergAtGithub commented 7 months ago

Should these COs only work for main decks, or also for preview decks and samplers? The latter would be difficult to visualize in the UI. But I could imagine the use case where a DJ wants to clone a single stem into a sampler.

Regarding 1. :

Should it be possible to specify a deck number to clone to by the value of the CO, or should only the sibling deck be supported?
Is stem_X_split really describing what the CO does? Wouldn't stem_X_move_to_deck be more descriptive? Or maybe define it the other way around, stem_X_move_from_deck, to be more consistent with the existing clone and load functions?

acolombier commented 7 months ago

> Should these COs only work for main decks, or also for preview decks and samplers?

I don't think so. I don't really see the use case for having stem controls in these components.

> I could imagine the use case where a DJ wants to clone a single stem into a sampler

I didn't think of this, but that sounds like a great use case indeed. Currently, the expected behaviour when loading a stem file in a preview deck or sampler is to load the main mix. I'm going to add COs to the plan for loading a stereo track that targets a specific stem. Question: is there a naming convention for COs? I went with snake case, but I can see that LoadSelectedTrack isn't following this naming convention.

Regarding 1. :

I guess we could generate specific COs like stem_X_split_to_deck_a as well, which may be used to be more specific and offer greater flexibility, but I think defaulting to the sibling deck will likely be best for easy mapping.
I agree it may not be very specific, but move could also be misleading, as in practice you don't move a specific stem but the whole track. I guess the split version has the advantage of using similar terminology to Rane?

JoergAtGithub commented 7 months ago

Please use snake case for new COs, LoadSelectedTrack is legacy naming.

Is the extra CO [ChannelX],stem_Y_split_into_Z really needed? The value of [ChannelX],stem_Y_split is currently unused, why not define it as follows:

0 Split the stem #Y into the sibling deck if empty.
1...4 Split the stem #Y into the specified deck 1...4 if empty.
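
A minimal Python sketch of the value encoding proposed above. It assumes decks 1 to 4 correspond to A to D, so the sibling pairs from the use case log (A and C, B and D) become 1 and 3, 2 and 4; `split_target` and the `occupied` argument are hypothetical and only illustrate the "if empty" condition.

```python
from typing import Optional, Set

# Sibling pairs from the use case log: A and C, B and D
# (assumed here to be decks 1 and 3, 2 and 4).
SIBLING = {1: 3, 3: 1, 2: 4, 4: 2}


def split_target(source_deck: int, value: int,
                 occupied: Set[int]) -> Optional[int]:
    """Return the deck to split into, or None if the split should not happen."""
    if source_deck not in SIBLING:
        raise ValueError(f"unknown deck {source_deck}")
    if value == 0:
        target = SIBLING[source_deck]   # 0: use the sibling deck
    elif 1 <= value <= 4:
        target = value                  # 1...4: use the specified deck
    else:
        raise ValueError(f"value must be in 0..4, got {value}")
    # The proposal only splits into an empty deck, and never into itself.
    if target == source_deck or target in occupied:
        return None
    return target
```

With this encoding a plain button press (value 0) keeps the simple sibling behaviour, while a script can pass an explicit deck number.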

acolombier commented 7 months ago

Not particularly needed, but I was under the impression that for MIDI device mapping, having atomic COs which can simply be triggered is easier for users, judging by some of the latest COs that were added.

JoergAtGithub commented 7 months ago

Besides the new COs, it is also important to consider which of the existing COs will behave differently with stem files, e.g.:

[SamplerN]LoadTrackFromDeck: Should it reject the operation or load the master mix?
[ChannelN]CloneFromDeck: Should the stem parameters be cloned?

acolombier commented 7 months ago

> [SamplerN]LoadTrackFromDeck Should it reject the operation or load the master mix

Currently, it loads the master mix, which I think makes sense.

> [ChannelN]CloneFromDeck Should the stem parameter be cloned?

I think it would make sense to clone the stem gain and mute state, probably not the effects tho. I'm going to add this to #13086

Going to add this to the Usecase log above.

JoergAtGithub commented 7 months ago

I profiled the CPU consumption of the engine thread with a stem deck playing and Rubberband R3 running. It seems all the CPU time is burned in the Rubberband code itself, and neither in Mixxx nor in the dependencies of Rubberband:

Function Name	Total CPU [unit, %]	Self CPU [unit, %]	Module
`RubberBand::GuidedPhaseAdvance::advance`	1158 (20,52 %)	952 (16,87 %)	rubberband-2
`RubberBand::MovingMedian<double>::dropAndPut`	681 (12,07 %)	674 (11,94 %)	rubberband-2
`RubberBand::MovingMedian<double>::push`	1214 (21,51 %)	502 (8,90 %)	rubberband-2
`calc_output_multi`	367 (6,50 %)	364 (6,45 %)	samplerate
`[External Call]vcruntime140.dll!0x00007fffc79310bb`	173 (3,07 %)	173 (3,07 %)	vcruntime140
`RubberBand::MovingMedian<double>::get`	114 (2,02 %)	114 (2,02 %)	rubberband-2
`RubberBand::BinClassifier::classify`	1453 (25,75 %)	102 (1,81 %)	rubberband-2
`RubberBand::v_multiply_and_add<double>`	93 (1,65 %)	93 (1,65 %)	rubberband-2
`RubberBand::R3Stretcher::synthesiseChannel`	877 (15,54 %)	90 (1,59 %)	rubberband-2
`but8b_0_avx2dp`	84 (1,49 %)	84 (1,49 %)	sleefdft
`RubberBand::R3Stretcher::analyseChannel`	2711 (48,04 %)	83 (1,47 %)	rubberband-2
`but8f_0_avx2dp`	83 (1,47 %)	81 (1,44 %)	sleefdft
`RubberBand::v_multiply<double>`	75 (1,33 %)	75 (1,33 %)	rubberband-2
`RubberBand::HistogramFilter::filter`	75 (1,33 %)	74 (1,31 %)	rubberband-2
`[External Call]ucrtbase.dll!0x00007fffe0d81aff`	74 (1,31 %)	74 (1,31 %)	ucrtbase
`[External Call]ucrtbase.dll!0x00007fffe0d81b14`	71 (1,26 %)	71 (1,26 %)	ucrtbase
`RubberBand::FFTs::D_SLEEF::forward`	289 (5,12 %)	70 (1,24 %)	rubberband-2
`[External Call]ucrtbase.dll!0x00007fffe0d190fe`	69 (1,22 %)	69 (1,22 %)	ucrtbase
`RubberBand::v_convert<float,double>`	65 (1,15 %)	64 (1,13 %)	rubberband-2
`realSub1_avx2dp`	50 (0,89 %)	50 (0,89 %)	sleefdft
`RubberBand::FFTs::D_SLEEF::inverse`	318 (5,64 %)	48 (0,85 %)	rubberband-2
`tbut8b_0_avx2dp`	45 (0,80 %)	45 (0,80 %)	sleefdft
`[External Call]ucrtbase.dll!0x00007fffe0d1a36d`	44 (0,78 %)	44 (0,78 %)	ucrtbase
`[External Call]ucrtbase.dll!0x00007fffe0d1a2de`	37 (0,66 %)	37 (0,66 %)	ucrtbase
`RubberBand::BinSegmenter::segment`	110 (1,95 %)	35 (0,62 %)	rubberband-2
`tbut4b_0_avx2dp`	35 (0,62 %)	34 (0,60 %)	sleefdft
`realSub0_avx2dp`	34 (0,60 %)	34 (0,60 %)	sleefdft
`tbut4f_0_avx2dp`	33 (0,58 %)	33 (0,58 %)	sleefdft
`EngineDeck::processStem`	5399 (95,68 %)	32 (0,57 %)	mixxx
`EngineEffectsDelay::process`	29 (0,51 %)	29 (0,51 %)	mixxx
`dft8b_0_avx2dp`	29 (0,51 %)	29 (0,51 %)	sleefdft
`[External Call]ucrtbase.dll!0x00007fffe0d18efa`	29 (0,51 %)	29 (0,51 %)	ucrtbase
`RubberBand::MovingMedian<double>::getSize`	26 (0,46 %)	25 (0,44 %)	rubberband-2
`dft8f_0_avx2dp`	21 (0,37 %)	21 (0,37 %)	sleefdft
`[External Call]ucrtbase.dll!0x00007fffe0d18fa9`	21 (0,37 %)	21 (0,37 %)	ucrtbase
`RubberBand::R3Stretcher::convertToPolar`	365 (6,47 %)	20 (0,35 %)	rubberband-2
`tbut8f_0_avx2dp`	20 (0,35 %)	20 (0,35 %)	sleefdft

These functions contain large loops with break conditions. Break conditions make the loop length unpredictable, so the compiler can't translate the loops into SIMD instructions.

acolombier commented 7 months ago

Thanks for running these tests! I have put together a PoC on multithreaded Rubberband and the results are extremely promising on my end. Could you please have a go at it and tell me if you see a clear improvement?

napaalm commented 7 months ago

Hello, is this implementation independent from the stem file format? For example, in the future it would be useful to use multiple FLAC / WAV files, or multiple channels in a single FLAC file.

JoergAtGithub commented 7 months ago

The scope of this project is limited to STEM files, but in future other soundsources might be added. You can use converters, like the free NI Stem Creator tool (https://www.stems-music.com/stem-creator-tool/) to convert your files to the STEM format. Or, what is more important, you can use Stemgen ( https://stemgen.dev/ ) to split normal tracks into stems.

acolombier commented 7 months ago

> is this implementation independent from the stem file format?

Yes and no. The implementation is made so that we are not restricting STEM to support only AAC and ALAC, as the NI spec specifies, but instead support any format for the stems, as long as they are consistent within the same file.

> it would be useful to use multiple FLAC / WAV files

That's not that simple, due to synchronisation across tracks. The NI stem format is made so that it guarantees the same bitrate and ensures synchronisation across tracks (remember that stems have no transport control and are expected to provide a synchronised sample rate naturally).

> multiple channels in a single FLAC file

That would work. The only issue would be dealing with the stem definition.

Note that the implementation aims to be as agnostic as possible in the engine: as long as the SoundSource provides multiple channels and the MetadataSource provides a stem definition (a label/color pair per stem), this will work fine. The issue is around standardising the stem definition, which might be hard. Providing a tool to build or convert a stem file in the "extended" NI spec that we support (that is, not restricted to AAC and ALAC) might be easier than creating yet another standard for STEM. For example, it could take the form of an option in the context menu of a track with multiple channels, allowing the user to create a stem.m4a version without the NI constraints (or via stemgen).
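
As a loose illustration of the "multiple channels plus a stem definition" requirement, here is a Python sketch. Everything in it is an assumption made for the example: `StemDefinition` and `matches_channel_layout` are hypothetical names, each stem is assumed to be stereo, the channel count is assumed to cover the stem channels only, and the 2..4 range is taken from the `stem_count` row of the CO table.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class StemDefinition:
    """A label/color pair for one stem (hypothetical type, not Mixxx code)."""
    label: str
    color: int  # RGB packed as 0xRRGGBB


def matches_channel_layout(channel_count: int,
                           stems: Sequence[StemDefinition],
                           channels_per_stem: int = 2) -> bool:
    """Check that the stem definitions account for every audio channel."""
    if not 2 <= len(stems) <= 4:
        return False
    if any(not 0 <= s.color <= 0xFFFFFF for s in stems):
        return False
    return channel_count == len(stems) * channels_per_stem


example_stems = [
    StemDefinition("Drums", 0xFF0000),
    StemDefinition("Bass", 0x00FF00),
    StemDefinition("Melody", 0x0000FF),
    StemDefinition("Vocals", 0xFFFF00),
]
```

Under these assumptions, an 8-channel source with the four example definitions passes the check, while a 6-channel source does not.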

mxmilkiib commented 7 months ago

Just a note that Mixxx supports tracker module playback, which have multiple channels that get rendered to memory as a stereo pair on load, and it would be cool to level/mute/solo/effect these individually, so this is another consideration for a further future soundsource.

acolombier commented 7 months ago

That would be able to make the most of the stem feature developed here, but the same problem remains and a way to provide stem definition will have to be designed/implemented. From the engine perspective, that has already been taken into account

MrPatben8 commented 4 months ago

Hi guys, I just wanted to show some support for this feature. It's one that I find to be incredibly important, since it's one of the few features that has me locked into using Traktor. I'd love to transition to using Mixxx but I simply can't live without this feature. I own a Traktor Kontrol S8 and a pair of D2s as well as an S4 Mk2, so if there's anything I can do to help in terms of hardware, just say the word. I've also created a program to split and convert regular mp3s into STEM compatible files using machine learning, which you can find here: https://github.com/MrPatben8/AutoStemReloaded It might come in handy if anyone needs more STEM files for testing.

Good luck!

JoergAtGithub commented 4 months ago

You could try the builds from the PRs mentioned above, like #13123. They are already fully usable, but not yet reviewed and polished. Testing with files generated by another tool will of course be helpful. What we do not support are the compressor and limiter settings inside the STEM files. Another area where help is needed is cross-platform hardware acceleration for stem separation inside Mixxx. You might have a look at: https://github.com/dszakallas/stemtools/tree/main An idea is to use onnxruntime to run the stem separation models with GPU / NPU acceleration on all Mixxx platforms. But this seems to be non-trivial.

acolombier commented 4 months ago

Thanks for the support @MrPatben8 ! This is hopefully on track to make it into Mixxx 2.6, which goes to beta somewhere around Q4. As @JoergAtGithub suggested, your help with testing the PR would be greatly appreciated too. Currently, only the S4 Mk3 has some basic support for STEM (#13126). For anyone looking for stemgen tools or a library, I have also published my own, which aims to support any format and allows fairly large flexibility with stem adjustment. It also uses the latest DEMUCSv4, so I guess it would lead to a similar result, but it might be simpler to use for anyone using docker/unix systems, or looking for a library.

MrPatben8 commented 4 months ago

Hey guys, I cloned and compiled a version from https://github.com/mixxxdj/mixxx/pull/13123 but unfortunately loading a stem file into a deck does not seem to activate the on screen STEM controls. Perhaps I'm missing something?

JoergAtGithub commented 4 months ago

The CMake settings FFMPEG and STEM need to be enabled for the build.

MrPatben8 commented 4 months ago

> The CMake settings FFMPEG and STEM need to be enabled for the build.

I found and enabled the FFMPEG setting but couldn't find STEM. I tried adding it manually but that hasn't worked. Compiled builds still do not show STEM controls. I can see that in the CMakeLists.txt there is a section dedicated to STEM file support so I'm fairly confident that I'm on the correct branch.

JoergAtGithub commented 4 months ago

> I found and enabled the FFMPEG setting but couldn't find STEM.

https://github.com/mixxxdj/mixxx/blob/a0a8d258a5b74c80c7135d396c1eba3503793108/CMakeLists.txt#L3442

JoergAtGithub commented 4 months ago

For reference, the new Traktor 4 contains some minor changes compared to Traktor 3's stem controls. One is loading the original non-stem track when a track is loaded while the shift button is pressed: https://www.youtube.com/watch?v=WRcLyFmZLP0 The main addition is that you can now call a (slow) stem separator process for a single file from the library menu. This creates a stem file in a cache folder.

axeldelafosse commented 3 months ago

Hey guys! Just found this. This is awesome, thank you so much for your hard work!

Let me know if you need help with beta testing and/or code reviews. Feel free to send me an email :)

> Loading the original non-stem track when loading a track while shift button is pressed

Yeah it's pretty useful to sync the original file and the stem file to keep the metadata in sync (grid, cue points, etc...) -- is this something you already thought about? Not a big fan of Traktor's implementation because you cannot link your own stem files easily, only those created by Traktor. But it's great to be able to do the stem separation from the explorer (even if they don't support lossless stems yet). It would also be nice to co-locate the original file and the stem file instead of putting the stem files in another folder with a weird architecture.

JoergAtGithub commented 3 months ago

We might also need [ChannelXStemY]VuMeter to support controller mappings like this: https://www.youtube.com/watch?v=puyfYjgspI4&t=126s

Eve00000 commented 3 months ago

When I watched the video (and played with T yesterday to see the functionality of the F1) I thought about that too. Do you want some F1s as well? :-) They are great toys in combination with Mixxx.

acolombier commented 3 months ago

Added *_vu_meter COs and derivative.

Eve00000 commented 2 months ago

Since you were able to use the files that caused problems when importing them into Mixxx in the screenshots of the news post... What was the problem? How did you solve it? (I am glad you could solve it)

acolombier commented 2 months ago

The problem is still here, but this is a debug assert, so you may disable asserts in your build. The issue is related to a race condition between the library display and the analyser, and the assert should protect you against a real crash.

Eve00000 commented 2 months ago

ah ok. I thought you found a solution.

mixxxdj / mixxx

Epic: STEM mixing #13116
