Closed Waffled-II closed 4 years ago
AI doesn't "pick things out"; it imagines what it thinks the audio should sound like.
The artifacts you hear are essentially the result of the AI imagining what each stem should sound like based on its training.
The way I think it works is that the neural network is a correlation machine. It correlates all the different frequencies with each other, so when a particular combination of frequencies and amplitudes is present at a specific moment in time, it decides the piano should sound like "this" and the vocals should sound like "that".
I'm still new to ML but that's my impression of it.
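To make the "deciding which frequencies belong to which stem" idea concrete: Spleeter-style models work on spectrograms and apply a soft mask per time-frequency bin, scaling each bin of the mixture by the estimated share of each source. Here's a toy sketch with made-up numbers (not Spleeter's actual code) showing why overlapping sources leave residue:

```python
# Toy illustration of spectrogram soft masking, the kind of scheme
# Spleeter-style separators use. All magnitudes are made-up numbers.

def soft_mask(est_target, est_other):
    """Ratio mask for one source, from the model's magnitude estimates.

    Returns the fraction of the mixture bin assigned to the target.
    """
    total = est_target + est_other
    return est_target / total if total else 0.0

# One time-frequency bin where piano and vocals overlap:
mix_mag = 1.0      # magnitude of the mixture in this bin
est_piano = 0.6    # network's guess of the piano's share
est_vocal = 0.4    # network's guess of the vocals' share

piano_out = mix_mag * soft_mask(est_piano, est_vocal)
vocal_out = mix_mag * soft_mask(est_vocal, est_piano)

# Each output bin is just a scaled copy of the *mixture* bin, so wherever
# sources overlap in frequency, the isolated stem still carries a blend of
# both. That leftover blend is what you hear as muddy/warped artifacts.
print(piano_out, vocal_out)
```

Because the mask can only attenuate a mixture bin, not un-mix it, any bin where piano and vocals overlap will sound slightly "off" in both stems, which matches the artifacts described above.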
Hi @Waffled-II
Separation is never perfect; what you are referring to is called separation artefacts, and it is indeed a byproduct of Spleeter.
When Spleeter separates stuff, some areas can sound muddy, or like something's "missing", even compared to an original instrumental version of the song; something just seems off. Is this due to Spleeter not being able to pick things out and leaving artifacts, or is it just how sound works? Like, certain frequencies get cut out, and no matter how good neural networks get, there will always be a "dirty" sound to them, if that makes sense.
Take the piano stem, for example: sometimes it sounds really warped when it's isolated (and normally the rest of the song would have a lot of other instruments playing as well). Is this something that could eventually sound good, clear, and consistent with improvements to Spleeter, or is it something that will never be fixable?