gudgud96 / syntheon

Parameter inference of music synthesizers to simplify sound design process. Supports Vital.
Apache License 2.0

Wanting to spark a dialogue on this subject #8

Open lanmower opened 4 weeks ago

lanmower commented 4 weeks ago

I'm very interested in this project. Not knowing much about the process, I blindly tried to use GPT to solve this problem, but your project is much nicer and further down the production line...

I ran a kick through it and it sounded a little bassy. I was thinking about starting a discussion on how to improve its output. The transients seem OK and it did pick up something that resembles a sine, so that's a good starting point; the next step would be getting it to pick up the pitch envelope.

Do you have any thoughts on how to make it sensitive to that?
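To make "pitch envelope" concrete, here is a rough sketch, assuming a kick can be modelled as a sine whose frequency decays exponentially. It is not part of Syntheon; `synth_kick` and `pitch_envelope` are hypothetical helper names, and all parameter values (200 Hz start, 50 Hz end, 50 ms decay) are made up for illustration. The estimator simply measures the spacing between upward zero crossings, which is crude but enough to show the downward sweep on a clean signal.

```python
import math


def synth_kick(sr=16000, duration=0.3, f_start=200.0, f_end=50.0, tau=0.05):
    """Synthesize a toy kick: a sine whose pitch falls from f_start to f_end."""
    samples, phase = [], 0.0
    for n in range(int(sr * duration)):
        t = n / sr
        # Exponentially decaying pitch (illustrative values only).
        freq = f_end + (f_start - f_end) * math.exp(-t / tau)
        # Amplitude decay does not move the zero crossings.
        samples.append(math.sin(phase) * math.exp(-t / 0.2))
        phase += 2.0 * math.pi * freq / sr
    return samples


def pitch_envelope(samples, sr):
    """Estimate (time_seconds, frequency_hz) pairs from upward zero crossings."""
    crossings = [
        i for i in range(1, len(samples))
        if samples[i - 1] < 0.0 <= samples[i]
    ]
    # One full period lies between two consecutive upward crossings.
    return [(a / sr, sr / (b - a)) for a, b in zip(crossings, crossings[1:])]


if __name__ == "__main__":
    sr = 16000
    env = pitch_envelope(synth_kick(sr=sr), sr)
    for t, f in env[:3] + env[-3:]:
        print(f"{t:.4f} s  {f:.1f} Hz")
```

On real drum recordings with noise or several partials, zero crossings become unreliable, so this is only a way to visualise what the inferred patch would need to reproduce.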

lanmower commented 4 weeks ago

Ok so I'm running a batch of sounds on it here. I've got 1000 analog drum sounds; could we use these as a ground truth to test and improve the output for this kind of input? I can already see some things that appear to be lost in translation...

image

This is an example: in many sounds there appears to be a phasing issue at the end of the wavetable it generates.

image

In tracks with short envelopes, it appears to have trouble picking up the length of the envelope, making them too short.

lanmower commented 4 weeks ago

When looking at my version it really appears like there's something wrong with it, because I see this in the output:

C:\app\WPy64-310111\python-3.10.11.amd64\lib\site-packages\syntheon\inferencer\vital\models\preprocessor.py:127: FutureWarning: Pass sr=16000 as keyword args. From version 0.10 passing these as positional arguments will result in an error
  x, sr = librosa.load(f, sampling_rate)
C:\app\WPy64-310111\python-3.10.11.amd64\lib\site-packages\librosa\core\convert.py:1332: RuntimeWarning: divide by zero encountered in log10
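The FutureWarning above points at the positional call `librosa.load(f, sampling_rate)`. A minimal sketch of the keyword form the warning asks for is shown below. `load_audio` is a hypothetical wrapper name, not a function in Syntheon; only `librosa.load` and its `sr` keyword are real. The RuntimeWarning about `log10` is a separate message and is not addressed here.

```python
def load_audio(path, sampling_rate=16000):
    """Load an audio file with librosa, resampled to `sampling_rate` Hz."""
    # Imported inside the function so the sketch can be pasted into a module
    # without requiring librosa at import time.
    import librosa

    # Keyword form: sr=... . The positional form
    # librosa.load(path, sampling_rate) is what triggers the FutureWarning.
    x, sr = librosa.load(path, sr=sampling_rate)
    return x, sr
```

Calling `load_audio("kick.wav")` (the file name is a placeholder) should return the samples and the sample rate without the FutureWarning.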

gudgud96 commented 4 weeks ago

@lanmower Thank you for your interest in this project! I am currently not focusing fully on Syntheon, but I'm happy to discuss improvements.

For the kick sounding "bassy", one option is to introduce filter modulation. Recent related research might make this possible.

For the shorter-than-expected envelope detection: we rely on librosa.onset.onset_detect to detect onsets, and cut out a one-shot sample for further analysis. The onset detection can go wrong, which most of the time results in a one-shot that is too short (and hence might also affect the resulting wavetable). One option is to migrate to another onset detection / transcription library (e.g. Essentia or BasicPitch), but each comes with its own inaccuracies, and some might not work well on drums.
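A minimal sketch of the detect-then-cut step described above, to show how a spurious onset can shorten the one-shot. This is not Syntheon's actual code. `cut_one_shot` and `detect_and_cut` are hypothetical names, and the rule "cut from one onset to the next, or to the end of the clip" is an assumption made for illustration. `librosa.load` and `librosa.onset.onset_detect` are real librosa calls.

```python
def cut_one_shot(y, onset_samples, index=0):
    """Return samples from onset `index` up to the next onset (or clip end).

    Assumed cutting rule, for illustration only.
    """
    if len(onset_samples) == 0:
        return y  # no onset found: keep the whole clip
    start = int(onset_samples[index])
    if index + 1 < len(onset_samples):
        end = int(onset_samples[index + 1])
    else:
        end = len(y)
    return y[start:end]


def detect_and_cut(path, sampling_rate=16000):
    """Detect onsets with librosa and cut out the first one-shot."""
    # Imported here so cut_one_shot stays usable without librosa installed.
    import librosa

    y, sr = librosa.load(path, sr=sampling_rate)
    # units="samples" returns onset positions as sample indices;
    # backtrack=True moves each onset back to the preceding energy minimum.
    onsets = librosa.onset.onset_detect(
        y=y, sr=sr, units="samples", backtrack=True
    )
    return cut_one_shot(y, onsets), sr


if __name__ == "__main__":
    clip = list(range(100))  # stand-in for 100 audio samples
    print(len(cut_one_shot(clip, [10])))      # one onset: runs to clip end
    print(len(cut_one_shot(clip, [10, 25])))  # extra onset truncates it
```

With a single onset at sample 10 the one-shot keeps the full tail; a false second onset at sample 25 cuts it to 15 samples, which is the "too short" behaviour reported for short envelopes.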

Happy to have a look at the analog drum sounds too.

lanmower commented 3 weeks ago

I think I get it. After your explanation the output sounds make a lot more sense. A lot of the sounds come out very rich in the treble range; for instance, my modified sine input produces a very sharp output.

image

image

bass.zip