ZFTurbo opened this issue 9 months ago
Model Type: bs_roformer
Description: My first five days of model training, trained on three servers, each with 2× T4 GPUs
Instruments: vocals, drums, bass
Dataset: musdb18hq
How to run [example for vocals]: download the config and checkpoint, save them to the folder with the @ZFTurbo training code, and run this command:
python inference.py --model_type bs_roformer --config_path config_musdb18_bs_roformer_vocals.yaml --start_check_point model_vocals_bs_roformer_ep_5_sdr_8.0972.ckpt --input_folder input/ --store_dir separation_results/
Input files go in the input folder; results appear in the separation_results folder.
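The single-stem command above generalizes to all three stems. A minimal sketch that builds the command line for each stem; the drums and bass filenames below are hypothetical placeholders (only the vocals filenames appear in this post), so substitute the names of the files you actually download from the links further down:

```python
import shlex

def build_cmd(stem: str, cfg: str, ckpt: str) -> str:
    """Assemble the inference.py invocation for one stem."""
    args = [
        "python", "inference.py",
        "--model_type", "bs_roformer",
        "--config_path", cfg,
        "--start_check_point", ckpt,
        "--input_folder", "input/",
        "--store_dir", f"separation_results/{stem}/",
    ]
    return " ".join(shlex.quote(a) for a in args)

files = {
    # vocals filenames are the ones given in this post:
    "vocals": ("config_musdb18_bs_roformer_vocals.yaml",
               "model_vocals_bs_roformer_ep_5_sdr_8.0972.ckpt"),
    # PLACEHOLDER names — rename to the files you downloaded:
    "drums": ("config_musdb18_bs_roformer_drums.yaml", "model_drums.ckpt"),
    "bass": ("config_musdb18_bs_roformer_bass.yaml", "model_bass.ckpt"),
}
for stem, (cfg, ckpt) in files.items():
    print(build_cmd(stem, cfg, ckpt))
```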
DEMO: Enjoy this link (old): https://disk.yandex.ru/d/zc5Bca9nTuB7jg
Vocals:
SDR: 8.09 (up from 7.55)
Config link: https://disk.yandex.com/d/eTOZ9BGpTIRNYw
Checkpoint link: https://disk.yandex.com/d/wPdwPZTQMJfAZQ
Drums:
SDR: 7.22 (up from 7.15)
Config link: https://disk.yandex.com/d/ab8glguWFltifA
Checkpoint link: https://disk.yandex.com/d/auEl3aovvWhYMw
Bass:
SDR: 5.78 (up from 5.28)
Config link: https://disk.yandex.com/d/mgqAPCahZQwEgQ
Checkpoint link: https://disk.yandex.com/d/BISwCadSNyYb-g
Last update: 21.07.24
Model Type: mel_band_roformer
Description: My first attempt at training; trained for 5 days on an RTX 4070
Instruments: percussion
Dataset: musdb18hq, moisesdb
SDR values are measured on the musdb18hq test set.
SDR progression across checkpoints: 6.86 → 7.10 → 7.44 → 7.68
Checkpoints: https://disk.yandex.ru/d/MxZ4k-kZ2Q5QqA
Updated on: 14.06.2024 18:20:00 UTC+3
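SDR values come up throughout this thread. For readers unfamiliar with the metric, here is a minimal sketch of the plain signal-to-distortion ratio, SDR = 10·log10(‖s‖² / ‖s − ŝ‖²); note that reported musdb18 scores are usually computed with the museval/BSS Eval toolchain, which is more involved than this simplified version:

```python
import numpy as np

def sdr(reference: np.ndarray, estimate: np.ndarray, eps: float = 1e-8) -> float:
    """Plain SDR in dB: energy of the target over energy of the error."""
    num = np.sum(reference ** 2)
    den = np.sum((reference - estimate) ** 2)
    return float(10 * np.log10((num + eps) / (den + eps)))

rng = np.random.default_rng(0)
s = rng.standard_normal(44100)          # 1 second of "signal" at 44.1 kHz
print(sdr(s, s))                        # near-perfect estimate: very high SDR
print(sdr(s, s + 0.1 * rng.standard_normal(44100)))  # noisy estimate: ≈ 20 dB
```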
Model Type: mel_roformer
Description: My first attempt at training
Instruments: timpani
Dataset: mvsep
Model Type: mel_band_roformer
Description: My first AI training
Instruments: percussion
Dataset: mvsep
@alexclarke236 Would you like to share the checkpoints you've trained? The best way is to host them on a file-sharing site and post the link here, as previous users have done.
Yes
Architecture: MDX23C
Description: My first somewhat successful attempt at training. Hardware used was my personal RTX 3060 12 GB, 64 GB DDR4 RAM, Ryzen 5 5600X, Windows 11. Training stopped due to the inconvenience of training on my personal machine and the slow speed at which it was progressing. Had I had the funds, I would have rented a GPU from vast.ai. Trained for a total of 208 epochs, or roughly ~2,500 minutes.
Instruments: Strings (Cello, Double Bass, Violin, Viola), Brass (Trumpet, English Horn, Tuba, Trombone), Wind (Piccolo, Flute, Clarinet, Saxophone), Mellotron Flute & Cello. Other instruments with a similar quality or sound may be present in the dataset but unaccounted for.
Dataset (if known): Custom 97-pair dataset using tracks from isolated-tracks.com, songstems.net, MoisesDB, ARME-Virtuoso-Strings-2.2, traditional-flute-dataset, a bunch of Toby Fox FLP fan recreations, and a Dolby Atmos rip of the center track of Eleanor Rigby layered over a song from MoisesDB.
Metrics (if known): SDR 4.4174 on my very small validation set. Performance of the model depends heavily on the input.
Config link: https://drive.google.com/file/d/1OTuF3534Ax5SJSsk08e2QLgoxiljqelH/view?usp=sharing
Checkpoint link: https://drive.google.com/file/d/1juOW6Q_Puqp_uxMsQpWSWkAm1QSbXIdg/view?usp=sharing
Same model as above, but trained for a further 54 epochs. It sounds better to the ear than the older model in quite a few cases, but scores a lower SDR on the validation set. In my testing it picks up wind instruments better than the older model, and it may pick up string sections better too.
Instruments: Same as above
Dataset (if known): Same as above
Metrics (if known): SDR 4.0870 on my very small validation set. Performance of the model depends heavily on the input.
Config link: https://drive.google.com/file/d/1OTuF3534Ax5SJSsk08e2QLgoxiljqelH/view?usp=sharing
Checkpoint link: https://drive.google.com/file/d/1gB6RPUw_knozcY3qF--cpTczoxpkDw5O/view?usp=sharing
Edit: 3/08/2024: currently retraining this model with a larger dataset on the same machine, so it will take a while. Results will be posted here if they turn out anywhere near decent.
Description: MDX23C drum-element separation model (to apply to drums-only audio). n_fft = 2048 was used instead of the default 8192 to reduce resource requirements. Baseline training (141 epochs) was done by @aufr33; it was not fully finished, so it can still be improved.
Instruments: kick, snare, toms, hh, ride, crash
Dataset: created by myself for that task, but had some issues.
Metrics:
Instr SDR kick: 18.4312
Instr SDR snare: 13.6083
Instr SDR toms: 13.2693
Instr SDR hh: 6.6887
Instr SDR ride: 5.3227
Instr SDR crash: 7.5152
SDR Avg: 10.8059
Config & checkpoint : https://github.com/jarredou/models/releases/tag/aufr33-jarredou_MDX23C_DrumSep_model_v0.1
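A quick sanity check (my own, not from the model author): the reported SDR Avg is just the arithmetic mean of the six per-instrument SDRs:

```python
# Per-instrument SDRs copied from the drumsep model post above.
per_instr = {
    "kick": 18.4312, "snare": 13.6083, "toms": 13.2693,
    "hh": 6.6887, "ride": 5.3227, "crash": 7.5152,
}
avg = sum(per_instr.values()) / len(per_instr)
print(round(avg, 4))  # 10.8059, matching the reported SDR Avg
```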
@anvuew, thank you for the great model. I have a question about your training: did you apply reverb to full tracks or to the vocal part only? Can you also share your validation set? I'd like to compare your model with older ones.
The noreverb target is vocals only (reverb was applied to the vocal part). My validation set is too inadequate to share. For comparison, MDX Reverb-HQ scores an SDR of 6.5 on my validation set.
For those who have issues running Roformers from this thread in UVR, you must delete the following line from the YAML file:
linear_transformer_depth: 0
bs_roformer dereverb model
chunk_size: 352768
dim: 256
depth: 8
SDR noreverb: 8.0770 (small validation set)
Config link: config Checkpoint link: ckpt
Although this is a dereverb model, it will also remove harmonies or vocal effects that are not in the center channel.
If you want to add this model to UVR5, first place the config file and weights in the corresponding directories (weights in Ultimate Vocal Remover\models\MDX_Net_Models, config file in Ultimate Vocal Remover\models\MDX_Net_Models\model_data\mdx_c_configs). Delete linear_transformer_depth: 0 from the config file and change stft_hop_length: 512 to stft_hop_length: 441. Then open UVR5; you will be prompted to add the model (select the MDX architecture if not). Choose the corresponding config file and check the Roformer model checkbox.
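The two config edits described in this thread (deleting the linear_transformer_depth key and changing the hop length) can be scripted. A minimal sketch that edits the YAML as plain text so no YAML library is needed; the path is a placeholder for your downloaded config file:

```python
def patch_uvr_config(path: str) -> None:
    """Apply the UVR compatibility edits suggested in this thread."""
    with open(path) as f:
        lines = f.readlines()
    out = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("linear_transformer_depth"):
            continue  # UVR rejects this key; drop the line entirely
        if stripped.startswith("stft_hop_length"):
            line = line.replace("512", "441")  # hop length UVR expects
        out.append(line)
    with open(path, "w") as f:
        f.writelines(out)
```

Run it once on the downloaded config (e.g. `patch_uvr_config("config.yaml")`) before adding the model in UVR5.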
Congratulations on these fine dereverb models! I just tried using the bs_roformer dereverb model inside UVR and got this error: RuntimeError: The size of tensor a (352768) must match the size of tensor b (352800) at non-singleton dimension 1. Weirdly, the mel_band_roformer one works fine. But since the bs model has a slightly better SDR, I wanted to see if it was worth switching to that.
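The two sizes in that error line up with the two hop lengths mentioned earlier in the thread. A small arithmetic check of one plausible reading (my own assumption, not verified against UVR's source): the config's chunk_size is a multiple of the 512 hop, while the size UVR produced is the nearest multiple of the 441 hop it expects, hence the 441-vs-512 config edit suggested above:

```python
# HYPOTHESIS ONLY: relate the mismatched tensor sizes to the two hop lengths.
config_chunk = 689 * 512   # = 352768, tensor a in the error (hop 512)
uvr_chunk = 800 * 441      # = 352800, tensor b in the error (hop 441)
print(config_chunk, uvr_chunk, uvr_chunk - config_chunk)  # sizes differ by 32 samples
```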
I need a separation model whose vocals SDR after separation is higher than 12.97, please.
BS Roformer (viperx) is 12.9755? That one is old. The latest is the BS Roformer (finetuned) shown in the image, but where can it be downloaded? Can anyone tell me?
If you mean the 2024.03 model, it's the same one. The 12.97 metric comes from a private validation dataset, which wasn't the multisong dataset. All newer Roformers with better metrics are currently not public, so they cannot be downloaded.
To post your model, please fill out the form: