Open tomhaydn opened 6 months ago
I've only tested unconditional training, because I didn't need conditioning on text. I just wanted to condition on the artist's ID (like it was done in OpenAI's Jukebox), but I couldn't figure that out, so I went with unconditional training.
One thing that caught my eye immediately is that you've specified training at 48 kHz in your dset config file:
datasource:
max_sample_rate: 48000
max_channels: 2
The problem is that, as far as I know, FAIR didn't publish Encodec weights for 48 kHz, or if they did, you'll need to tweak more things to make it work. So you might want to just set it to 32000. Your dataset is mostly 44100 anyway. Also, the quantization -> dequantization artifacts that Encodec introduces are far more noticeable than losing some of the upper-end frequencies, given age-related hearing loss, bad headphones/speakers/DACs and so on. I've covered this in the other issue here about Encodec being 32 kHz instead of 44.1 kHz, so you can find more info there.
You'll also have an easier time getting training started if you train in mono, because stereo requires additional tinkering to set up, so just set max_channels: 1.
and finally, I have my dataset and data:
dataset/test_1/train/data.jsonl dataset/test_1/test/data.jsonl
Huh, interesting, because I don't have these data.jsonl files in my dataset folder. Also, because I train unconditionally, I don't have the JSON files with the info on BPM, description and so on, but that's unrelated. I mean, I just have a folder full of mp3s. There are two things that might have happened:
1) Maybe you've mistyped it and it's actually located in egs/test_1/train/data.jsonl?
2) Or you've placed your data.jsonl file in the wrong place. It shouldn't be in dataset/test_1/train/data.jsonl; move it to egs/test_1/train/data.jsonl, and do the same for valid and the others too.
If it's the second case, then it doesn't know where to look for the audio files. Try everything I've suggested and it'll probably run without any issues. If it doesn't, I'll try to help you further with that problem.
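To check quickly which of the two cases applies, here is a minimal sketch that reports which split folders are missing a manifest. It assumes the egs/test_1/<split>/data.jsonl layout discussed above; the function name `missing_manifests` and the default split names are hypothetical, so adjust them to whatever your dset config actually references.

```python
from pathlib import Path


def missing_manifests(root, splits=("train", "valid")):
    """Return the expected manifest paths that do not exist under root.

    Assumption: each split lives in its own subfolder of root and the
    manifest file is named data.jsonl, as in the paths from this thread.
    """
    root = Path(root)
    expected = [root / split / "data.jsonl" for split in splits]
    return [path for path in expected if not path.is_file()]
```

Calling `missing_manifests("egs/test_1")` from the audiocraft folder returns an empty list when every listed split has its manifest in place.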
My issue was that I wrote a script separately to generate the manifest file (and subsequent train test split), without realising that there was already a built-in, duh. The problem was that some sample rates were incorrect.
!python -m audiocraft.data.audio_dataset audiocraft/dataset/test_1 egs/test_1_new/data.jsonl
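Since the root cause here was wrong sample rates in a hand-written manifest, a rough sanity check is to count the sample rates a manifest reports. This is only a sketch: it assumes the manifest is an uncompressed JSON Lines file whose entries carry a `sample_rate` field, and `sample_rate_counts` is a hypothetical helper, not part of audiocraft.

```python
import json
from collections import Counter
from pathlib import Path


def sample_rate_counts(manifest_path):
    """Count how many manifest entries report each sample rate."""
    counts = Counter()
    with Path(manifest_path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            entry = json.loads(line)
            # Assumption: every entry has a "sample_rate" key;
            # entries without one are counted under None.
            counts[entry.get("sample_rate")] += 1
    return counts
```

Running it on `egs/test_1_new/data.jsonl` and comparing the result against what the audio files really contain should show whether a manifest is off.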
Maybe you've mistyped it and it actually is located in egs/test_1/train/data.jsonl ?
This was correct, my mistake.
My issue was that I wrote a script separately to generate the manifest file (and subsequent train test split), without realising that there was already a built-in, duh. The problem was that some sample rates were incorrect.
This! I did exactly the same thing. My DIY script worked, but I only realized it wasn't needed after I had already started training and began re-reading the docs from the ground up, just in case I'd missed something. That's how I discovered there was a built-in solution.
Hi, I'm trying to train a new model from scratch via musicgen on a new dataset. I'm finding that the docs are quite difficult to follow.
Please see my folder structure and approach:
The command to initiate training
!cd "audiocraft" && dora run -d solver=musicgen/musicgen_base_test_1 dset=audio/test_1
I have configured my custom solver
config/solver/musicgen/musicgen_base_test_1.yaml
I have my dset config
dset/test_1.yaml
and finally, I have my dataset and data:
dataset/test_1/train/data.jsonl
dataset/test_1/test/data.jsonl
both of these look like this:
each audio file has a 'manifest' file in the form:
{"key": "A#", "artist": "Alec K. Redfearn & the Eyesores", "sample_rate": 44100, "file_extension": "mp3", "description": "Folk", "keywords": ["Folk"], "duration": 182, "bpm": 103, "genre": "Folk", "title": "Ohio", "name": "Ohio", "instrument": "mix", "moods": ["Folk"]}
I can adjust this as needed, but I want to get training working before I mess with parameters.
Everything runs fine then hits an error:
Thanks in advance for any help with this particular issue. I'd also appreciate any general tips on anything else I might be doing wrong. I really want to get a working model that isn't restricted by the license.