Closed 38github closed 3 years ago
Your split_data shouldn't be set to zero; 1 is the lowest value and the default. But I don't think that's the issue. For each epoch, I only see 211 batches? At a batch size of 512 that's only a few seconds of audio. Either your input wavs are too short or they're not getting read in properly. Maybe your DAW is putting some kind of split or metadata in the files? That's also why the loss is so low: it's really easy for the network to learn on a few seconds of data.
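For reference, the arithmetic behind "only a few seconds" can be sketched roughly like this. It assumes each training example in a batch corresponds to one audio sample and that the wav is recorded at 44.1 kHz; neither is confirmed in this thread, and the function name is made up for illustration.

```python
def approx_audio_seconds(batches_per_epoch, batch_size, sample_rate=44100):
    """Estimate how much audio one epoch covers.

    Assumption: one training example == one audio sample, so an epoch
    covers batches_per_epoch * batch_size samples in total.
    Assumption: sample_rate defaults to 44.1 kHz.
    """
    total_samples = batches_per_epoch * batch_size
    return total_samples / sample_rate


# Numbers reported above: 211 batches at a batch size of 512
seconds = approx_audio_seconds(211, 512)
print(f"{seconds:.2f} s of audio per epoch")
```

Under those assumptions the result is about two and a half seconds, which is far below what the model is normally trained on.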
This example was only a file of maybe two seconds, to see if it could emulate just those two seconds. But if the loss is so low, shouldn't it easily be able to predict the same input file?
I've actually never tried training on such a short sample, but based on your results it seems like the answer is no. It might be possible to create a model specifically for training on a few seconds of audio, but this model is made to learn on about 3 minutes of audio.
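To put the roughly 3 minutes in perspective, here is a rough sketch of how many batches per epoch a file of a given length would produce. It assumes (not confirmed here) that each training example is one audio sample, the sample rate is 44.1 kHz, and nothing is held out for validation. The helper name is hypothetical.

```python
import math


def expected_batches(duration_s, batch_size=512, sample_rate=44100):
    """Estimate batches per epoch for an audio file of duration_s seconds.

    Assumption: one training example per audio sample, no validation split.
    """
    total_samples = int(duration_s * sample_rate)
    # The last batch may be partially filled, so round up
    return math.ceil(total_samples / batch_size)


print(expected_batches(180))  # about 3 minutes of audio
print(expected_batches(2))    # about 2 seconds of audio
```

If the assumptions hold, a 3 minute file yields batch counts in the thousands per epoch, so a count in the low hundreds is a quick hint that the input is only a few seconds long.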
It's an interesting result though, and something that I'm realizing the more I learn about A.I.: it isn't a magic button that can tackle anything you throw at it. There is a lot of work that goes into developing a model, testing, optimizing, etc. And even an optimized model is really only meant to solve a specific problem within a specific set of boundaries. It would be interesting to do more testing on where that lower limit is for sample length.
The model you trained on a compressor with PedalNetRT did have the compression sound, which makes me sceptical about this method. Or maybe it is not configured "properly"?
It also does not handle low frequencies well. They are either absent or distorted.
I have noticed that as well, at least on the samples where I get a higher loss. The bass frequencies seem to be the first to go as the loss goes up.
Going to go ahead and close out this issue since the root problem was figured out. Thanks for the feedback!
I said that it won't model compression, so here is a sample. It is a very short sample, but I did that so the model could train heavily on just that sound and hopefully get very close to it. There is no compression sound in the predicted output.
I used training_mode=0 for this one, but the loss still only reached 0.0029. I have also tried with 3 and 4. Please let me know what I am doing wrong. Regards.
rnla7230_7.zip