Model architecture documentation

tsurumeso / vocal-remover

Vocal Remover using Deep Neural Networks

MIT License

Hi, thank you for this amazing code. I just wanted to ask whether you could add a short description of the model architecture you used and the reasoning behind it. As a beginner, I'm finding it difficult to follow the code. I know it's a lot to ask, but please do provide some documentation if that's feasible for you guys.

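I'm pretty familiar with this code at this point (though I'm not an expert). From what I've gathered, it uses an encoder-decoder architecture, and the model structure is a simple U-Net: the encoder downsamples the input step by step, the decoder upsamples it back, and skip connections pass each encoder stage's features to the matching decoder stage.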
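To make the U-Net idea a bit more concrete, here is a rough shape walkthrough in plain Python. It is only a sketch of a generic U-Net, not the actual layers in this repo: the function name `unet_shapes`, the `base` channel count, the `depth`, and the 2-channel 256x256 input are all placeholder assumptions I picked for illustration.

```python
# Illustrative shape trace of a generic U-Net.
# All sizes here are assumptions, NOT the values used in vocal-remover.

def unet_shapes(channels, height, width, base=16, depth=3):
    """Return (stage_name, (C, H, W)) tuples for a generic U-Net."""
    trace = [("input", (channels, height, width))]
    skips = []
    c, h, w = channels, height, width

    # Encoder: each stage widens the channels, then halves the resolution.
    for i in range(depth):
        c = base * 2 ** i
        skips.append((c, h, w))  # saved for the matching decoder stage
        trace.append((f"enc{i + 1}", (c, h, w)))
        h, w = h // 2, w // 2  # stride-2 downsampling

    # Bottleneck: the most compressed representation.
    c = base * 2 ** depth
    trace.append(("bottleneck", (c, h, w)))

    # Decoder: upsample, concatenate the skip, then reduce the channels.
    for i in reversed(range(depth)):
        sc, sh, sw = skips[i]
        h, w = sh, sw  # upsample back to the skip's resolution
        trace.append((f"dec{i + 1}_concat", (c + sc, h, w)))
        c = sc  # a convolution shrinks the channels again
        trace.append((f"dec{i + 1}", (c, h, w)))

    # Output: same shape as the input.
    trace.append(("output", (channels, height, width)))
    return trace


for name, shape in unet_shapes(channels=2, height=256, width=256):
    print(f"{name:12s} {shape}")
```

The thing to notice is the `_concat` rows: the channel count jumps there because the decoder stacks its upsampled features on top of the encoder features saved at the same resolution. That is the skip connection, and it is what separates a U-Net from a plain encoder-decoder.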

I hope this helps!

Model architecture documentation #84