anthonio9 / penn

Pitch Estimating Neural Networks (PENN)
MIT License

Guitar to Multi Hot Piano #6

Open anthonio9 opened 10 months ago

anthonio9 commented 10 months ago

Instead of using 6 × 1440 bins on the output, use one vector that represents all strings together. With that in mind, use the below as well:

Work work work!
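
A minimal sketch of what the single multi-hot target could look like. The 1440-bin count comes from the 6 × 1440 figure above; the `multi_hot_target` helper and the voicing mask are assumptions for illustration, not the repository's actual code.

```python
import torch

PITCH_BINS = 1440  # assumed per-string bin count, from the 6 * 1440 figure above
STRINGS = 6

def multi_hot_target(bin_indices: torch.Tensor, voiced: torch.Tensor) -> torch.Tensor:
    """Collapse per-string bin indices (shape: [STRINGS]) into one
    multi-hot vector of shape [PITCH_BINS]; unvoiced strings are skipped."""
    target = torch.zeros(PITCH_BINS)
    target[bin_indices[voiced]] = 1.0
    return target

# Example: strings 0 and 3 are voiced at bins 220 and 735, the rest are silent
bins = torch.tensor([220, 0, 0, 735, 0, 0])
voiced = torch.tensor([True, False, False, True, False, False])
print(multi_hot_target(bins, voiced).nonzero().squeeze())  # tensor([220, 735])
```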

anthonio9 commented 10 months ago

There are multiple ways to achieve this:

Let's try both.

But then, how should this be evaluated? The training part is easy, because the network will progress with the loss function; real evaluation, however, is a bit more involved. At first, all evaluation should be ditched: only the training loss will matter, along with plots of the logits and the ground truth.

anthonio9 commented 10 months ago

The log-sum-exp trick may be helpful for enhancing the loss function with sigmoid. But first, how should sigmoid be applied together with binary_cross_entropy so that cuda.amp.GradScaler works fine?

EDIT: it seems that using binary_cross_entropy_with_logits is good enough to replace the separate sigmoid and BCE.
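
A minimal sketch (not the repo's training loop) of why `binary_cross_entropy_with_logits` is the AMP-safe choice: it fuses the sigmoid into the loss via the log-sum-exp formulation, so autocast allows it, whereas a separate `sigmoid` + `binary_cross_entropy` is rejected under autocast for numerical-stability reasons. Shapes and the 1440-bin count are assumptions.

```python
import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
logits = torch.randn(8, 1440, device=device, requires_grad=True)
targets = torch.randint(0, 2, (8, 1440), device=device).float()

scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))
with torch.cuda.amp.autocast(enabled=(device == "cuda")):
    # Allowed under autocast: the sigmoid is folded into the loss,
    # keeping it numerically stable in fp16.
    loss = F.binary_cross_entropy_with_logits(logits, targets)
    # The two-step version below would be rejected by autocast:
    # loss = F.binary_cross_entropy(torch.sigmoid(logits), targets)

scaler.scale(loss).backward()
```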

anthonio9 commented 10 months ago

The 1st approach is partially implemented; plotting does not show any ground truth yet, and the only metric tracked is the loss from the loss function. It's a small step forward!

anthonio9 commented 9 months ago

So how should I go about the post-processing? Here's one way: simply find up to 6 peaks, since torch.topk() does not really work well for this application: https://discuss.pytorch.org/t/pytorch-argrelmax-or-c-function/36404/2
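
A rough sketch of one possible peak picker in the spirit of the argrelmax-style trick from the linked thread (a bin counts as a peak if it exceeds both neighbours and clears a threshold). The `pick_peaks` helper, the threshold, and the bin count are assumptions for illustration, not the repository's code.

```python
import torch

def pick_peaks(logits: torch.Tensor, max_peaks: int = 6, threshold: float = 0.0) -> torch.Tensor:
    """Return indices of up to `max_peaks` local maxima, strongest first."""
    left, mid, right = logits[:-2], logits[1:-1], logits[2:]
    is_peak = (mid > left) & (mid > right) & (mid > threshold)
    peak_idx = is_peak.nonzero().squeeze(-1) + 1  # shift back to original indexing
    order = logits[peak_idx].argsort(descending=True)  # keep only the strongest peaks
    return peak_idx[order][:max_peaks]

# Example: two bumps in an otherwise flat logit vector
x = torch.zeros(1440)
x[220], x[735] = 3.0, 5.0
print(pick_peaks(x))  # tensor([735, 220])
```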

anthonio9 commented 9 months ago

Finding peaks was a struggle and the result is not perfect at all. I think now is a good time to abandon this idea.
