anthonio9 / penn

Pitch Estimating Neural Networks (PENN)
MIT License

Guitar to Multi Hot Piano #6

Open anthonio9 opened 6 months ago

anthonio9 commented 6 months ago

Instead of using 6 * 1440 bins on the output, use one vector that represents all strings together. With that in mind, use the below as well:

Work work work!
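
As a concrete picture of the single-vector target, here is a minimal sketch that collapses per-string bin indices into one multi-hot vector. The function name `strings_to_multi_hot`, the tensor shapes, and the 1440-bin size (taken from the 6 * 1440 layout above) are assumptions, not code from the repo:

```python
import torch
import torch.nn.functional as F

PITCH_BINS = 1440  # assumed bin count, matching the 6 * 1440 layout above

def strings_to_multi_hot(string_bins: torch.Tensor, voiced: torch.Tensor) -> torch.Tensor:
    """Collapse per-string targets into a single multi-hot vector.

    string_bins: (batch, 6) long tensor of pitch-bin indices, one per string
    voiced:      (batch, 6) bool tensor, True where the string is played
    returns:     (batch, PITCH_BINS) float tensor with a 1 at every active pitch
    """
    one_hot = F.one_hot(string_bins, PITCH_BINS).float()  # (batch, 6, PITCH_BINS)
    one_hot = one_hot * voiced.unsqueeze(-1).float()      # zero out silent strings
    return one_hot.amax(dim=1)                            # merge strings into one vector

# example: two frames, with strings 3-6 muted in the second frame
bins = torch.tensor([[100, 340, 580, 820, 1060, 1300],
                     [ 90, 330,   0,   0,    0,    0]])
voiced = torch.tensor([[True] * 6,
                       [True, True, False, False, False, False]])
print(strings_to_multi_hot(bins, voiced).sum(dim=1))  # tensor([6., 2.])
```

Two strings playing the same pitch collapse into a single 1, which is exactly the information the multi-hot representation gives up.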

anthonio9 commented 6 months ago

There are multiple ways to achieve this:

Let's try both.

But then, how should this be evaluated? The training part is easy, because the network will progress with the loss function; real evaluation, however, is a bit heavy. At first, all evaluation should be ditched; only the training loss will matter, along with the plots of the logits and the ground truth.

anthonio9 commented 6 months ago

This log-sum-exp trick may be helpful for enhancing the loss function with sigmoid... but first, how to apply sigmoid with binary_cross_entropy so that cuda.amp.GradScaler works fine?

EDIT: it seems that using binary_cross_entropy_with_logits is good enough to replace sigmoid plus BCE.
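
For reference, a minimal training-step sketch under that assumption (the model, shapes, and optimizer here are placeholders, not the project's actual training loop). `F.binary_cross_entropy_with_logits` fuses sigmoid and BCE using a numerically stable formulation (essentially the log-sum-exp trick) and is safe under autocast, whereas `F.binary_cross_entropy` on sigmoid outputs is rejected by autocast:

```python
import torch
import torch.nn.functional as F

# placeholders: `model` stands in for the real network, mapping frames of
# size 1024 to raw logits of shape (batch, 1440); `multi_hot` are the targets.
model = torch.nn.Linear(1024, 1440)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler(enabled=torch.cuda.is_available())

def train_step(frames, multi_hot):
    optimizer.zero_grad()
    with torch.autocast("cuda", enabled=torch.cuda.is_available()):
        logits = model(frames)
        # sigmoid + BCE in one numerically stable call, autocast-friendly
        loss = F.binary_cross_entropy_with_logits(logits, multi_hot)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.detach()
```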

anthonio9 commented 6 months ago

The 1st approach is partially implemented; plotting does not show any ground truth yet, and the only metric tested is the loss coming from the loss function. It's a small step forward!

anthonio9 commented 5 months ago

So how should I go about the post-processing? Here's one way: simply find up to 6 peaks, because torch.topk() does not really work well for this application: https://discuss.pytorch.org/t/pytorch-argrelmax-or-c-function/36404/2
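
A minimal sketch of that neighbour-comparison idea (the `find_pitch_peaks` name, the 0.5 threshold, and the single-frame shape are assumptions, not the repository's API). The point is that a plain topk over the probabilities returns several adjacent bins of one wide lobe, whereas a local-maximum test picks one bin per lobe:

```python
import torch

def find_pitch_peaks(probs: torch.Tensor, max_peaks: int = 6, threshold: float = 0.5):
    """Pick up to `max_peaks` local maxima from a (PITCH_BINS,) probability vector."""
    left = torch.cat([probs.new_full((1,), -float("inf")), probs[:-1]])
    right = torch.cat([probs[1:], probs.new_full((1,), -float("inf"))])
    # a bin is a peak if it beats both neighbours and clears the threshold
    is_peak = (probs > left) & (probs >= right) & (probs > threshold)
    peak_bins = is_peak.nonzero(as_tuple=True)[0]
    if peak_bins.numel() > max_peaks:
        # keep only the most confident peaks
        order = probs[peak_bins].argsort(descending=True)[:max_peaks]
        peak_bins = peak_bins[order]
    return peak_bins.sort().values

# example: sigmoid of the multi-hot logits for a single frame
probs = torch.sigmoid(torch.randn(1440))
print(find_pitch_peaks(probs))
```

For a non-differentiable post-processing step, scipy.signal.find_peaks (with its distance argument) is another option on CPU.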

anthonio9 commented 5 months ago

Finding peaks was a struggle and is not perfect at all. I think now is a good time to abandon this idea.

[image attachment]