DamRsn / NeuralNote

Audio Plugin for Audio to MIDI transcription using deep learning.
Apache License 2.0

CQT question #27

Closed creaktive closed 1 year ago

creaktive commented 1 year ago

First of all, thanks for this cool project! In "Could NeuralNote transcribe audio in real-time?", you mention that:

The CQT requires really long audio chunks (> 1s) to get amplitudes for the lowest frequency bins.

In my experience, the lowest piano key (27.5 Hz) requires about 0.62 s of samples before it can be detected. That's still not quite real-time, but technically it is possible to detect correlation even earlier when using a streaming variant of the transform.
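That ~0.62 s figure matches the usual constant-Q window rule: each CQT bin needs a window spanning Q periods of its center frequency, where Q is set by the bins-per-octave resolution. A minimal sketch, assuming one bin per semitone (B = 12; real transforms, including basic-pitch's, may use a finer resolution and thus even longer windows):

```python
def cqt_window_seconds(f_center_hz: float, bins_per_octave: int = 12) -> float:
    """Duration of the analysis window for one constant-Q bin.

    The window must span Q cycles of the bin's center frequency,
    with Q = 1 / (2^(1/B) - 1) for B bins per octave.
    """
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)  # constant Q factor
    return q / f_center_hz

print(f"A0 (27.5 Hz) window: {cqt_window_seconds(27.5):.3f} s")   # ~0.61 s
print(f"A4 (440 Hz) window:  {cqt_window_seconds(440.0):.3f} s")  # ~0.038 s
```

So the low bins alone dictate the latency floor; the high bins would be available much sooner.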

Out of curiosity, how difficult do you think it would be to use such streaming transform with your neural network?

DamRsn commented 1 year ago

Hi, thanks for the question!

Changing the input features would require retraining the neural network from scratch. NeuralNote uses the trained model from basic-pitch (model here), and AFAIK they did not open-source the training code. But with some work it should be possible to retrain everything following the paper.

But I'm not sure a DFT would work well as an input feature here, because convolutions work best on evenly spaced data, which in this case means evenly spaced in note space (each bin separated by a fixed fraction of a semitone). So the DFT, with its linearly spaced frequency bins, wouldn't fit here I think.

One point I didn't mention in the README that also prevents NeuralNote from being real-time is the note-creation process applied to the outputs of the CNN. The signal is processed backwards in places, so all of this is far from causal. See Lib/Model/Notes.cpp for more details. But it might be possible to design a causal algorithm that does this.
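To illustrate why a backward pass breaks causality, here is a hypothetical sketch (not the actual Lib/Model/Notes.cpp logic): confident onsets are found with a high threshold, then each note's start is extended backwards through earlier frames above a lower threshold. That backward scan needs frames a causal stream would already have committed to.

```python
def extend_onsets_backward(probs, hi=0.8, lo=0.3):
    """Return (start, end) frame spans from per-frame note probabilities.

    Hypothetical two-threshold scheme: detect onsets at `hi`, then grow
    each span forward AND backward through frames above `lo`. The
    backward growth is the non-causal step.
    """
    spans, t, n = [], 0, len(probs)
    while t < n:
        if probs[t] >= hi:                           # confident onset
            end = t
            while end + 1 < n and probs[end + 1] >= lo:
                end += 1                             # forward: find the offset
            start = t
            while start - 1 >= 0 and probs[start - 1] >= lo:
                start -= 1                           # backward: non-causal
            spans.append((start, end))
            t = end + 1
        else:
            t += 1
    return spans

print(extend_onsets_backward([0.1, 0.4, 0.5, 0.9, 0.6, 0.2]))  # [(1, 4)]
```

Here the onset at frame 3 gets moved back to frame 1, i.e. the algorithm revises output for frames it has already passed, which a streaming implementation cannot do without extra lookahead latency.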

And a latency of ~0.5 s would still be far too high for real-time applications, for example doubling an instrument live with a MIDI synth.

To conclude, I'd say that making basic-pitch real-time would require a lot of work and research!