A new feature branch has been added to house note-based embedding, where each note is one-hot encoded with two positional encodings overlaid on top of each other: one corresponding to the note's start time and the other to its end time. However, initial results look discouraging:
Despite an increased number of layers and higher data density, we are not seeing any noticeable decrease in training loss or increase in validation accuracy. Either the positional-encoding implementation or the approach itself needs to be revised. We may need to drop this in favor of the CRNN, which at this point seems to produce much more favorable results.
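For reference, a minimal sketch of the embedding scheme described above: a one-hot pitch vector with two transformer-style sinusoidal positional encodings summed on top, one for the start time and one for the end time. The dimension sizes, function names, and the choice of sinusoidal encodings are assumptions for illustration, not the branch's actual implementation.

```python
import numpy as np

D_MODEL = 128  # embedding width; assumed equal to the MIDI pitch range


def sinusoidal_encoding(t, d_model=D_MODEL):
    """Standard sinusoidal positional encoding evaluated at scalar time t."""
    i = np.arange(d_model // 2)
    freqs = 1.0 / (10000 ** (2 * i / d_model))  # geometric frequency ladder
    enc = np.zeros(d_model)
    enc[0::2] = np.sin(t * freqs)  # even dims: sine
    enc[1::2] = np.cos(t * freqs)  # odd dims: cosine
    return enc


def embed_note(pitch, start, end):
    """One-hot pitch vector with start- and end-time encodings overlaid (summed)."""
    one_hot = np.zeros(D_MODEL)
    one_hot[pitch] = 1.0
    return one_hot + sinusoidal_encoding(start) + sinusoidal_encoding(end)
```

One caveat worth checking against the hypothesis that the encoding is at fault: summing two encodings onto the same dimensions means distinct (start, end) pairs can partially cancel, which could blur the timing signal the model needs.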