Open falaktheoptimist opened 5 years ago
I didn't totally understand that part of their paper, since their hop length is also 32ms (hop length 512 / sample rate 16000). Maybe they used a 16ms hop at first and didn't update that part of the paper when they later switched to a 32ms hop.
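To make the arithmetic concrete, here is a rough sketch of how a 32ms onset maps to frames at different hop sizes. The helper name `ms_to_frames` is made up for illustration and is not part of the repository; the 256-sample hop is just the hypothetical 16ms case mentioned above.

```python
SAMPLE_RATE = 16000  # Hz, as quoted in this thread
HOP_LENGTH = 512     # samples per hop, as quoted in this thread


def ms_to_frames(duration_ms, sample_rate=SAMPLE_RATE, hop_length=HOP_LENGTH):
    """Hypothetical helper: how many hops a duration in milliseconds spans."""
    return duration_ms * sample_rate / 1000 / hop_length


# Duration of one hop in milliseconds.
hop_ms = HOP_LENGTH * 1000 / SAMPLE_RATE

print(hop_ms)                            # hop duration in ms
print(ms_to_frames(32))                  # 32ms onset with a 512-sample hop
print(ms_to_frames(32, hop_length=256))  # 32ms onset with a 16ms (256-sample) hop
```

With a 512-sample hop a 32ms onset covers one frame, while with a 256-sample hop it would cover two, which would be consistent with the guess that the paper's wording dates from a 16ms hop.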
As for what the 3 means in the quoted code, it's just a code that marks the onset in a combined byte array, which saves some runtime memory. The actual onset/offset/frame data is decoded during training.
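A minimal sketch of that idea, for a single pitch over time. Only "3 = onset" comes from this thread; the other code values (2 for a sustained frame, 1 for an offset), the function names, and the default lengths are assumptions made for illustration, not necessarily what dataset.py does.

```python
# Only ONSET = 3 is stated in the thread; FRAME and OFFSET values are assumed.
ONSET, FRAME, OFFSET = 3, 2, 1


def encode_note(n_frames, onset_frame, offset_frame, onset_length=1, offset_length=1):
    """Pack one note into a single byte-per-frame label (hypothetical helper)."""
    label = bytearray(n_frames)  # all zeros: silence
    onset_end = min(n_frames, onset_frame + onset_length)
    frame_end = max(onset_end, min(n_frames, offset_frame))
    offset_end = min(n_frames, frame_end + offset_length)
    for t in range(onset_frame, onset_end):
        label[t] = ONSET
    for t in range(onset_end, frame_end):
        label[t] = FRAME
    for t in range(frame_end, offset_end):
        label[t] = OFFSET
    return label


def decode(label):
    """Expand the packed label into separate onset/offset/frame targets."""
    onset = [int(v == ONSET) for v in label]
    offset = [int(v == OFFSET) for v in label]
    frame = [int(v > OFFSET) for v in label]  # onset frames also count as active
    return onset, offset, frame


label = encode_note(n_frames=8, onset_frame=2, offset_frame=5)
onset, offset, frame = decode(label)
print(list(label), onset, offset, frame)
```

Storing one byte per cell instead of three separate arrays is where the memory saving comes from. Passing `onset_length=2` in this sketch marks two consecutive frames as onset, which is the change asked about in this issue.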
Firstly, thank you so much for your super useful implementation of the Onsets and Frames model in PyTorch. It has been valuable for understanding the paper and also for our project. I was wondering about the length of the onset in the labels, which is 1 frame as per the implementation: https://github.com/jongwook/onsets-and-frames/blob/007980ad3be4f49250ee6dbdb34ae19a4ab49104/onsets_and_frames/dataset.py#L116 However, the Onsets and Frames method mentions that
In this case, would changing this to 2 help (either here or by doubling the ONSET_LENGTH constant)? I was also curious because the model took about 6k steps on the MAESTRO dataset to start predicting onsets (not surprising, since they are sparse across samples); before that it predicted only frames and no onsets. I wanted to know your take on the values.
Thanks.