I was able to run the model and get a NumPy array whose values seemed to range from 0 to 255.
I tried storing it as a MIDI file, but the frequency range was totally off. I tried normalizing by subtracting 100 or dividing by 4; this brought the result closer to my expectation, and it seemed to have the shape of the input melody.
The pitch values (i.e., your result[i, 1]) are in Hz, not MIDI note numbers. To convert them to MIDI note numbers n, a conversion like n = 12*np.log2(result[i, 1]/440) + 69 is needed.
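For illustration, here is a minimal sketch of that conversion. It uses only the standard library (math.log2 in place of np.log2), and it makes two assumptions that are not confirmed in this thread: that result[i, 0] holds the frame time, and that non-positive pitch values mark unvoiced frames. The helper names hz_to_midi and pitch_track_to_midi and the example rows are made up.

```python
import math

A4_HZ = 440.0    # reference tuning frequency
A4_MIDI = 69     # MIDI note number of A4

def hz_to_midi(freq_hz):
    """Return the (fractional) MIDI note number for a frequency in Hz.

    Non-positive values return None; treating them as unvoiced frames
    is an assumption, not something the model's output format confirms.
    """
    if freq_hz <= 0:
        return None
    return 12 * math.log2(freq_hz / A4_HZ) + A4_MIDI

def pitch_track_to_midi(rows):
    """Map (time, frequency_hz) rows to (time, rounded MIDI note or None)."""
    notes = []
    for time_s, freq_hz in rows:
        midi = hz_to_midi(freq_hz)
        notes.append((time_s, None if midi is None else int(round(midi))))
    return notes

# Made-up rows in the assumed layout: column 0 = time (s), column 1 = pitch (Hz)
example = [(0.00, 0.0), (0.01, 440.0), (0.02, 261.63), (0.03, 880.0)]
print(pitch_track_to_midi(example))
```

Because the function only iterates over two-element rows, it should also accept the rows of a two-column NumPy array. Rounding to the nearest integer discards pitch bends and vibrato; keep the fractional value if those matter.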
FYI, I used MP3 input with a sampling frequency (sf) of 44100 Hz.
Here is the melody_extraction code I used:
def melody_extraction(infile, outfile):
melody_extraction('path1/input.wav', 'path2/output.txt')