rabitt / ismir2017-deepsalience

Companion code for the ISMIR 2017 paper "Deep Salience Representations for $F_0$ Estimation in Polyphonic Music"
MIT License

How to make use of the resulting npz file? #2

Open lflee opened 7 years ago

lflee commented 7 years ago

Hello 🙋‍♂️,

I have successfully generated some npz files by running predict_on_audio.py. How can I make use of them? Can you suggest some software to "translate" them to MIDI, or to other formats that can be imported into MuseScore, LilyPond, etc.?

Thanks 👍

rabitt commented 7 years ago

Hey @lflee the predict_on_audio.py script is currently written to output in one of three formats: salience (the npz files), multif0 CSV files with time stamps and multiple frequency values (for polyphonic music), and singlef0 CSV files for monophonic music.

The npz files contain matrices representing the estimated likelihood of each frequency bin over time. You can sonify them directly using mir_eval.sonify.time_frequency. If you want to convert them into a symbolic format you'll have to write code that does it. I'm happy to look at a PR if you do!
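As an illustration of the kind of conversion code mentioned above, here is a minimal, hypothetical sketch that thresholds the salience matrix and writes one short MIDI note per active frame. The filename, the pretty_midi dependency, the 0.3 threshold, and the per-frame note segmentation are all assumptions for illustration, not part of this repository; a real transcription would track pitch contours and merge consecutive frames into notes.

import numpy as np
import pretty_midi

# Load the output of predict_on_audio.py (the filename here is illustrative)
data = np.load("my_salience.npz")
times = data['times']        # frame times in seconds, shape (n_times,)
freqs = data['freqs']        # frequency bin centers in Hz, shape (n_freqs,)
salience = data['salience']  # assumed shape (n_freqs, n_times); transpose if yours is time-major

threshold = 0.3                          # illustrative activation threshold
hop = float(np.median(np.diff(times)))   # approximate frame duration

pm = pretty_midi.PrettyMIDI()
inst = pretty_midi.Instrument(program=0)

# Very naive transcription: every bin/frame above the threshold becomes one short note
for f_idx, t_idx in zip(*np.where(salience >= threshold)):
    pitch = int(round(pretty_midi.hz_to_note_number(freqs[f_idx])))
    if 0 <= pitch <= 127:
        start = float(times[t_idx])
        inst.notes.append(
            pretty_midi.Note(velocity=80, pitch=pitch, start=start, end=start + hop))

pm.instruments.append(inst)
pm.write("my_salience.mid")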

Magurosenbei commented 6 years ago

How do we obtain the 'gram' required by mir_eval.sonify.time_frequency?

rabitt commented 6 years ago
import numpy as np

# load an npz file created by predict_on_audio.py
data_dictionary = np.load("mynpy_file.npz")

times = data_dictionary['times']
freqs = data_dictionary['freqs']
salience = data_dictionary['salience']

salience is the gram required by mir_eval.sonify.time_frequency, and times and freqs are the time and frequency grids needed to render it.
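
Putting the pieces together, here is a minimal sonification sketch. The sample rate, output filename, normalization, and the use of scipy.io.wavfile for writing the audio are assumptions; mir_eval must be installed, and the salience matrix is assumed to have shape (n_freqs, n_times).

import numpy as np
import mir_eval.sonify
from scipy.io import wavfile

fs = 22050  # sample rate for the rendered audio (arbitrary choice)

data = np.load("mynpy_file.npz")
times = data['times']
freqs = data['freqs']
salience = data['salience']  # assumed shape (n_freqs, n_times)

# Render each frequency bin as a sinusoid whose amplitude follows its salience over time
audio = mir_eval.sonify.time_frequency(salience, freqs, times, fs)

# Normalize to 16-bit PCM and write to disk
audio = audio / (np.max(np.abs(audio)) + 1e-8)
wavfile.write("salience_sonified.wav", fs, (audio * 32767).astype(np.int16))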