MTG / essentia

C++ library for audio and music analysis, description and synthesis, including Python bindings
http://essentia.upf.edu
GNU Affero General Public License v3.0

Several MIR-related questions. #607

Open pavlos163 opened 7 years ago

pavlos163 commented 7 years ago

Hi there, thanks for all the superb work you've done.

I have some MIR-related questions about how Essentia can help me with my project. These are NOT issues with the code; I just want to clarify some things. I am trying to create an automatic music transcription web application for guitar, and most of my questions are about multi-pitch detection.

  1. How should I use MultiPitchKlapuri or MultiPitchMelodia? I see that it takes the whole input signal as input and returns a 2D array. Am I right in thinking that the 2D array is the list of pitches per frame?

  2. As my input is a guitar signal, I would like to set the maximum number of sources to 6. Is that possible?

  3. Do maxFrequency and minFrequency apply only to pitches, or also to harmonics? I would like to keep harmonics over 1600 Hz (guitar max), but I would like to "cancel out" pitches over 1600 Hz.

  4. If I already have the list of correct onset times, what is the correct way to find the pitches at those times? Would a simple pitches[o], where o is the onset frame, be enough?

  5. Is there a good way to use Essentia to distinguish between when a chord was played and when 1 or 2 notes were played? I am using chord detection from a different library (Madmom), which returns the chords detected in the piece, but there are many false positives. Can Essentia help me somehow reduce those to only the chords actually played? (see more details here).

pavlos163 commented 7 years ago

I see that when I apply MultiPitchKlapuri to my signal, I get a 2D list that has a length of about 7000. I assume this is the number of frames. Is there a way to convert those frames to seconds or samples?
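For reference, the frame-to-time conversion asked about here is plain arithmetic once the hop size and sample rate are known. The sketch below assumes a hop size of 128 samples and a sample rate of 44100 Hz (believed to be the MultiPitchKlapuri defaults, but check the values actually passed to the algorithm), and it assumes that frame i sits at i * hopSize samples. The function names are made up for illustration and are not part of Essentia.

```python
# Assumed analysis settings; they must match what was passed to the algorithm.
SAMPLE_RATE = 44100  # Hz (assumption)
HOP_SIZE = 128       # samples between consecutive frames (assumption)


def frame_to_samples(frame_index, hop_size=HOP_SIZE):
    """Sample offset of a frame, assuming frame i sits at i * hop_size."""
    return frame_index * hop_size


def frame_to_seconds(frame_index, hop_size=HOP_SIZE, sample_rate=SAMPLE_RATE):
    """Time of a frame in seconds."""
    return frame_to_samples(frame_index, hop_size) / sample_rate


def seconds_to_frame(time_s, hop_size=HOP_SIZE, sample_rate=SAMPLE_RATE):
    """Index of the frame nearest to a time given in seconds."""
    return int(round(time_s * sample_rate / hop_size))


n_frames = 7000
print(f"{n_frames} frames span about {frame_to_seconds(n_frames):.2f} s")
```

Under these assumptions, about 7000 frames correspond to roughly 20 seconds of audio.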

dbogdanov commented 7 years ago

  1. Correct, the output vector contains pitch values for each frame. However, this is not a 2D matrix but a list of lists. For each frame, it contains a list of pitch values for the pitch contours present in that frame.

  2. The output pitch values are not sorted by the salience of the pitch contours they belong to. We'll have to adapt the MultiPitchKlapuri and MultiPitchMelodia algorithms to be able to sort contours by their salience. To have fewer pitch contours detected by the algorithm, you can raise the magnitudeThreshold for spectral peaks.
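Since the output is a list of lists, looking up the pitches at known onset times (question 4) and dropping pitches above the guitar range can be done in plain Python. This is a minimal sketch, not Essentia functionality: `pitches_at_onsets` is a hypothetical helper, the toy `pitches` list stands in for real algorithm output, the hop size and sample rate are assumed to be 128 and 44100, and the "three or more pitches means a chord candidate" rule is only an illustrative heuristic.

```python
def pitches_at_onsets(pitches, onset_times, hop_size=128, sample_rate=44100,
                      max_freq=1600.0):
    """For each onset time (seconds), return the pitches of the nearest frame.

    `pitches` is a list of lists: one list of pitch values (Hz) per frame.
    Pitches above `max_freq` and non-positive values are dropped.
    """
    if not pitches:
        return [[] for _ in onset_times]
    result = []
    for t in onset_times:
        frame = int(round(t * sample_rate / hop_size))
        frame = min(max(frame, 0), len(pitches) - 1)  # clamp to a valid index
        # Sorted by frequency for readability only; this is not a salience order.
        result.append(sorted(f for f in pitches[frame] if 0 < f <= max_freq))
    return result


# Toy stand-in for the per-frame output (values in Hz).
pitches = [
    [],                     # frame 0: nothing detected
    [110.0],                # frame 1: one pitch
    [196.0, 82.4, 1975.5],  # frame 2: one pitch is above the 1600 Hz limit
    [82.4, 123.5, 164.8],   # frame 3: three simultaneous pitches
]

hop, sr = 128, 44100
onsets = [1 * hop / sr, 3 * hop / sr]  # onset times in seconds
for t, notes in zip(onsets, pitches_at_onsets(pitches, onsets, hop, sr)):
    label = "chord candidate" if len(notes) >= 3 else "one or two notes"
    print(f"{t:.4f} s: {notes} ({label})")
```

In practice the frame exactly at the onset may still contain the noisy attack, so it may be worth inspecting a few frames after the onset rather than a single index.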

pavlos163 commented 7 years ago

It doesn't seem like raising the magnitudeThreshold changes the results I get...