justinsalamon / audio_to_midi_melodia

Extract the melody from an audio file and export to MIDI

Converted mid file play just ding... dang dong... #12

Closed audiorecorder closed 5 years ago

audiorecorder commented 5 years ago

I barely managed to compile and install it ...

"python audio_to_midi_melodia.py a.wav a.mid 60 --smooth 0.25 --minduration 0.1 --jams"
"python audio_to_midi_melodia.py a.mp3 a.mid 60 --smooth 0.25 --minduration 0.1 --jams"

But the converted .mid file just plays ding... dang dong... ding... I cannot recognize what the original audio/music is.

justinsalamon commented 5 years ago

The quality of the output depends on two main things:

  1. The quality of the melody extraction (i.e. how well the melody frequency has been estimated)
  2. The quality of the note segmentation/quantization (i.e. dividing the continuous melody into individual notes)

(1) is handled by the Melodia algorithm, which normally works pretty well for genres such as pop, rock and folk, but its performance does depend on the type of music you are analyzing. If you want to see how well Melodia works on its own (without note quantization), you can run the algorithm inside Sonic Visualiser and then listen to the estimated melody.

(2) involves converting the continuous frequency time series estimated by Melodia into a series of discrete notes. This is in itself a research problem, and not necessarily a solved one, and it is implemented here in the simplest way possible. As such, it is likely to produce sub-optimal notes (e.g. single notes that have been split into many notes or vice versa, notes with vibrato that end up with the wrong pitch value, etc.).
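To make step (2) more concrete, here is a minimal sketch of a very basic segmentation: convert each frame's frequency to the nearest MIDI note, smooth the result with a median filter, then merge runs of identical values into notes and drop the short ones. This is an illustration under stated assumptions, not necessarily the exact code in the script. The function names, the `hop_s` frame spacing and the convention that unvoiced frames have a frequency of 0 or less are all assumptions made for the example.

```python
import math
from statistics import median_low

def hz_to_midi(f0_hz):
    """Map each frequency (Hz) to the nearest MIDI note number.
    Assumption: frames with f <= 0 are unvoiced and become 0."""
    return [int(round(69 + 12 * math.log2(f / 440.0))) if f > 0 else 0
            for f in f0_hz]

def median_smooth(values, width):
    """Median-filter a note sequence; the window is clipped at the edges."""
    half = width // 2
    return [median_low(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]

def segment_notes(midi_frames, hop_s, min_duration_s):
    """Merge runs of equal note numbers into (onset_s, duration_s, pitch)."""
    notes = []
    start = 0
    for i in range(1, len(midi_frames) + 1):
        # A run ends at the end of the sequence or when the pitch changes.
        if i == len(midi_frames) or midi_frames[i] != midi_frames[start]:
            pitch = midi_frames[start]
            duration = (i - start) * hop_s
            # Skip silence (pitch 0) and notes shorter than the minimum.
            if pitch > 0 and duration >= min_duration_s:
                notes.append((start * hop_s, duration, pitch))
            start = i
    return notes

# Hypothetical input: A4, a short gap, then C5, one frame every 0.5 s.
f0 = [440.0] * 10 + [0.0] * 5 + [523.25] * 10
frames = median_smooth(hz_to_midi(f0), width=3)
notes = segment_notes(frames, hop_s=0.5, min_duration_s=1.0)
```

Even this toy version shows where things go wrong: a pitch that hovers between two semitones produces alternating note numbers, and the median filter and minimum duration only partly hide that.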

Finally, note that you also need to choose a tempo (BPM) for the output MIDI file, and you are using 60, which is quite slow for most songs. Even if the quality of the melody extraction and note quantization were good, if your song is faster than 60 BPM the MIDI output will sound off. I suggest estimating the tempo of your song and using the appropriate value for the BPM input parameter.
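As a rough illustration of the tempo point, the sketch below assumes note times are expressed in beats (as is common for MIDI) and shows how the same duration in seconds maps to a different number of beats depending on the BPM. It also includes a crude tap-tempo estimate from timestamps you record while tapping along to the song. Both helper names are hypothetical and are not part of the script.

```python
from statistics import median

def seconds_to_beats(seconds, bpm):
    """Convert a duration in seconds to beats at the given tempo."""
    return seconds * bpm / 60.0

def bpm_from_taps(tap_times_s):
    """Crude tempo estimate: 60 divided by the median interval between taps."""
    intervals = [b - a for a, b in zip(tap_times_s, tap_times_s[1:])]
    return 60.0 / median(intervals)

# A half-second note is half a beat at 60 BPM but a full beat at 120 BPM.
half_beat = seconds_to_beats(0.5, 60)
full_beat = seconds_to_beats(0.5, 120)

# Hypothetical taps recorded every 0.5 s while listening to the song.
estimated_bpm = bpm_from_taps([0.0, 0.5, 1.0, 1.5, 2.0])
```

A dedicated beat tracker will usually do better than tapping, but for picking a sensible BPM value a rough estimate like this is often enough.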

Closing this out since there doesn't seem to be any technical issue to resolve here.