kovaxis / midly

A feature-complete MIDI parser and writer focused on speed.
The Unlicense

How fast is this midi parser? #1

Closed Boscop closed 5 years ago

Boscop commented 5 years ago

I'm curious which crate is the fastest at parsing MIDI files. Have you done any benchmarks / measurements against these two? :)

https://crates.io/crates/rimd https://crates.io/crates/nom-midi

I'm currently using rimd, but if your crate is faster, I'd switch to it..

kovaxis commented 5 years ago

I just carried out some rough benchmarks using the latest versions of these libraries: midly v0.2.1, nom-midi v0.5.1 and rimd v0.0.1.

I tried parsing the 24MB "Pi.mid" file. These were the results on my machine (including file IO):

test pi_midly ... 30 tracks in 210ms
test pi_nom ... 30 tracks in 407ms
test pi_rimd ... 30 tracks in 23579ms

This is the code I used to test them:

use std::fs;
use std::time::Instant;

fn pi_midly() {
    let start = Instant::now();
    let data = fs::read("Pi.mid").unwrap();
    let smf = midly::Smf::parse(&data).unwrap();
    println!("{} tracks in {}ms", smf.tracks.len(), start.elapsed().as_millis());
}

fn pi_nom() {
    let start = Instant::now();
    let data = fs::read("Pi.mid").unwrap();
    let smf = nom_midi::parser::parse_smf(&data).unwrap().1;
    println!("{} tracks in {}ms", smf.tracks.len(), start.elapsed().as_millis());
}

fn pi_rimd() {
    let start = Instant::now();
    let smf = rimd::SMF::from_file("Pi.mid".as_ref()).unwrap();
    println!("{} tracks in {}ms", smf.tracks.len(), start.elapsed().as_millis());
}

As you can see, midly is the fastest out of these three. As for why midly is the fastest, I recall doing some benchmarks while I was developing midly, and it used to be only slightly faster than nom-midi; it was the auto-parallelization that made the difference. rimd is orders of magnitude slower, owing to the fact that it makes micro-allocations for every single event. It was actually the sluggishness of rimd that motivated me to write midly (no offence at all, after all it does do the job at the end of the day).
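To illustrate the allocation point, here is a minimal sketch. It is not the actual code of rimd or midly: the fixed 3-byte "event" record and the function names `parse_owned` and `parse_borrowed` are hypothetical. One version copies every event into its own `Vec`, the other only borrows slices of the input buffer.

```rust
use std::time::Instant;

/// Copies every 3-byte record into its own heap allocation (one `Vec` per event).
fn parse_owned(data: &[u8]) -> Vec<Vec<u8>> {
    data.chunks_exact(3).map(|ev| ev.to_vec()).collect()
}

/// Borrows every record straight from the input buffer: no per-event allocation,
/// only the single outer `Vec` of slices.
fn parse_borrowed(data: &[u8]) -> Vec<&[u8]> {
    data.chunks_exact(3).collect()
}

fn main() {
    // Fake "events" (hypothetical format): status byte, key, velocity, repeated.
    let data: Vec<u8> = (0..1_000_000u32)
        .flat_map(|i| [0x90u8, (i % 128) as u8, 100u8])
        .collect();

    let start = Instant::now();
    let owned = parse_owned(&data);
    println!("owned:    {} events in {:?}", owned.len(), start.elapsed());

    let start = Instant::now();
    let borrowed = parse_borrowed(&data);
    println!("borrowed: {} events in {:?}", borrowed.len(), start.elapsed());
}
```

Both functions return the same number of events. The difference is that the owned version performs one allocation per event, while the borrowed version ties its output to the lifetime of the input buffer.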

Hope I helped :)

Boscop commented 5 years ago

Thanks! Btw, I'm curious: What's your use case where you need to load large midi files, and where do you find them? Are you a black midi artist or something? :)

kovaxis commented 4 years ago

Sorry for the late response, I just haven't been doing a lot of coding lately and therefore don't pay attention to github.

I was in the middle of doing an ultra-optimized Synthesia clone with support for black MIDIs, and eventually loading the files became the main bottleneck. So no, sadly I'm not a black MIDI artist even if I'd like to D':

Now I just "maintain" (if tuning in every 2 months is maintaining) the library for fun.

What was your use case? I'd love to hear about any uses of this library.

Boscop commented 4 years ago

@negamartin Sorry for the late reply, I was very busy with work.. My hobby project is still on rimd but I think I'm gonna switch to midly soon. It's a midi editor, so it mostly deals with small midi files, under ~80 KB, but it has a preview function to quickly show the arrangement of a midi file when hovering over it, so the loading should be fast. Currently with rimd it's kinda slow..

Usually, due to the overhead, parallelization only leads to speedups when there's a lot of data to process (so that more time is spent doing parallel work than spinning it up and winding it down). Have you run your benchmark on smaller midi files like the typical ~35 KB to ~80 KB song files? Maybe it would make sense to allow disabling parallelization for small files? For my use case, the loading time of these small midi files is what counts the most.
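The idea of skipping parallelization for small inputs could look roughly like the sketch below. This is an illustration only, not how midly works: the cutoff `PARALLEL_THRESHOLD`, the stand-in `parse_track` and the helper `parse_all` are all hypothetical, and it spawns one scoped OS thread per track with `std::thread::scope` (Rust 1.63+) instead of using a thread pool.

```rust
use std::thread;

/// Hypothetical cutoff: below this many input bytes, stay on one thread.
const PARALLEL_THRESHOLD: usize = 64 * 1024;

/// Stand-in for parsing one track: it only counts the track's bytes.
fn parse_track(track: &[u8]) -> usize {
    track.len()
}

/// Parses all tracks, going parallel only when the total input is large enough.
fn parse_all(tracks: &[&[u8]]) -> Vec<usize> {
    let total: usize = tracks.iter().map(|t| t.len()).sum();
    if total < PARALLEL_THRESHOLD {
        // Small input: the cost of starting threads would outweigh the work.
        tracks.iter().map(|&t| parse_track(t)).collect()
    } else {
        // Large input: one scoped thread per track, results kept in track order.
        thread::scope(|s| {
            let handles: Vec<_> = tracks
                .iter()
                .map(|&t| s.spawn(move || parse_track(t)))
                .collect();
            handles.into_iter().map(|h| h.join().unwrap()).collect()
        })
    }
}

fn main() {
    let small_a = [0u8; 100];
    let small_b = [0u8; 200];
    let small: [&[u8]; 2] = [&small_a, &small_b];
    println!("small input: {:?}", parse_all(&small));

    let big = vec![0u8; PARALLEL_THRESHOLD];
    let large: [&[u8]; 2] = [&big, &big[..1000]];
    println!("large input: {:?}", parse_all(&large));
}
```

A real cutoff would have to be measured rather than guessed, since it depends on how expensive it is to start the parallel work.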

Btw, thanks for adding the writing functionality, that will come in handy :)

kovaxis commented 4 years ago

I've added a more permanent benchmarking solution to the repo, comparing it against nom-midi and rimd, and tested it with a ~50KB file, with and without multithreading enabled (you can disable multithreading by disabling the std feature).

Turns out that the first time a file is read, multithreading has a small but non-negligible cost (around half a millisecond), but afterwards it's beneficial even for small files (down to about ~10KB, at which point reading and parsing a file takes tenths of a millisecond, so it doesn't really matter much). I believe rayon (the crate used for multithreading) creates a global threadpool once and then reuses it, making multithreading almost free after that first call.
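One way to see a first-call cost like this is to time the first run separately from the following ones. The sketch below uses only the standard library. The helper `cold_and_warm` is hypothetical (it is not part of the repo's benchmark), and the workload is a stand-in that would be replaced by reading and parsing a file.

```rust
use std::time::{Duration, Instant};

/// Runs `work` `runs` times and returns
/// (time of the first run, mean time of the remaining runs).
fn cold_and_warm<F: FnMut()>(mut work: F, runs: u32) -> (Duration, Duration) {
    assert!(runs >= 2, "need at least one cold and one warm run");

    // First ("cold") run: includes any one-time setup, such as creating a thread pool.
    let start = Instant::now();
    work();
    let cold = start.elapsed();

    // Remaining ("warm") runs, averaged.
    let start = Instant::now();
    for _ in 1..runs {
        work();
    }
    let warm = start.elapsed() / (runs - 1);

    (cold, warm)
}

fn main() {
    // Stand-in workload: sum the bytes of a ~50KB buffer.
    let data = vec![1u8; 50_000];
    let mut checksum = 0u64;
    let (cold, warm) = cold_and_warm(
        || checksum += data.iter().map(|&b| u64::from(b)).sum::<u64>(),
        100,
    );
    println!("cold: {:?}, warm mean: {:?} (checksum {})", cold, warm, checksum);
}
```

With a parser that sets up a global thread pool on first use, the cold figure would include that setup while the warm mean would not.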

I checked it against rimd and nom-midi again with small files. It turns out there were some performance regressions which made single-threaded midly slower than nom-midi, but that should all be sorted out now. And yeah, on the ~50KB file rimd takes ~140ms, which is pretty sluggish. Parallel midly hovers around ~0.3ms, single-threaded midly around ~0.5ms, and nom-midi around ~0.55ms. At this point such small differences don't really matter.

I tried with an ~80KB file but nom-midi seems to be very picky about the files it accepts.

I think a MIDI editor in Rust is really cool. Do you think you could make the repo public by any chance? I'd like to contribute something every now and then, although I don't promise anything. It's okay if you want to keep it as your own tho.

Boscop commented 4 years ago

Thanks! I'll switch to midly then. I've been holding off a bit because when I started using rimd, it failed on many midi files, and I had to submit a lot of small fixes and also do my own post-processing to handle malformed midi files. My current loader works, but it's a little messy because it also contains normalization code, which I need to factor out so that it's independent of the actual loader.. It seems you already ran midly on a lot of midi files to check that it loads even rare edge cases, right?

Regarding my use case, well "midi editor" is only the simplest way to describe this, it's not really a general purpose GM editor, it's more like a custom / highly specialized component for a demo (as in demoscene), which is not allowed to be published before being officially released.. But I think it's ok to show a little preview of this component.

After it's released, I'll try to factor out as much of the components as possible to make them useable by more people.. I also wrote a UI lib on top of several midi controllers (Launchpad Pro & BCR2000) that could be useful..