JamesOwers / midi_degradation_toolkit

A toolkit for generating datasets of midi files which have been degraded to be 'un-musical'.
MIT License

Degradation optimizations #36

Closed apmcleod closed 5 years ago

apmcleod commented 5 years ago

This makes and tests 3 performance changes to the degradations:

  1. Converts the input format from Compositions to pd.DataFrames. This relates to #25, which I will leave open for now in case we want to reconsider the format again (e.g., dicts or arrays).
  2. Adds an inplace parameter to each degradation. inplace=False is the default and behaves exactly as before. inplace=True modifies the given excerpt in place and returns None. This is documented under inplace, excerpt, and the return value. The inplace docs for add_note and split_note also note that this can make performance worse. Overall, we see little if any improvement, but it is worth including so people can use the degradations this way if they wish. This fixes #35. Tests were added for the inplace parameter and the return values.
  3. Vectorizes all of the operations. There is no more looping over indices (I also fixed a few bugs that would have cropped up with non-consecutive indices, which should no longer occur. TODO: we may want to add tests for this). There is still the nested groupby in join_notes, but I'm not sure how to fix this. The vectorization leads to a significant performance improvement. This fixes #34.
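
For illustration, the inplace contract and the vectorized style described above could look roughly like the sketch below. This is not the toolkit's code: the function `shift_onsets_sketch` and the column names `onset`, `pitch`, and `dur` are assumptions made up for this example. It only shows a DataFrame-based degradation that returns a new DataFrame by default, returns None when inplace=True, and operates on whole columns so that non-consecutive indices need no special handling.

```python
import pandas as pd


def shift_onsets_sketch(excerpt, amount, inplace=False):
    """Hypothetical degradation: shift every note onset by `amount`.

    Not part of the toolkit; it only illustrates the inplace contract.
    """
    # Work on a copy unless the caller asked for in-place modification.
    target = excerpt if inplace else excerpt.copy()

    # Vectorized: a single column-wide operation, no loop over index labels,
    # so a non-consecutive index is handled without extra care.
    target["onset"] = target["onset"] + amount

    # inplace=True returns None; inplace=False returns the new DataFrame.
    return None if inplace else target


# Assumed column names; the index is deliberately non-consecutive.
excerpt = pd.DataFrame(
    {"onset": [0, 100, 200], "pitch": [60, 62, 64], "dur": [100, 100, 100]},
    index=[3, 7, 11],
)

# Default behaviour: the input is left untouched and a new frame is returned.
degraded = shift_onsets_sketch(excerpt, 50)

# In-place behaviour: the given frame is modified and None is returned.
work = excerpt.copy()
result = shift_onsets_sketch(work, 50, inplace=True)
```

Returning None in the in-place case mirrors the convention pandas itself uses for its own inplace arguments, which makes it harder to accidentally keep using a stale reference.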