salu133445 / muspy

A toolkit for symbolic music generation
https://salu133445.github.io/muspy/
MIT License
427 stars 49 forks source link

Merging objects #33

Closed cifkao closed 3 years ago

cifkao commented 3 years ago

This PR adds support for merging MusPy objects. One situation where this might be useful is when we want to merge events from two different tracks, maybe because they were in two different channels in the MIDI file, but logically they are the same track.

Changes:

Not sure if it's useful to be able to merge objects of different types, or if we should throw an exception in that case.

A problem with remove_duplicates (apart from #32): It's only defined for ComplexBase, so for example it doesn't work for metadata.creators. I had to add an exception in the tests because of this. That makes me wonder if sort and remove_duplicates should be moved to Base (since list attributes are not just a feature of ComplexBase).

salu133445 commented 3 years ago

Thank you for the pull request!

Some thoughts:

cifkao commented 3 years ago

Everything you wrote makes sense to me.

  • I think this should be reserved for ComplexBase (i.e., Music and Track). Merging two notes or two tempos doesn't make much sense to me.

Yes, limiting this to list attributes does make sense. I was trying to mimic the behavior of protobuffers, but with the current schema I can't think of a good use case for this either. :slightly_smiling_face:

  • It seems that non-list attributes are simply left unchanged. If so, it sounds more like a extend that works on corresponding list attributes between two objects.
  • Also, I am not quite sure about what override=False do and what the potential use cases are.

Actually, the way it works now is that None values for optional non-list attributes get overridden by the other object. With override=True, all of them get overridden. But yeah, in reality this is probably not that useful.

  • I don't think we should sort primitive types. In fact, the only list attributes that contains primitive type is Music.downbeats, which is not well supported at this point and we could probably figure out a better way to handle this.

OK. The reason I cared about sorting numbers and strings was because of removing duplicates, but this will be done without sorting anyway, so I agree.


So since it looks like the only thing merge should do is extend all the list attributes (no sorting or removing duplicate items), maybe I should rename it to extend. And if a list is passed to it, it would call append on all items (in analogy with lists). And the same thing for the + and += operators. What do you think?

cifkao commented 3 years ago

Should this still (optionally) work recursively? E.g. when merging two Musics, should the metadata.creators also be merged?

salu133445 commented 3 years ago

Well. What if two music with different resolutions are to merge? Also, extending the time signatures doesn't sounds right to me. I feel like this only make sense for tracks within the same song, isn't it?

On the other hand, we should probably implement the method Base.extend similar to Base.append that automatically finds out the corresponding list to extend/append. Basically, what we are doing is to make Track behave like a list, where the actual operations apply directly to its list attributes.

Now, the key difference here will be the integrated supports for sort and remove_duplicates (which could also be done separately), and some copy/deepcopy issues.

cifkao commented 3 years ago

Well. What if two music with different resolutions are to merge? Also, extending the time signatures doesn't sounds right to me. I feel like this only make sense for tracks within the same song, isn't it?

Not necessarily. Let me explain the applications I have in mind.

First (most importantly), merging different tracks of the same song as you said. I imagine track1 + track2 or track1 += track2 should work (all the list attributes of track1 will be extended). And in the same vein, track1.extend(track2) should work the same way.

Second, merging two different music files. Let's say I have a music1 with a melody, tempos and time signatures, and I feed this to a 'harmonizer' algorithm that outputs a music2 with chords and a bass track (and nothing else). Now I want to merge these two, so I do music1 + music2 and get an object that has everything. Here it's my responsibility to make sure what I'm doing makes sense, i.e. I'm not combining incompatible or duplicate information.

Maybe the second example is a bit artificial. But in both cases it's a matter of convenience (not having to go through all the attributes and extend them one by one), so I thought it would be nice to have a general mechanism that can do this for any object.

So the logic would be:

And for simplicity, it's probably best not to care about sorting or duplicates in either case and let the user do that themselves afterwards.

cifkao commented 3 years ago

Another application which is definitely not artificial is concatenating pieces of music. We can do this by time-shifting one of them and then merging it into the other one. In this case both of them can contain time signatures.

This is what note-seq does (using the MergeFrom method of protobuffers): https://github.com/magenta/note-seq/blob/484e854b76fb90328a8a42f78014ac3df9a92f50/note_seq/sequences_lib.py#L470

salu133445 commented 3 years ago

I see your point.

I would prefer to have this implemented in another method (maybe fill_missing or update_none?) as it's quite different from the concept of extend. We could always treat a class as a dictionary and implement an update method, which would be similar to self.__dict__.update(other.__dict__), but I don't know if this would be useful in practice. The tricky thing here is always whether to copy or deepcopy though.

Concating would be useful. Time shifting can be done by Base.adjust_time(lambda time: time + shift), so concating can be done by x.extend(y.adjust_time(lambda: time: time + x.get_end_time())). We could make shift a separate method.

I think we could simply leave non-list attributes unchanged, and implement ComplexBase.extend with ComplexBase input by extending each corresponding list. This would be easier to understand and document.

Sorry for having lots of questions on this. I once planned to implement something similar, but it's quite tricky to make it straightforward yet useful. Thank you for working on this!

cifkao commented 3 years ago

I think we could simply leave non-list attributes unchanged, and implement ComplexBase.extend with ComplexBase input by extending each corresponding list. This would be easier to understand and document.

I agree.

Sorry for having lots of questions on this. I once planned to implement something similar, but it's quite tricky to make it straightforward yet useful. Thank you for working on this!

Sure. Maybe I should have made my intended use cases clearer from the beginning. :slightly_smiling_face:

cifkao commented 3 years ago

I think this is ready. The copying is indeed tricky so let me know how you would like it to work.

salu133445 commented 3 years ago

I made the following changes.

salu133445 commented 3 years ago

Well. It's not technically a view for sure. There's no other way to prevent creating a new list here as the items are OrderedDict now. Other than the OrderedDict objects themselves (and the lists that store them), the underlying data is the same. Let's put it this way - reduce the overhead as much as possible by default and provide a deepcopy arugument for supporting other usage.

Ondřej Cífka notifications@github.com 於 2020年12月31日 週四 下午8:03 寫道:

@cifkao commented on this pull request.

In muspy/base.py https://github.com/salu133445/muspy/pull/33#discussion_r550469002:

  • if attr in self._list_attributes:
  • List of non-MusPy objects

  • ordered_dict[attr] = deepcopy(value) if copy else list(value) else:
  • ordered_dict[attr] = value
  • Single non-MusPy object

  • ordered_dict[attr] = deepcopy(value) if copy else value

Here https://github.com/salu133445/muspy/blob/7fba527085140c2b07e22df182595bd6097bb4ae/muspy/base.py#L167-L169, a new list is created for each list attribute whose type is a subclass of Base. This new list is by no means a view of the original list; also all the newly created OrderedDicts are also not views of the original objects. Unless I am missing something.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/salu133445/muspy/pull/33#discussion_r550469002, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFK27V3KNF6HNBU477MOOJLSXRSALANCNFSM4VKHNKDA .