ELVIS-Project / vis-framework

Thoroughly modern symbolic musical data analysis suite.
http://elvisproject.ca/
31 stars 6 forks source link

Remove capability for recursion in IndexedPiece.get_data() #416

Closed alexandermorgan closed 7 years ago

alexandermorgan commented 7 years ago

get_data() currently supports recursive calls which allows it to queue up multiple indexer runs in one command. When more than one indexer is included in the call, it necessarily feeds the result of the nth indexer to the call to the n+1th indexer. Here's the recursive block from the code:

        if len(analyzer_cls) > 1:
            if analyzer_cls[0] is noterest.NoteRestIndexer:
                return self.get_data(analyzer_cls[1:], settings, data)
            return self.get_data(analyzer_cls[1:], settings, analyzer_cls[0](data, settings).run())

There are two main issues with this:

  1. Only the results of the last indexer are returned
  2. You can only feed the results of one indexer to the next indexer in the call

This second point is the most problematic because most indexers that require the results of other indexers require the results of several indexers. This was probably originally meant for interval indexing, but now the noterest indexer results are internally called by the _get_vertical_interval() and _get_horizontal_interval() methods. The only indexer that could make use of this (unless I'm mistaken) is the ngram indexer when creating ngrams based only on one input, such as durations and this is not the main use case for the ngram indexer. I think we should just limit this method to one indexer at a time which will make it easier to use, and easier to understand. Another related issue is that get_data() can currently make repeated calls to _import_score(), even though this never actually results in a score being imported twice because of _import_score()'s internal check. If we're going to change this it should be done for VIS 3.0.

crantila commented 7 years ago

Making me earn my salary today!

There are two main issues with this:

  1. Only the results of the last indexer are returned
  2. You can only feed the results of one indexer to the next indexer in the call

Neither of these would be problems if the analyzers followed the intended behaviour, each returning a single DataFrame with cumulative results from all previous analyzers. It was originally meant for everything to be done in a single call to get_data(), separating the three phases of music analysis: prepare, calculate, and output. Note how those phases correspond to load(), run(), and output() in the WorkflowManager.

Unfortunately, I never solved the "addressed settings" issue, I didn't implement the "cumulative" part of the results, and the WorkflowManager is too complicated to modify. Nobody else got used to the "three phases" idea, so you're pushing toward (in my opinion) a clumsier but simpler architecture in VIS 3. (Sure I have my misgivings, but if a design is too complicated to even see, never mind maintain, then it obviously wasn't the best design)!

So you're right that this recursive capability should be removed. Hopefully the explanation will help you understand how VIS 3 is fundamentally shifting the Framework's design, so there are probably other capabilities to remove too. Feel free to ask for help with specific tasks.

alexandermorgan commented 7 years ago

If you're really up for working on this, would you be able to address this issue? You would have to take alex_devel as your starting point and push to develop when you're done.

crantila commented 7 years ago

What's the timeline for completion?

alexandermorgan commented 7 years ago

Time is one thing I'm short on. I'm planning on publishing VIS 3.0 sometime next week so I would need this by Wednesday Sept. 14. If you're up for this it would be greatly appreciated, but there's no pressure, especially since it would require a quick turnaround.

crantila commented 7 years ago

Looks like there are a handful of features that won't make it in but ought to. I don't expect this to be fast, so I guess it'll be one of them. Sorry!

alexandermorgan commented 7 years ago

Ok, thanks for letting me know. I'd like to get to this for VIS 3.0 because, although it's not an enormous change, it does change the way some analyses would need to be called, and therefore is not reverse compatible so should be part of a major release. If there's time next week I'll try to do this, if not it will have to wait.

alexandermorgan commented 7 years ago

Recursion has been removed but this has to stay open until I'm done testing the new setup. This turned out to be a little easier than I anticipated. The new way to call analyzers does not use lists:

ip = ImportScore('path_to_piece.xml')
ip.get_data(noterest.NoteRestIndexer)

For convenience, each of the two following lines is equivalent to the last line above (NB: they use strings):

ip.get_data('noterest.NoteRestIndexer')
ip.get_data('noterest')

Two-letter strings (the first two letters of each analyzer) were also implemented but were rejected at the ELVIS meeting yesterday for fear of collisions.

alexandermorgan commented 7 years ago

In the interest of coherence, this has to be done in aggregated_pieces too.

alexandermorgan commented 7 years ago

@crantila quick question, in the documentation I'd like to provide a list of experimenters that can be passed as the "aggregated_experiments" argument of the aggregated_piece.get_data() method. There are six experimenters in VIS 3.0 (NB: dendrogram is out):

  1. aggregator.ColumnAggregator
  2. barchart.RBarChart
  3. frequency.FrequencyExperimenter
  4. lilypond.AnnotateTheNoteExperimenter
  5. lilypond.LilyPondExperimenter
  6. lilypond.PartNotesExperimenter

Is it correct to say that experimenters 1 and 2 in the list above are the only valid ones for this argument?

crantila commented 7 years ago

Is it correct to say that experimenters 1 and 2 in the list above are the only valid ones for this argument?

The FrequencyExperimenter might be useful as an aggregated experiment too, though probably not often. The LilyPond ones shouldn't be aggregated. In fact, AnnotateTheNoteExperimenter looks like it should actually be an Indexer. Oops!

You can also change the parameter names in that method... it's a little unweildy.

And since I'm already there: please ask @mborsodi to add her name with a new copyright statement to aggregated_pieces.py!

alexandermorgan commented 7 years ago

Now, depending on whether you call the Frequency or ColumnAggregator experiments via the ind_analyzer or combined_experimenter arguments, you get a list of multiple dataframes or a single dataframe respectively. Still needs some new documentation examples before VIS 3.0 gets published, but I'm going ahead and closing this.