ELVIS-Project / vis-framework

Thoroughly modern symbolic musical data analysis suite.
http://elvisproject.ca/

Redo settings of indexers and experiments; make resulting DataFrames more 'complete' #375

Closed: mrbannon closed this issue 7 years ago

mrbannon commented 8 years ago

Let me explain:

In order for the N-Gram Indexer to function properly, the result from the Horizontal Interval Indexer (HII) must be generated with the setting "simple" and not "compound". What do these settings mean, though?

simple/compound - use compound intervals instead of simple

Should this really be a setting? No, because the simple intervals can be computed from the compound ones. So, the resulting DataFrame should be able to provide simple intervals without having to be told so upon creation.
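To illustrate why simple intervals never need their own setting, here is a minimal sketch of the reduction. It assumes interval labels are strings such as "M10" or "-P12" (optional sign, optional quality letters, then the interval number) and that octave multiples reduce to 8 rather than 1; both are assumptions made for this example, and `to_simple` is a hypothetical helper, not a VIS function.

```python
import re

# Assumed label shape: optional "-", optional quality letters, then digits.
_INTERVAL = re.compile(r"^(-?)([A-Za-z]*)(\d+)$")


def to_simple(label):
    """Reduce a compound interval label (e.g. 'M10') to its simple form ('M3').

    Hypothetical helper: labels that do not look like intervals
    (e.g. 'Rest') are returned unchanged.
    """
    match = _INTERVAL.match(label)
    if match is None:
        return label
    sign, quality, number = match.group(1), match.group(2), int(match.group(3))
    if number > 8:
        # Fold the interval into one octave; quality is unaffected.
        number = (number - 1) % 7 + 1
        if number == 1:
            # Convention chosen here: octave multiples become 8, not 1.
            number = 8
    return sign + quality + str(number)


compound = ["M10", "-P12", "P15", "m3", "Rest"]
simple = [to_simple(label) for label in compound]
```

The reverse direction is not possible: once "M10" has been stored as "M3", the octave information is gone, which is why the compound form is the more complete one to keep.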

The results generated by the indexers should be complete regardless of the settings.

musicus commented 8 years ago

You bring up a good point.

alexandermorgan commented 8 years ago

An alternative to using compound intervals would be to have a dataframe of music21 interval objects that could then be queried in any specific way desired. This seems like the base representation of intervals to me.

alexandermorgan commented 8 years ago

The salami-slicing-indexed compound intervals with quality could be the version of intervals that we store as an attribute of an indexed piece. Then when we use get methods to get the intervals, they could get these compound intervals and convert them to whatever type the user wants. I'm close to having this on alex_devel.
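A rough sketch of that idea, not the code on alex_devel: the piece keeps only compound intervals with quality, and a get method derives the other forms on request. The class name `PieceSketch`, the method `get_intervals`, its `simple` and `quality` parameters, and the string label format are all hypothetical choices for this example.

```python
import re

# Assumed label shape: optional "-", optional quality letters, then digits.
_INTERVAL = re.compile(r"^(-?)([A-Za-z]*)(\d+)$")


def _convert(label, simple, quality):
    """Derive the requested form from a compound-with-quality label."""
    match = _INTERVAL.match(label)
    if match is None:
        return label  # e.g. 'Rest' passes through untouched
    sign, qual, number = match.group(1), match.group(2), int(match.group(3))
    if simple and number > 8:
        number = (number - 1) % 7 + 1
        if number == 1:
            number = 8  # convention chosen here: keep octaves as 8
    return sign + (qual if quality else "") + str(number)


class PieceSketch(object):
    """Hypothetical stand-in for an indexed piece."""

    def __init__(self, compound_by_part):
        # Base representation, stored once: compound intervals with quality.
        self._compound = dict(compound_by_part)

    def get_intervals(self, part, simple=False, quality=True):
        """Return the stored intervals for one part in the requested form."""
        return [_convert(label, simple, quality) for label in self._compound[part]]


piece = PieceSketch({"soprano": ["M10", "-m3", "P15", "Rest"]})
as_stored = piece.get_intervals("soprano")
simple_no_quality = piece.get_intervals("soprano", simple=True, quality=False)
```

With this layout the settings become arguments to the getter rather than to the indexer, so nothing has to be re-indexed when a user wants a different interval type.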

crantila commented 8 years ago

This points toward what Jamie and I (mostly Jamie) thought an "ideal case" for data management would look like, where indexed data are stored in a SQL or SQL-like database and queried as required. I wonder if that larger change would be worth pursuing for VIS 4? It could solve several problems at once.

alexandermorgan commented 8 years ago

I think it's important to note that your original statement that the n-gram indexer requires simple horizontal intervals is actually not true. The n-gram indexer will use horizontal-interval observations of any kind, even nonsensical ones, as long as they're strings, ints, or floats. Also, considerable progress on this issue has been made in alex_devel.
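As a simplified illustration of that point (not the actual VIS n-gram indexer, which also combines vertical and horizontal events), an n-gram builder only needs observations that can be turned into strings. The function name `ngrams` and its `sep` parameter are hypothetical.

```python
def ngrams(observations, n, sep=" "):
    """Join every run of n consecutive observations into one n-gram string.

    The observations may be strings, ints, or floats; nothing here
    checks that they are musically meaningful intervals.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    tokens = [str(obs) for obs in observations]
    return [sep.join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Mixed compound, simple, numeric, and nonsensical observations all work.
mixed = ["M10", -2, 1.5, "foo"]
bigrams = ngrams(mixed, 2)
```

Whether the input is simple or compound therefore changes which n-grams come out, but not whether the indexer runs.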

alexandermorgan commented 7 years ago

This has been implemented to the greatest extent possible in VIS 3 (already present in alex_devel). It is not possible for the n-gram indexer (Ryan's original example), because that indexer can produce infinitely many different results, and its input can be the results of any indexer or any combination of indexers. However, the n-gram indexer (or rather the new_ngram indexer, soon to change names) is so heavily optimized that caching its results is not really necessary. Just coming up with, and sticking to, a naming convention for retrieving information from such a cache would be a daunting task.