ELVIS-Project / vis-framework

Thoroughly modern symbolic musical data analysis suite.
http://elvisproject.ca/

Pandas documentation #399

Closed: musicus closed this issue 8 years ago

musicus commented 8 years ago

Provide additional pandas documentation covering music theory/musicology use cases.

minamouse commented 8 years ago

To be added:

What am I missing?

minamouse commented 8 years ago

Thanks! @alexandermorgan do you have any more suggestions to add?

alexandermorgan commented 8 years ago

Yes, I would point out that .value_counts() only works on a series. If you want to flatten one of our multi-indexed dataframes into a series and then count those events, you can do this:

    noterestDF.stack().stack().value_counts()

To just get the index of a dataframe:

    df.index

Just the columns (though this one is a little more complex given our multi-index):

    df.columns

If you want to do index-based selection of a column in a dataframe, you can do this:

    df.iloc[:, column_number]

Label-based is a little trickier given our multi-indexed columns, but it still works if you pass a tuple with the label for each level of the multi-index:

    df.loc[:, ('noterest.NoteRestIndexer', '2')]

Getting a row is the same syntax:

    df.iloc[index_number_of_row, :]

Slicing also works in the regular python way:

    df.iloc[3:, :]

Figuring out all the places a certain value is found is also valuable, but I can't think of the "where" syntax at the moment. It's something like numpy.where...

One of the most useful things is how to filter. To get the notes of a piece without the rests:

    just_notes = noterestDF[noterestDF != 'Rest']

I think that's enough to get people going.
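For anyone who wants to try the calls above without loading a score, here is a minimal sketch. The small dataframe is a hypothetical stand-in for a VIS noterest result (the part labels, offsets, and note names are made up), and the numpy.where line is only one possible reading of the "where" syntax mentioned above, not necessarily the one intended.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a VIS noterest dataframe: two parts, four offsets.
columns = pd.MultiIndex.from_product(
    [["noterest.NoteRestIndexer"], ["0", "1"]],
    names=["Indexer", "Parts"],
)
noterestDF = pd.DataFrame(
    [["C4", "E4"], ["D4", "Rest"], ["Rest", "G4"], ["C4", "E4"]],
    index=[0.0, 1.0, 2.0, 3.0],
    columns=columns,
)

# Flatten both column levels into a single series, then count each event.
counts = noterestDF.stack().stack().value_counts()

# Index-based and label-based selection of the same column.
by_position = noterestDF.iloc[:, 1]
by_label = noterestDF.loc[:, ("noterest.NoteRestIndexer", "1")]

# Slice rows by position: everything from the third row onward.
tail = noterestDF.iloc[2:, :]

# Filter out rests: cells equal to 'Rest' become NaN, the shape is unchanged.
just_notes = noterestDF[noterestDF != "Rest"]

# One way to locate a value: row and column positions of every 'Rest'.
rest_rows, rest_cols = np.where((noterestDF == "Rest").to_numpy())
```

Note that the filter keeps the dataframe's shape and puts NaN where the rests were, so a follow-up such as just_notes.dropna(how='all') may be needed if fully empty rows should disappear.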

alexandermorgan commented 8 years ago

Also, if you're going to distinguish between applymap and map, you should probably explain apply too, and do so in the context of both series and dataframes, because I'm pretty sure they behave differently (applymap is only for dataframes). But those are pretty complex functions; I'm not sure you would want to venture into all that for a little intro to pandas.
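A rough sketch of how the three differ, in case it helps decide what belongs in the intro. The data and the strip_octave helper are hypothetical, made up for illustration. One thing to be aware of: pandas 2.1 renamed DataFrame.applymap to DataFrame.map, so the sketch uses whichever of the two the installed version provides.

```python
import pandas as pd

# Hypothetical data: note names for two parts.
notes = pd.DataFrame({"0": ["C4", "D4"], "1": ["E4", "Rest"]})


def strip_octave(event):
    """Drop the octave digit from a note name such as 'C4'; leave rests alone."""
    return event if event == "Rest" else event[:-1]


# Series.map: the function is called once per element of one column.
part0 = notes["0"].map(strip_octave)

# DataFrame.apply: the function receives a whole column (a series) at a time.
rest_counts = notes.apply(lambda column: (column == "Rest").sum())

# Element-wise over the whole dataframe. Older pandas calls this applymap;
# pandas 2.1 and later call it DataFrame.map, so pick whichever exists.
elementwise = getattr(notes, "map", None) or notes.applymap
pitch_classes = elementwise(strip_octave)
```

So map and applymap both work cell by cell (on a series and a dataframe respectively), while apply on a dataframe hands over a full column or row, which is why it suits summaries such as counting rests per part.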

minamouse commented 8 years ago

What's the benefit of using .value_counts() over the frequency experiment that we have in VIS? Don't they do the same thing?

alexandermorgan commented 8 years ago

For anyone who's looking, Marina's pandas documentation is here.

musicnerd commented 8 years ago

Hi guys, as someone working my way through the pandas documentation I had a few comments. There are 2 broken links at the top of the page: "One step beyond" (which in the text body is actually called "one step further") and the one underneath that, "Advanced pandas", which doesn't actually seem to have made it into the body at all, so perhaps it needs to be deleted? Also, I can see you guys talking about subjects such as .value_counts() but I don't see that anywhere in the documentation. Am I missing something?

minamouse commented 8 years ago

You can go ahead and change the broken links. I might have changed some of the organization of the text along the way and forgotten to update the index at the beginning. Value counts is something that we do in VIS with the frequency experimenter, so I left it out of the pandas documentation here. Having an experimenter do something that a pandas function can do might be a separate issue though.

musicnerd commented 8 years ago

Marina: In the section of the wiki under "concatenating" you paste the line of code that you need to cat the two dataframes together. But then, you show two dataframes (notes and intervals) aligned by the same index (or "pasted" together) but you don't show the line of code used to generate that. Could you either add it or just reply here and I can add it? That seems useful.

minamouse commented 8 years ago

Sorry about that @musicnerd. The code is almost exactly the same: pandas.concat([df1, df2], axis=1)
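A small self-contained sketch of that call, assuming two made-up dataframes (the column names, offsets, and values here are hypothetical, not taken from the wiki). With axis=1 the frames are aligned on their shared index and placed side by side; without it they are stacked one under the other.

```python
import pandas as pd

# Hypothetical note and interval results, both indexed by offset.
notes = pd.DataFrame({"notes": ["C4", "D4", "E4"]}, index=[0.0, 1.0, 2.0])
intervals = pd.DataFrame({"intervals": ["M2", "M2"]}, index=[0.0, 1.0])

# axis=1: align on the index and put the columns next to each other.
side_by_side = pd.concat([notes, intervals], axis=1)

# Default axis=0: append the rows of the second frame under the first.
stacked = pd.concat([notes, intervals])
```

Offsets that exist in only one of the frames (2.0 in this sketch) are kept, with NaN filling the missing cell.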