datamade / data-analysis-guidelines

📒 Analyzing Data, the DataMade Way
MIT License

document matplotlib patterns in appendix #13

Closed hancush closed 4 years ago

hancush commented 6 years ago

our current forays into plotting with python have been functional, but slightly disorganized. let's assess our current tooling (some pandas.DataFrame.plot shorthand, some use of the matplotlib.pyplot api), as well as some alternatives, including but not limited to: altair, ggplot (for python), "object-oriented" matplotlib, and seaborn.

once we've done our assessment, let's pick the one we like best, and commit that to the toolkit canon for all.

hancush commented 6 years ago

hc notes

here's a nice breakdown of some popular options for charting in python.

i think the reason that our current approach to charting is so ugly has a lot to do with using the shorthand plotting api from pandas rather than the matplotlib (mpl) api itself. moving forward, i think i would be comfortable saying we will not use pandas shorthand (or will limit use of said shorthand). it hides some very important things about mpl (namely, that there's an underlying structure of figures and axes), which impedes understanding, and it too often behaves counter to expectations.
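to make the contrast concrete, here's a minimal sketch of the same line chart drawn both ways. the dataframe, column names, and labels are hypothetical, made up for illustration; this is not code from our projects, just one way the comparison could look.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# hypothetical data, for illustration only
df = pd.DataFrame({"year": [2015, 2016, 2017, 2018], "count": [12, 18, 9, 21]})

# pandas shorthand: a figure and axes are created implicitly,
# and the matplotlib Axes comes back as the return value
shorthand_ax = df.plot(x="year", y="count")

# explicit matplotlib: the structure is visible (Figure -> Axes -> artists)
fig, ax = plt.subplots(figsize=(6, 4))
(line,) = ax.plot(df["year"], df["count"], marker="o", label="count")
ax.set_xlabel("year")
ax.set_ylabel("count")
ax.set_title("hypothetical counts by year")
ax.set_xticks(df["year"].tolist())  # one tick per year instead of fractional ticks
ax.legend()
fig.tight_layout()
```

in the second version every change goes through `fig` or `ax`, so it's clear which object is being modified. the shorthand still hands back an Axes, so the two styles can be mixed, but the figure it belongs to is created behind the scenes.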

i'm going to spend some time with the mpl documentation.

if we're very unhappy with mpl, i'd cast my lot for seaborn. it provides a higher-level api for common statistical charts, and the output looks a lot nicer, too. it's also built on top of mpl, so the syntax is simplified but not wholly different, which, to me, makes it seem easier to learn.
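as a rough illustration of what "built on top of mpl" means in practice, here's a small sketch in which seaborn draws onto an Axes we created ourselves, and the usual mpl methods handle the rest. the data and labels are hypothetical, and it assumes seaborn is installed alongside pandas and matplotlib.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# hypothetical data, for illustration only
df = pd.DataFrame({"neighborhood": ["a", "b", "c"], "count": [12, 18, 9]})

# create the figure and axes with plain matplotlib...
fig, ax = plt.subplots(figsize=(6, 4))

# ...let seaborn draw the bars onto that axes...
returned = sns.barplot(data=df, x="neighborhood", y="count", ax=ax)

# ...then keep customizing with ordinary matplotlib calls
ax.set_title("hypothetical counts by neighborhood")
ax.set_ylabel("count")
fig.tight_layout()
```

`sns.barplot` returns the same Axes it was given, so anything learned about mpl carries over.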

regardless of the tooling we choose, i'd like to make a conscious effort to actually learn what i'm doing, rather than scouring stackoverflow for hacks.

fgregg commented 6 years ago

From this person on twitter: https://twitter.com/ktrst/status/985968491852369921

hancush commented 6 years ago

we've decided we like matplotlib. to wrap this in a bow, i'd like to add an appendix with charting patterns (and anti-patterns) to this repo.