jupyterhub / binder-data

A place to store data for Binder

Basic tools for parsing GA data #14

Closed betatim closed 5 months ago

betatim commented 6 years ago

This is the code I used to produce the data mentioned in #12

betatim commented 6 years ago

If people are happy with the level of summarisation done I will add the actual data files to this PR before merging.

choldgraf commented 6 years ago

Cool! FWIW, I've been using this repo just for data and the scripts that update it, and was planning to use other repositories for analyzing, visualizing, etc. E.g. the billing data lives here at billing/, but all the analysis scripts live at https://github.com/jupyterhub/binder-billing. Do you think this is too convoluted? I am 50/50.

betatim commented 6 years ago

Yes, these notebooks only munge the data from its raw form into the "public" version.

choldgraf commented 6 years ago

"Yes" as in you think it's too convoluted?

betatim commented 6 years ago

No. I like the split between data preprocessing and doing stuff with it.

Now I am confused though. I read your original comment as "this PR contains stuff that isn't just data preprocessing and should be in a different repo" which is why I replied "this is just preprocessing".

The notebooks in this PR preprocess the data. The daily visits one also makes a quick plot of the data to verify the munging worked before writing it to a CSV.
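
For readers who want a picture of what that kind of munging step looks like, here is a minimal sketch. It is not the notebook code from this PR. The `timestamp` and `page` columns, the sample rows, and the function names are all made up for illustration. It also uses only the standard library, so the "quick plot" is replaced by a simple count check before the CSV is written.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def daily_visits(raw_rows):
    """Collapse per-visit rows into (date, visits) pairs sorted by date."""
    counts = Counter()
    for row in raw_rows:
        # Hypothetical raw column: one ISO 8601 timestamp per visit.
        day = datetime.fromisoformat(row["timestamp"]).date().isoformat()
        counts[day] += 1
    return sorted(counts.items())


def write_public_csv(summary, handle):
    """Write only the summarised columns (date and count) to the "public" file."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["date", "visits"])
    writer.writerows(summary)


# Made-up sample rows standing in for a raw export.
raw = [
    {"timestamp": "2018-03-01T09:15:00", "page": "/"},
    {"timestamp": "2018-03-01T17:40:00", "page": "/v2/gh/example/repo"},
    {"timestamp": "2018-03-02T08:05:00", "page": "/"},
]

summary = daily_visits(raw)

# Cheap sanity check before writing: no visits were lost during munging.
assert sum(visits for _, visits in summary) == len(raw)

buffer = io.StringIO()
write_public_csv(summary, buffer)
print(buffer.getvalue())
```

In a notebook the sanity check would more naturally be a plot of `summary`, and `buffer` would be a real file opened for writing.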

yuvipanda commented 5 months ago

5 years late is better than never <3