rfaulkner / wikipedia_user_metrics

Wikimedia Foundation E3 Team Analysis Code

API. Time series and group_by methods #65

Closed dartar closed 11 years ago

dartar commented 11 years ago

Run a daily cronjob to generate and refresh time series for the following use cases:

1. time series grouped by registration date

A. Take a union of cohorts (e.g. all users entering the ReturnTo funnel from multiple GettingStarted iterations), group users by registration date, select a metric (e.g. threshold, with default parameters) and return an aggregate response for each date with the appropriate aggregator method (e.g. proportion).

B. Same as above, using a registration date group_by method, but instead of running the metric on a cohort or set of cohorts, apply it to all users in a project via the `all` magic keyword.
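Use case A can be sketched roughly as follows. This is a minimal illustration, not the project's actual API: the function names, the input shape (a dict mapping user ID to a registration date and an edit count), and the simplified `threshold` stand-in are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def threshold(edit_count, n=1):
    """Hypothetical stand-in for the threshold metric: at least n edits."""
    return edit_count >= n


def proportion_by_registration_date(cohorts, n=1):
    """Return {registration_date: proportion of users passing the threshold}.

    cohorts: iterable of dicts mapping user_id -> (registration_date, edit_count).
    This input shape is an assumption made for the sketch.
    """
    # Union of cohorts: a user who appears in several cohorts is counted once.
    users = {}
    for cohort in cohorts:
        users.update(cohort)

    # Group by registration date, tallying [users passing, users total].
    tallies = defaultdict(lambda: [0, 0])
    for reg_date, edit_count in users.values():
        if threshold(edit_count, n):
            tallies[reg_date][0] += 1
        tallies[reg_date][1] += 1

    # Aggregate each group with the proportion aggregator.
    return {d: passed / total for d, (passed, total) in sorted(tallies.items())}


cohort_a = {1: (date(2013, 3, 1), 0), 2: (date(2013, 3, 1), 3)}
cohort_b = {2: (date(2013, 3, 1), 3), 3: (date(2013, 3, 2), 1)}
result = proportion_by_registration_date([cohort_a, cohort_b])
```

Under the same assumptions, use case B would amount to passing a single cohort that contains every user in the project.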

2. time series grouped by activity period

C. Generate project-level metrics computed from all users using `all`, binned by activity period. E.g. 5+ ns0 active editors: run the threshold metric on a specified date range, plot the proportion, and iterate through all date ranges.
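Use case C might look something like the sketch below. The function name, the fixed-length periods, and the input shape (a list of `(user_id, edit_date)` pairs for namespace-0 edits) are assumptions for illustration only and do not come from the project's code.

```python
from datetime import date, timedelta


def active_editor_series(edits, all_users, start, end, period_days=30, n=5):
    """Return [(period_start, proportion of users with >= n edits in the period)].

    edits: list of (user_id, edit_date) pairs (hypothetical input shape).
    all_users: collection of every user ID in the project.
    Periods are half-open intervals [period_start, period_end).
    """
    series = []
    step = timedelta(days=period_days)
    period_start = start
    while period_start < end:
        period_end = min(period_start + step, end)

        # Count each user's edits that fall inside the current period.
        counts = {}
        for user_id, edit_date in edits:
            if period_start <= edit_date < period_end:
                counts[user_id] = counts.get(user_id, 0) + 1

        # Threshold metric, then proportion over all users.
        active = sum(1 for c in counts.values() if c >= n)
        proportion = active / len(all_users) if all_users else 0.0
        series.append((period_start, proportion))

        period_start = period_end
    return series


edits = [
    (1, date(2013, 1, 1)), (1, date(2013, 1, 1)), (2, date(2013, 1, 1)),
    (1, date(2013, 1, 2)), (1, date(2013, 1, 2)),
    (2, date(2013, 1, 2)), (2, date(2013, 1, 2)),
]
series = active_editor_series(
    edits, {1, 2}, date(2013, 1, 1), date(2013, 1, 3), period_days=1, n=2
)
```

The resulting list of `(period_start, proportion)` pairs is the series that a daily job could regenerate and hand to a plotting step.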

dartar commented 11 years ago

Depends on support for a basic API client (#60).

rfaulkner commented 11 years ago

#60 is complete.

https://github.com/rfaulkner/umapi_client

rfaulkner commented 11 years ago

https://github.com/rfaulkner/umapi_client/commit/fa924b6049e13222e56e960167a8f9ec2340ea5e

crontab on stat1:

00 0 * * * /home/rfaulk/call_umapi_cli.sh 2>&1 > /dev/null