TuringDataStories: An open community creating “Data Stories”: A mix of open data, code, narrative 💬, visuals 📊📈 and knowledge 🧠 to help understand the world around us.
Ethical guideline
Ideally, a Turing Data Story has these properties and follows the Five Safes framework.
[ ] The analysis you produce is openly available and reproducible.
[ ] Any data used are open and have an explicit licence, provenance and attribution.
[ ] Any data used are not personal data (i.e. the data is anonymous or anonymised).
[ ] Any linkage of datasets in your data story does not lead to an increased risk of the personal identification of individuals.
[ ] The Story must be truthful and clear about any limitations of analysis (and potential biases in data).
[ ] The Story will not lead to negative social outcomes, such as (but not limited to) increasing discrimination or injustice.
Story description
TfL publishes usage data for Boris Bikes: https://cycling.data.tfl.gov.uk/ It’s really detailed, to the point where I wonder whether it might have privacy issues similar to those of NYC’s infamous taxi data: it shows each individual journey, in the format “bike number X left bike station Y at time Z and arrived at station V at time W”.
I've previously done an analysis where I tried to forecast how many bikes would be leaving/arriving at a given station, based on historical trends and weather data: http://nbviewer.org/github/mhauru/boris-bike-forecast/blob/master/analysis.ipynb We could base the story on that or take an entirely different angle. There's plenty you could do with the data given how detailed it is: most popular routes, popularity of routes as a function of time, the effect of events or infrastructure changes on Boris Bike usage, all sorts of map visualisations, etc. Proposals welcome.
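To give a flavour of the simpler angles (most popular routes, departures from a station by time of day), here is a rough Python sketch. It is not taken from the notebook linked above. The `Journey` field names and the sample records are made up for illustration, and the real TfL files may well use different column names and formats.

```python
from collections import Counter
from datetime import datetime
from typing import NamedTuple


class Journey(NamedTuple):
    # Hypothetical field names; the real TfL column names may differ.
    bike_id: int
    start_station: str
    start_time: datetime
    end_station: str
    end_time: datetime


def most_popular_routes(journeys, n=3):
    """Return the n most common (start, end) station pairs with their counts."""
    routes = Counter((j.start_station, j.end_station) for j in journeys)
    return routes.most_common(n)


def hourly_departures(journeys, station):
    """Count departures from one station, keyed by hour of day (0-23)."""
    return Counter(
        j.start_time.hour for j in journeys if j.start_station == station
    )


# Made-up sample records, for illustration only.
SAMPLE = [
    Journey(1, "A", datetime(2020, 1, 6, 8, 5), "B", datetime(2020, 1, 6, 8, 20)),
    Journey(2, "A", datetime(2020, 1, 6, 8, 40), "B", datetime(2020, 1, 6, 9, 0)),
    Journey(1, "B", datetime(2020, 1, 6, 17, 30), "A", datetime(2020, 1, 6, 17, 50)),
    Journey(3, "A", datetime(2020, 1, 6, 9, 15), "C", datetime(2020, 1, 6, 9, 45)),
]

if __name__ == "__main__":
    print(most_popular_routes(SAMPLE, n=2))
    print(hourly_departures(SAMPLE, "A"))
```

The same per-station hourly counts could serve as a starting point for a forecasting angle, though a real analysis would need to parse the published files and handle missing or malformed records.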
Current status
Updates