alan-turing-institute/TuringDataStories

Our stories are published online using Quarto and GitHub Pages: you can check them out here. Want to get involved? Click here.

Our vision

Our aim is to help people understand the data-driven world around us. We want to inspire an open community around a central platform, one that encourages us all to harness the potential of open data by creating 'data stories'. These 'data stories' will mix computer code, narrative, visuals and real-world data to document an insightful result. They should relate to society in a way that people care about, and be educational. They must maintain a high standard of openness and reproducibility and be approved by the community in a peer review process. The stories will develop data literacy and critical thinking in the general readership.

What is a Turing Data Story?

A Turing Data Story is an interactive mix of narrative, code, and visuals that derives insight from real-world open data. Turing Data Stories are written as pedagogic Jupyter notebooks that aim to spark curiosity and motivate more people to play with data.

We expect the notebook of a data story to take the reader through each step of the analysis that produced its results. Turing Data Stories should follow these principles:

The story should be told in a pedagogical way, describing both the context of the story and the methods used in the analysis.
The analysis must be fully reproducible (others should be able to run the notebooks using a defined computer environment).
The results should be transparent: all data sources must be correctly referenced and included.
In order to maintain the quality of the results, the Turing Data Story should be peer-reviewed by other contributors before it is published.

We don't expect sophisticated analyses, just interesting stories told with data. If you have an idea for a Turing Data Story you want to develop, please follow our contributing guidelines to make sure your contributions can be easily integrated into the project.

Contributing

This repository is always a work in progress and everyone is encouraged to help us build something that will be useful to the many.

How can I get involved?

Story ideas: Have an idea for an interesting story that could be told if you had the data, or knew how to analyse it? We can help.
Data: Stumbled across an interesting dataset, or perhaps mashed together several sources of data yourself? We want to hear about it.
Code: Are you an expert in Bayesian analysis? Do you have sick matplotlib skills? Put that knowledge to work!
Peer Review: Know a bit about data analysis? Good at communicating that knowledge? Interested in learning how it can be applied to understanding society? We need reviews to make sure our stories are the best they can be.
Communication: Are you an amazing writer? Help us with the storytelling side of our stories.
Community: Don't fit into any of the above categories, but still want to hang out and be involved? We've got you: drop us a line.

The process for proposing and reviewing a story can be found in our submission and review guidelines. All contributors are asked to follow our code of conduct and to check out our contributing guidelines for more information on how to get started.

How to Read Stories

Our stories are published online using Quarto and GitHub Pages. You can check them out here.

Alternatively, click the Binder badge at the top of this README to load an interactive version of our stories.

To build the website locally, install Quarto and run the following command from the top-level directory of this repository:

QUARTO_DENO_EXTRA_OPTIONS=--v8-flags=--stack-size=2048 quarto render

Note that Quarto does not re-run the notebooks when rendering: it uses the precomputed outputs already stored in each notebook cell.

Another option is to run the notebooks locally yourself. Some of the notebooks have a requirements.txt file inside their respective subdirectories. In that case, you can set up a virtual environment to run the notebook from inside that subdirectory using:

python -m venv tds_venv
source tds_venv/bin/activate
python -m pip install -r requirements.txt

If a story has no requirements.txt file, use the binder/environment.yml file with conda instead:

conda env create -f binder/environment.yml
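The choice between the two setups can be sketched as a small shell function. This is only an illustration: suggest_setup and the example folder name are hypothetical and not part of this repository. The function installs nothing; it only prints the commands from this README that apply to the folder it is given.

```shell
# Hypothetical helper: print the setup commands that apply to one story folder.
# It only inspects the folder; nothing is created or installed.
suggest_setup() {
    story_dir="$1"
    if [ -f "$story_dir/requirements.txt" ]; then
        # The story ships its own requirements: use a virtual environment.
        echo "python -m venv tds_venv"
        echo "source tds_venv/bin/activate"
        echo "python -m pip install -r $story_dir/requirements.txt"
    else
        # No requirements.txt: fall back to the shared conda environment.
        echo "conda env create -f binder/environment.yml"
    fi
}

# Example call with a made-up folder name (an assumption, not a real story):
suggest_setup "stories/2020-01-01-Example-Story"
```

Run it from the top-level directory of the repository so that the relative paths resolve.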

Any problems, open an issue!

Adding a new story

Under the stories directory, create a new folder with the name YYYY-MM-DD-<Title> and place your notebook inside it. Make sure you have already run all the cells in your notebook. Add a preview.png containing the figure you want Quarto to show as the story's preview. That's all!

If your notebook is not ready to be published to the web, you can prefix the folder name with an underscore: Quarto will then ignore it.
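As a rough sketch of these steps, the commands below create a draft story folder that follows the naming convention. The title and the commented-out file paths are placeholders chosen for illustration, so replace them with your own. The folder starts with an underscore so that Quarto skips it until the story is ready.

```shell
# Placeholder values (assumptions): replace with your own title.
story_title="My-Story-Title"
story_date="$(date +%Y-%m-%d)"

# Leading underscore marks the story as a draft that Quarto ignores.
story_dir="stories/_${story_date}-${story_title}"

mkdir -p "$story_dir"
echo "Created $story_dir"

# Copy in your executed notebook and the preview figure (paths are placeholders):
# cp path/to/your_notebook.ipynb "$story_dir/"
# cp path/to/your_figure.png "$story_dir/preview.png"

# When the story is ready to be published, drop the underscore:
# mv "$story_dir" "stories/${story_date}-${story_title}"
```

Run it from the top-level directory of the repository so the folder lands under stories.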

About the project

This project grew out of a desire to contribute to and advance the analysis of government COVID-19 data.

As part of this process we recognised that government reporting of COVID-19 data was not always in the most accessible format. We also recognised that, especially during these times, many individuals may be interested in developing their technical skills in an impactful way but not know where to start.

Our goal was therefore to help provide educational data science content that would guide the user through the whole process, from making the data accessible to using it for analysis.

We hope that by using the storytelling medium, we can bring people along on the data science journey and showcase how these techniques can answer both fascinating and socially relevant questions.

The team

The team is currently composed of four members:

David Beavan - GitHub: @DavidBeavan. Twitter: @DavidBeavan. Web: https://www.turing.ac.uk/people/researchers/david-beavan
Camila Rangel Smith - GitHub: @crangelsmith. Twitter: @CamilaRangelS. Web: https://www.turing.ac.uk/people/researchers/camila-rangel-smith
Sam Van Stroud - GitHub: @samvanstroud. Web: https://www.turing.ac.uk/people/enrichment-students/sam-van-stroud
Kevin Xu - GitHub: @kevinxufs

We currently meet every Wednesday afternoon.

Citing TuringDataStories

Beavan, D., C. Rangel Smith, S. Van Stroud, and K. Xu. Turing Data Stories, 2020. https://github.com/alan-turing-institute/TuringDataStories.

@misc{beavan_turing_2020,
    title = {Turing {Data} {Stories}},
    url = {https://github.com/alan-turing-institute/TuringDataStories},
    author = {Beavan, D. and Rangel Smith, C. and Van Stroud, S. and Xu, K.},
    year = {2020}
}

Get in touch

You can join our community on Slack 🏡 (turingdatastories.slack.com) by opening an issue here that includes your email address. We meet virtually on Wednesday afternoons to work collaboratively.

Contributors ✨

kevinxufs 🤔 ⚠️ 🖋 💻 📖 📆
Camila Rangel Smith 🤔 ⚠️ 🖋 💻 📖 📆
David Beavan 🤔 ⚠️ 🖋 💻 📖 📆
Sam Vs 🤔 ⚠️ 🖋 💻 📖 📆
Yo Yehudi 📖 🤔
Louise Bowler 👀
nbarlowATI 👀
Martin O'Reilly 🤔
Eric Daub 📝 💻 🤔 🖋
Jack Roberts 👀 📝 🤔
billfinnegan 🤔 👀 🖋 💻
Helen Duncan 💻 🔣 🤔 📆 👀 🖋
Christina Last 💻 🔣 🤔 👀 🖋
lukehare 💻 🔣 🤔 👀 🖋
Markus Hauru 👀 💻 📆 🖋 🤔
Radka Jersakova 📆 🤔 📖 🚇 👀
Ed Chalstrey 🤔 👀
joecerniglia 🤔 🖋 💻 🔣
Callum Mole 👀
Aoife Hughes 🤔 🖋 💻 🔣
Nathan Simpson 🚇 🤔
Jonathan Yong 🚇 🤔 💻 👀 🖋
David Llewellyn-Jones 🤔 💻 👀 🖋
Isabel Fenton 🤔 💻 🖋
Katriona Goldmann 🤔 💻 🖋
Ryan Chan 🤔 💻 🖋 👀
Eirini Zormpa 👀
Jennifer Ding 👀
Ed Chapman 🤔 💻 🖋
martin 🤔 💻 🖋
myyong 🤔