open-life-science / ols-2

Creative Commons Attribution Share Alike 4.0 International

Turing Data Stories #7

Open samvanstroud opened 3 years ago

samvanstroud commented 3 years ago

Project Leads:

Mentor: Yo Yehudi @yochannah

Welcome to OLS-2! This issue will be used to track your project and progress during the program. Please use this checklist over the next few weeks as you start the Open Life Science program :tada:.

***

Week 1 (31 August - 4 September 2020): Meet your mentor!

- [x] Create an account on [GitHub](https://github.com)
- [x] Check if you have access to the HackMD notes set up for your meetings with your mentor
- [x] Prepare to meet your mentor(s) by completing a short homework provided in the HackMD notes
- [x] Complete **your own copy** of the [open leadership self-assessment](https://docs.google.com/document/d/1oQgdfj4lPnypAyb9_Ba0Zt7E8J5L6qMvuKwu0wgQsjs/edit?usp=sharing) and share it with your mentor
  If you're a group, each teammate should complete this assessment individually. This is here to help you set your own personal goals during the program. No need to share your results, but be ready to share your thoughts with your mentor.
- [x] Make sure you know when and how you'll be meeting with your mentor.

Before Week 2 (7 - 11 September 2020): Cohort Call (Welcome to Open Life Science!)

- [x] Create an issue on the [OLS-2 GitHub repository](https://github.com/open-life-science/ols-2/issues/new) for your OLS work and share the link with your mentor.
- [x] Draft a brief vision statement using your goals
  [This lesson](https://mozilla.github.io/open-leadership-training-series/articles/introduction-to-open-leadership/stating-your-project-vision/) from the Open Leadership Training Series (OLTS) might be helpful
- [x] Leave a comment on this issue with your draft vision statement & be ready to share this on the call
- [x] Check the [Syllabus](https://openlifesci.org/ols-2) for notes and connection info for all the cohort calls.

Before Week 3 (14 - 18 September 2020): Meet your mentor!

- [x] Look up two other projects and comment on their issues with feedback on their vision statement
- [x] Complete this [compare and contrast assignment](https://docs.google.com/document/d/1ukvqDRIYfvCapVMdE5hWP-0MkLNJ9T65X43O7F336Ac/edit?usp=sharing) about current and desired community interactions and value exchanges
  - [our compare and contrast](https://docs.google.com/document/d/1bI_p8j6FC8meDMNVzGS0v2-UTO4NV5sSgOSJS3Du8Bc/edit)
- [x] Complete your Open Canvas ([instructions](https://mozilla.github.io/open-leadership-training-series/articles/opening-your-project/develop-an-open-project-strategy-with-open-canvas/), [canvas](https://docs.google.com/presentation/d/1MeJo0TyuMg_waLk1J4q9y1aAqKNMuRBlnmxEChSz-cQ/edit?usp=sharing))
  - [our canvas](https://docs.google.com/presentation/d/1rmp1nEf9u-WymsspPqyDnvBdXBe8d87WxFwhzHCZMZM/edit?usp=sharing)
- [x] Share a link to your Open Canvas in your GitHub issue
- [ ] Start your [Roadmap](https://mozilla.github.io/open-leadership-training-series/articles/opening-your-project/start-your-project-roadmap/)
- [ ] Comment on your issue with your draft Roadmap
- [ ] Suggest a cohort name at the bottom of the shared notes and vote on your favorite with a +1

Before Week 4 (21 - 25 September 2020): Cohort Call (Tooling and roadmapping for Open projects)

- [ ] Look up two other projects and comment on their issues with feedback on their open canvas.

Week 5 and later

- [x] Create a GitHub repository for your project
- [x] Add the link to your repository in your issue
- [x] Use your canvas to [start writing a `README.md` file](https://mozilla.github.io/open-leadership-training-series/articles/opening-your-project/write-a-great-project-readme/), or landing page, for your project
- [x] Link to your README in a comment on this issue
- [x] Add an [open license](https://mozilla.github.io/open-leadership-training-series/articles/get-your-project-online/sharing-your-work-in-the-open/) to your repository as a file called `LICENSE.md`
- [x] Add a [Code of Conduct](https://mozilla.github.io/open-leadership-training-series/articles/building-communities-of-contributors/write-a-code-of-conduct/) to your repository as a file called `CODE_OF_CONDUCT.md`
- [ ] Invite new contributors into your work!

This issue is here to help you keep track of work as you start the Open Life Science program. Please refer to the [OLS-2 Syllabus](https://openlifesci.org/ols-2) for more detailed weekly notes and assignments past week 4.

samvanstroud commented 3 years ago

Vision statement:

We want to build a community that encourages people to harness the potential of open data by creating socially relevant, reproducible and pedagogical data stories.

Openness is embedded in every step of this project - from the data to the content and to the community-driven peer review process.

We want to create a platform through which authors can publish different data stories, whilst maintaining a high standard for openness and reproducibility. Our stories should provide pedagogical content and transparency in the process of extracting insight from the data. We hope this will develop our readers’ data literacy and help them better understand an increasingly data-driven world.

samvanstroud commented 3 years ago

Link to repo:

https://github.com/alan-turing-institute/TuringDataStories/

mloning commented 3 years ago

Hi everyone, I was in a breakout room with one of your group last week and found the project really interesting. Here's some feedback on your vision statement:

Hope this helps 🙂

EKaroune commented 3 years ago

Yes, I would second the comments above about the language that you are using. I think you need to consider what audience you want your work to reach. The language you are using is very technical. I think you probably want to get more researchers and hopefully non-academics involved in the data and therefore you need to make your language more accessible to them. I don't think many researchers know what a data story is?

Ismael-KG commented 3 years ago

This sounds really exciting! I want to agree in part with the above; i.e.: what is a data story? Do you have any examples?

In either case, I wonder how data stories might work in The Turing Way. Maybe something interesting for the case studies subchapter? We'll be working on the Guide for Ethical Research during the OLS programme (our issue is here) 😄

kevinxufs commented 3 years ago

Thanks everyone for the feedback and I agree that we should be using more open and easy to understand language.

Our notion of 'socially relevant' is not very well defined - at the moment it just means the story is interesting and somehow relates to society in a way that people would care about.

We've done some initial work on writing a story that covers deprivation and Covid19.

In terms of being pedagogical, each of these stories is meant to be informative and to teach the reader how to employ different data science techniques. Currently we're writing them in the form of a Jupyter notebook that walks the user through the process.

There's an important question regarding community and platform. I think our vision is that people will be able to publish through us. We haven't quite worked out what publishing through us looks like on this platform (maybe we need to build better links here too!). The community refers to the community of authors as well as our broader readership. We want to have a peer review process which will require community engagement.

Would be great to collaborate on the Turing Way - a lot of our principles were taken from it :)

samvanstroud commented 3 years ago

Our Compare and Contrast document

Our Open Canvas

samvanstroud commented 3 years ago

Thanks for all the feedback on our vision statement! @yochannah suggested this great tool which grades the complexity of text. We had 3/5 sentences being "very hard to read", so safe to say I think we can definitely do better in terms of readability. The tool might be of interest to others: http://www.hemingwayapp.com/ :)
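
For anyone curious how a sentence-level readability check can work, here is a minimal sketch. It is not the Hemingway App's actual algorithm: it assumes a much simpler rule, where a sentence counts as hard to read once it exceeds a word-count threshold. The function names `split_sentences` and `flag_hard_sentences` and the 20-word threshold are hypothetical choices for illustration only.

```python
import re


def split_sentences(text):
    """Split text into sentences on '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def flag_hard_sentences(text, max_words=20):
    """Return (sentence, word_count) pairs for sentences longer than max_words.

    Assumption: sentence length is only a rough proxy for reading difficulty.
    """
    flagged = []
    for sentence in split_sentences(text):
        word_count = len(sentence.split())
        if word_count > max_words:
            flagged.append((sentence, word_count))
    return flagged


# First sentence of the draft vision statement, plus a short sentence for contrast.
draft = (
    "We want to build a community that encourages people to harness the "
    "potential of open data by creating socially relevant, reproducible and "
    "pedagogical data stories. Openness is embedded in every step."
)

for sentence, count in flag_hard_sentences(draft):
    print(f"{count} words: {sentence}")
```

A real readability score would also weigh word length and vocabulary, so treat this only as a quick first pass over a draft.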

evaherbst commented 3 years ago

Hi, I just had a look at your Open Canvas and it gives a nice clear overview of your project. I would like to know more about what exactly these data stories are and how they will be presented? Does "story" imply a narrative type of presentation, with a focus on the researcher? That might be cool to give the general public insights into different scientists and their work. Or will the story be focused solely on the data itself? Will there be visual elements to it? Will you focus on research that is published or unpublished?

koudyk commented 3 years ago

What a cool project. I would go to the website now if it already existed!

I had a look at your Open Canvas. I noticed that you want to have a platform for interacting with the content, but that you didn't list anyone with technical skills under the "Contributor Profiles". Will you also need contributors who will help set up and maintain the website?

kevinxufs commented 3 years ago

Thank you both for the helpful comments :)

> Hi, I just had a look at your Open Canvas and it gives a nice clear overview of your project. I would like to know more about what exactly these data stories are and how they will be presented? Does "story" imply a narrative type of presentation, with a focus on the researcher? That might be cool to give the general public insights into different scientists and their work. Or will the story be focused solely on the data itself? Will there be visual elements to it? Will you focus on research that is published or unpublished?

Part of our work will be establishing a bit more formally what one of our data stories will be. The idea is that we will set up some standards for data stories, and that people can go through a peer review process against our standard, in order to be published through us.

We're nearly finished with our first story, which is a reproducible Jupyter notebook. The intention is that someone could read the notebook, run the code and also see our commentary on why we did certain things with the data and how it was done.

The 'story' side is our medium for presenting. We want the stories to be interesting and informative - relevant based on current societal issues. We use the story to show the potential of data science and use it as a motivator for teaching. Currently we're focusing more on the data, but it's an interesting question as to whether we should give a bit more emphasis to the researcher themselves. I think that's something we should consider in the future but is maybe a bit out of scope at the moment!

The story will have some visual elements - our current story uses a number of data visualisations, but I think there's more we can do than that. In terms of published / unpublished - that's also an interesting question. All the data that we use should be openly accessible, so someone could genuinely create their own scripts and reproduce our analysis. The analysis itself could be 'novel', and we think it is likely to be more interesting if it is. We aren't expecting super sophisticated analyses - our primary focus is still the teaching aspect and we want the readers to understand the different data science techniques.

> What a cool project. I would go to the website now if it already existed!
>
> I had a look at your Open Canvas. I noticed that you want to have a platform for interacting with the content, but that you didn't list anyone with technical skills under the "Contributor Profiles". Will you also need contributors who will help set up and maintain the website?

I think this is a good point that we hadn't formally considered yet.

I think at the moment we're thinking about the platform in a very high-level way. We're not thinking of doing it programmatically just yet. Instead, the model we'd be using is just GitHub. We have a GitHub repo where we manage story creation. We expect people to be able to suggest different stories to be written by opening an issue in our repository and eventually by creating pull requests. These will then be peer reviewed, still through GitHub.

In the future we might consider doing some more formal web development.

Oh, also thank you for reminding us about the contributor profiles - they are actually outdated. I have some very technically skilled colleagues! :)

baileythegreen commented 3 years ago

@kevinxufs This sounds like a great way to help the non-science community understand how some scientific results are found, and to introduce scientists to new concepts! I also really like your idea of using GitHub issues as a path for people to suggest new topics/contribute/review stories. I need to figure out the different routes that make it easiest for people to be able to contribute to my project, and something like this might make the most sense.

To the question of the platform for your stories, do you envision this being one specific to your project? My project is an online repository for educational materials (a variety of media types, including some Jupyter Notebooks; fairly high-level, aimed at academics or advanced learners). The materials I have so far are generated by me, and so cover areas that I work in (biological sciences and programming), but I want to cover a broad range of topics. I'll be looking for contributors and other available sites/resources that I can also link to from my site/social media.

Your data stories sound like they would fit well with my project, if you'd be interested in some level of collaboration?

crangelsmith commented 3 years ago

Hi @baileythegreen, I was just looking at your project's issue and there are indeed lots of things in common with Turing Data Stories. I think we could definitely learn from each other's projects and find some points of collaboration.

> To the question of the platform for your stories, do you envision this being one specific to your project?

We are still exploring this. Currently we just have Jupyter notebooks that run on Binder, but we would like a more polished platform such as Jupyter Book or [fastpages](https://fastpages.fast.ai/fastpages/jupyter/2020/02/21/introducing-fastpages.html). We are still deciding on this point.

I'm not sure we will be able to talk in today's cohort call, but maybe we can organise a chat in the next few days?

crangelsmith commented 3 years ago

Refactored vision statement: (After all comments and the Hemingway app)

We want to inspire an open community around a central platform. One that encourages us all to harness the potential of open data by creating 'data stories'. These 'data stories' will mix computer code, narrative, visuals and real-world data to document an interesting insight or result. The stories will relate to society in a way that people care about and be educational. They must maintain a high standard of openness and reproducibility and be approved by the community in a peer review process. The stories will develop data literacy and critical thinking in the general readership. The aim is to help us all understand the data-driven world around us.

DavidBeavan commented 3 years ago

Consider moving the last sentence to the start (thanks to Emmy in the breakout room):

The aim is to help us all understand the data-driven world around us. We want to inspire an open community around a central platform. One that encourages us all to harness the potential of open data by creating 'data stories'. These 'data stories' will mix computer code, narrative, visuals and real-world data to document an interesting insight or result. The stories will relate to society in a way that people care about and be educational. They must maintain a high standard of openness and reproducibility and be approved by the community in a peer review process. The stories will develop data literacy and critical thinking in the general readership.

baileythegreen commented 3 years ago

> I'm not sure we will be able to talk in today's cohort call, but maybe we can organise a chat in the next few days?

@crangelsmith I'll message you in the slack so we can set something up!

malvikasharan commented 3 years ago

@all-contributors please add @DavidBeavan for idea and content.

allcontributors[bot] commented 3 years ago

@malvikasharan

I've put up a pull request to add @DavidBeavan! :tada:

malvikasharan commented 3 years ago

@all-contributors please add @crangelsmith for idea and content.

allcontributors[bot] commented 3 years ago

@malvikasharan

I've put up a pull request to add @crangelsmith! :tada:

malvikasharan commented 3 years ago

@all-contributors please add @samvanstroud for idea and content.

allcontributors[bot] commented 3 years ago

@malvikasharan

I've put up a pull request to add @samvanstroud! :tada:

malvikasharan commented 3 years ago

@all-contributors please add @kevinxufs for idea and content.

allcontributors[bot] commented 3 years ago

@malvikasharan

I've put up a pull request to add @kevinxufs! :tada:

malvikasharan commented 3 years ago

@all-contributors please add @baileythegreen for review.

allcontributors[bot] commented 3 years ago

@malvikasharan

I've put up a pull request to add @baileythegreen! :tada:

malvikasharan commented 3 years ago

The rest of the contributors have already been acknowledged.

DavidBeavan commented 3 years ago

Our README.md for Turing Data Stories

DavidBeavan commented 3 years ago

Our LICENSE.md or licence in British English 😛 for Turing Data Stories

DavidBeavan commented 3 years ago

Our CODE_OF_CONDUCT.md for Turing Data Stories