alan-turing-institute / ReproducibleResearchResources

This repository contains information to help you make your research reproducible
Creative Commons Attribution 4.0 International

Plan and announce lunch topics for early 2019 #14

Closed LouiseABowler closed 5 years ago

LouiseABowler commented 5 years ago

Time to get started planning next term's talks!

We've also got a couple of topic ideas left over from this term:

LouiseABowler commented 5 years ago

Schedule (all Mondays)

- Jan 14th: Cancelled due to all-staff meeting; replaced with pizza on afternoon of Tuesday 15th 🍕
- Jan 28th: Kirstie - FAIR data
- Feb 11th: Stephen - CODECHECK
- Feb 25th: Paolo - Provenance for Data Science
- March 11th: Louise - Reproducible Workflows
- March 25th: Kirstie - Contributing to the Turing Way
- April 8th: Natalie - MAPS: Mapping the analytical paths of a crowdsourced data analysis

LouiseABowler commented 5 years ago

I'm going to pencil myself in for the 25th Feb; I'd like to do a session on examples of reproducible papers. It would be really interesting to get some suggestions from the people who turn up and make some comparisons across different fields, in terms of what the normal level of data/code sharing and reproducibility is, and also what ideas authors have for sharing their research outputs.

Kirstie, the two ideas we had from last term are both sessions you were thinking of leading. Are there any dates that you'd like to do?

sje30 commented 5 years ago

I'd like to present my CODECHECK ideas to get feedback on how to pursue it within the community. (This was a project I submitted to the Wellcome Trust Open Research fund, but no funding yet).

LouiseABowler commented 5 years ago

Thanks @sje30, it'd be great to have a session on CODECHECK! The dates are very likely to be those above, but I'm still getting them confirmed - I'll let you know as soon as they are and then we can fix a session for you.

KirstieJane commented 5 years ago

Dates look good to me - let's just confirm them now 💯

@PaoloMissier (https://www.turing.ac.uk/people/researchers/paolo-missier) has agreed to lead a session on provenance for data science. I'll ping him an email with a link to this issue to help find a date.

Let's drop codes of conduct for now - I don't think there's a huge amount of interest 😭. Happy to do FAIR data though 😸

PaoloMissier commented 5 years ago

Hi, I'd like to take March 11th (unless Feb. 25th is released in the meantime) to make my case on how provenance metadata can help reproducibility (an easy case to make!) and data science practices more specifically.

LouiseABowler commented 5 years ago

Thanks @PaoloMissier! Looking forward to your talk :) If you do want to take the 25th, I'm happy to switch over to another day.

@sje30, do you want to pick a date for your session? @KirstieJane, can you fit one in on FAIR data?

sje30 commented 5 years ago

I can do Feb 11.

PaoloMissier commented 5 years ago

@LouiseABowler Feb. 25th is good for me, thanks! Also, I am hoping to get a colleague from Southampton involved, who is a real provenance expert and has a keen interest in collaborating. Is that OK?

LouiseABowler commented 5 years ago

Schedule updated - thank you @sje30 and @PaoloMissier! We'll send some notifications around the Turing once everyone is back in January. Could you both please send over a short description of your session (just a couple of sentences) that we can use for publicity?

Paolo, it's fine for us if you want to invite your colleague along.

PaoloMissier commented 5 years ago

Morning & Happy New Year everybody — here’s a short abstract for the lunch drop-in. What’s the format usually?

Thanks! —Paolo

Provenance is a form of structured metadata that formally describes the interrelated execution steps involved in a data generation process, along with a specification of the data involved [1,2]. Over the past 15 years or so, the provenance community has been developing robust approaches for capturing, storing, and querying provenance traces about many kinds of processes. In data-driven science, where new insights are generated through computational experiments, the case has been made (including as part of an Alan Turing Institute scoping workshop in 2016 [3]) for using provenance traces to help validate and reproduce the experiment, and thus to improve the trustworthiness and quality of the result [4,5]. In this session we are going to explore the connection between provenance and data science. This can be articulated as two complementary questions.

  1. Provenance for Data Science: how can provenance theory and systems benefit data science? Why would it make sense to instrument tools used by data scientists with provenance recording capabilities, and how can that be done?
  2. Data Science for provenance: how can data science techniques, i.e., for descriptive and predictive data analytics, be applied to the analysis of large-scale provenance traces?

With a focus on reproducibility of data-driven science, both questions can be seen as provocations designed to create an engaging session.

[1] Moreau L, Missier P, Belhajjame K, et al. PROV-DM: The PROV Data Model. (Moreau L, Missier P, eds.). 2012.
[2] Missier P, Belhajjame K, Cheney J. The W3C PROV family of specifications for modelling provenance metadata. In: Procs. EDBT'13 (Tutorial). Genova, Italy: ACM; 2013.
[3] Burgess LC, Crotty D, de Roure D, et al. Alan Turing Institute Symposium on Reproducibility for Data-Intensive Research - Final Report. 2016.
[4] Chapman A, et al. Plus: A provenance manager for integrated information. In: 2011 IEEE International Conference on Information Reuse and Integration (IRI). IEEE; 2011.
[5] Chirigati FS, Troyer M, Shasha D, Freire J. A Computational Reproducibility Benchmark. IEEE Data Eng Bull. 2013;36(4):54-59.
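
As a rough illustration of question 1 above (instrumenting a data scientist's tools with provenance recording), here is a minimal sketch in plain Python. It is not the notebook from the session and not the API of any provenance library: the `Dataset` class, the `record_provenance` decorator, the `PROV_LOG` list and the toy pipeline are all hypothetical names made up for this example. Only the relation names (`used`, `wasGeneratedBy`, `wasDerivedFrom`) are borrowed from PROV-DM [1].

```python
import functools
import itertools
from dataclasses import dataclass

# Hypothetical in-memory trace: (relation, subject, object) triples whose
# relation names are borrowed from PROV-DM. Not a real library format.
PROV_LOG = []
_call_ids = itertools.count(1)


@dataclass
class Dataset:
    """A named piece of data; the name doubles as its entity identifier."""
    name: str
    rows: list


def record_provenance(func):
    """Log each call of `func` as an activity that uses and generates entities."""
    @functools.wraps(func)
    def wrapper(*inputs):
        activity = f"activity:{func.__name__}#{next(_call_ids)}"
        for item in inputs:
            PROV_LOG.append(("used", activity, item.name))
        output = func(*inputs)
        PROV_LOG.append(("wasGeneratedBy", output.name, activity))
        for item in inputs:
            PROV_LOG.append(("wasDerivedFrom", output.name, item.name))
        return output
    return wrapper


@record_provenance
def drop_missing(data):
    # Toy cleaning step: discard missing values.
    return Dataset("entity:prices_clean", [r for r in data.rows if r is not None])


@record_provenance
def mean_price(data):
    # Toy analysis step: a single summary statistic.
    return Dataset("entity:mean_price", [sum(data.rows) / len(data.rows)])


def lineage(entity):
    """Return every entity that `entity` was derived from, nearest first."""
    ancestors = []
    frontier = [entity]
    while frontier:
        current = frontier.pop(0)
        for relation, subject, obj in PROV_LOG:
            if relation == "wasDerivedFrom" and subject == current:
                ancestors.append(obj)
                frontier.append(obj)
    return ancestors


raw = Dataset("entity:prices_raw", [100, None, 300])
result = mean_price(drop_missing(raw))

for record in PROV_LOG:
    print(record)
print(lineage("entity:mean_price"))
```

The analysis functions stay unchanged and the decorator does the recording, which is one possible answer to "how can that be done?". Walking the `wasDerivedFrom` records back from a result is the kind of query that would help someone check or reproduce it.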

LouiseABowler commented 5 years ago

Thanks for the abstract @PaoloMissier!

The sessions are quite informal; we usually sit around the big tables in the second floor seminar rooms so it's easy to get questions and discussion going. There is a screen for presentations there if you want to use it, but we've had a mix of formats before (demos, discussions, presentations) so we'll leave that up to you.

We've booked the room for an hour (1-2pm) and I'd say leave around half of that for questions - we're a talkative bunch! Last term we had around 10-18 people attending, usually a mix of members of the Research Engineering Group, PhD students and postdocs.

PaoloMissier commented 5 years ago

Perfect :-)

I am hoping to bring along Adriane Chapman, from Southampton.

Btw is anyone around the next few days? Coming from Newcastle I need to arrange travel carefully, but I may be around Wed/Thu if anyone is up for an informal chat.

LouiseABowler commented 5 years ago

@PaoloMissier, I'm so sorry for missing your previous message - too many GitHub notifications in one day! I'm around and free for the rest of the afternoon if you do happen to be in today and want to meet for a chat.

PaoloMissier commented 5 years ago

Hi, for the session today: the Turing held a series of scoping workshops around 2016 and one of them was on Reproducibility for Data-Intensive Research. I am sure you are familiar with it but in case you have missed it, I think it provides a good starting point for our session today -- especially the session on Data Provenance to support Reproducibility available at: https://osf.io/bcef5/ (hopefully open for reading)

-Paolo

PaoloMissier commented 5 years ago

Hi everybody, a quick follow-up from our drop-in on provenance last week (time flies). Thank you all for coming. I have linked a zip file here with the Jupyter pandas notebook I used for the talk to illustrate "application-level provenance". If you run pip install prov first, everything should work: housing-PDA-PROV.zip

I can also collate slides if people are interested, but wouldn't risk breaking etiquette here :-) I hope to keep this thread going, hopefully leading to a contribution to your e-book. James Cheney and Adriane Chapman would be ideal partners for pursuing provenance-related activities at the Turing. --Paolo

sje30 commented 5 years ago

Hi all, will there be an RR talk next Monday?

LouiseABowler commented 5 years ago

Hi @sje30! Yep, we will be having a talk on Monday - @KirstieJane is going to lead a session on contributing to the Turing Way. Hope to see you then if you're free, and thanks again for all your input last time 😁

sje30 commented 5 years ago

Thanks. I'm unable to make it in person, but if I'm free I'll ping again for the Zoom connection!

PaoloMissier commented 5 years ago

Hi guys, can you please send the Zoom details? I am not in London, but I'd like to tune in.

Cheers, —Paolo

KirstieJane commented 5 years ago

I'm going to close this issue because I've created #19. If you're following along and would like to present in a slot, please comment there!