reproducible-agile / reproducible-agile.github.io

Website for Initiative "Reproducible Research" @ AGILE conference
https://reproducible-agile.github.io/

2nd workshop 2018 - proposal #8

Closed nuest closed 6 years ago

nuest commented 7 years ago

The call for workshops is out. Deadline is 30 November 2017.

2017 proposal

Ideas for topics


Please feel free to add comments and more ideas here; I'll integrate them into this comment asap.

foost commented 7 years ago

from the Skype call:

nuest commented 7 years ago

A draft for the proposal is ready for contributions at https://github.com/o2r-project/agile-2017/blob/master/public/files/agile2018_proposal.md

nuest commented 6 years ago

more ideas: let's have a "hands-on" morning, and a more theoretical afternoon?

But there is value in having a more conceptual/theoretical discussion; it might attract different people. Can we move that to some evening event?

Ideas for hands-on sessions

nuest commented 6 years ago

TUD is thinking about an RDM workshop; get back to them.

foost commented 6 years ago

150 minutes seems like little time to teach enough for participants to be able to do anything meaningful. Do you mean to split up the group according to expressed interests?

cgranell commented 6 years ago

The practical proposal is OK, but I'd like the content to be catchier, in the sense of "actually" reproducing two published papers (one in R, the other in Python). Using these running examples, participants will try to reproduce them, face actual problems in doing so, and at the same time learn practical aspects of reproducibility in R (Markdown, etc.) and Python (Jupyter, ArcGIS scripting, etc.).

nuest commented 6 years ago

@cgranell So you suggest preparing two papers and then going through the process of reproducing them? @MarkusKonk has a lot of experience in that :-)

I like the idea because it is much better to prepare than a "bring your own data" approach.

Yes, I suggest splitting up the groups. I would also strictly focus on the workflow/reproducibility aspects. So for R, we would assume people know R already, but introduce them to R Markdown (e.g. by creating an R Markdown document from a PDF paper with a script, as @cgranell suggests) and even the rticles package, so you have the publication ready for submission, and upload it to the Zenodo Sandbox at the end. The same for Python+?-GIS: we're not teaching GIS, but showing how you can construct a reproducible workflow instead of point-and-click.
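As a rough illustration of the final "upload to Zenodo Sandbox" step, here is a minimal Python sketch that only builds the metadata payload for a new deposit; it does not send anything. The function name `build_deposit_metadata` and its defaults are hypothetical, and the field names (`upload_type`, `publication_type`, `creators`) and the sandbox URL reflect my understanding of Zenodo's deposit REST API, so check them against the current documentation before relying on them.

```python
import json

# Assumption: sandbox endpoint of Zenodo's deposit REST API.
SANDBOX_DEPOSIT_URL = "https://sandbox.zenodo.org/api/deposit/depositions"


def build_deposit_metadata(title, creators, description, publication_type="preprint"):
    """Build the JSON body for a new deposit (hypothetical helper, nothing is sent)."""
    if not title.strip():
        raise ValueError("title must not be empty")
    if not creators:
        raise ValueError("at least one creator is required")
    return {
        "metadata": {
            "upload_type": "publication",
            "publication_type": publication_type,
            "title": title,
            "description": description,
            # Each creator is given as a dict with a "name" key.
            "creators": [{"name": name} for name in creators],
        }
    }


if __name__ == "__main__":
    payload = build_deposit_metadata(
        title="Reproduced paper (workshop exercise)",
        creators=["Doe, Jane"],
        description="R Markdown version of a published paper.",
    )
    # The payload would be POSTed to SANDBOX_DEPOSIT_URL with a personal access token.
    print(json.dumps(payload, indent=2))
```

In a workshop setting, participants would send this payload with their own sandbox access token, so nothing ends up in the production Zenodo instance.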

@foost Would you prefer a full-day workshop? We can also cut down on the time for "Reproducibility at AGILE" and just make that 15 minutes without discussion, which would bring the hands-on time up to 3 hours.

nuest commented 6 years ago

I've updated the proposal a little bit to mention "publications", because we're not introducing workflow tools etc, but do have a focus on reproducible papers / reproducible publications.

This would shift the content for the GIS-stuff a bit: R Markdown also supports chunks in other programming languages, so it should be no problem to create a reproducible paper using QGIS/any Python library with it :-).

MarkusKonk commented 6 years ago

Right, in the last months we reproduced several papers that were published by Copernicus. It was pretty interesting, not only because of the different types of errors but also regarding the output of the code, i.e. the figures, which were rarely identical to those reported in the paper and often differed considerably. I am currently discussing that in another paper, but we can discuss it in the workshop, too.

cgranell commented 6 years ago

Yes, that's the idea for the hands-on session

Let's start simple. Imagine we identify a paper that's eligible to be reproduced in R. We can split the participants into groups so that all groups try to reproduce the same paper, following the workflow you specified before. The session might combine brief explanations (how to use R Markdown, how to ...) with group work. Spontaneous discussions are encouraged, to talk through problems that groups face while getting data, finding scripts, downloading scripts, solving execution errors, etc. Each group also writes down what they are doing (based on a form we provide): for example, whether finding data was easy or difficult, how long it took (in minutes), which score (0-3) was given, etc. During the last half hour, we can reflect on the difficulties of reproducing a paper, supporting tools, and best practices. As a result, we would get an experimental assessment of our proposed criteria for reproducible research, which can be input for continuing our work.
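The group form described above could be captured in a small data structure, so the results are easy to aggregate in the final reflection. This is only a sketch under assumptions: the step names, the `StepRecord` fields, and the `summarize` helper are hypothetical; the thread itself only mentions difficulty, minutes, and a 0-3 score.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical step names, loosely based on the problems listed in the comment.
STEPS = ("get data", "find scripts", "download scripts", "run code")


@dataclass
class StepRecord:
    """One row of the form: what a group reports for one reproduction step."""
    group: str
    step: str
    minutes: int
    score: int  # 0-3, as suggested in the thread
    notes: str = ""

    def __post_init__(self):
        if self.step not in STEPS:
            raise ValueError(f"unknown step: {self.step!r}")
        if not 0 <= self.score <= 3:
            raise ValueError("score must be between 0 and 3")
        if self.minutes < 0:
            raise ValueError("minutes must not be negative")


def summarize(records):
    """Return, per step, the number of reporting groups, mean minutes and mean score."""
    summary = {}
    for step in STEPS:
        rows = [r for r in records if r.step == step]
        if rows:
            summary[step] = {
                "groups": len(rows),
                "mean_minutes": mean(r.minutes for r in rows),
                "mean_score": mean(r.score for r in rows),
            }
    return summary


if __name__ == "__main__":
    records = [
        StepRecord("A", "get data", minutes=10, score=3),
        StepRecord("B", "get data", minutes=30, score=1, notes="broken link"),
        StepRecord("A", "run code", minutes=45, score=2),
    ]
    for step, stats in summarize(records).items():
        print(step, stats)
```

Validating the score range at entry time keeps the later summary comparable across groups.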

Additionally, some of us might take the role of "observer" to monitor and take notes about participants' behaviour while they try to reproduce papers. I mean, reproducibility is not only technology; it requires a mind shift, a different way of doing research, and creating a habit, which can be studied from a social perspective too, complementing the form mentioned above.

Other configurations might be possible... Case 2: choose 2 papers; half of the participants reproduce one, the other half the second paper.

If the above R experiment works well, we might extend it to Python too, perhaps as a reproducibility course as part of the AGILE PhD School.

Too ambitious??

nuest commented 6 years ago

I would not make this into a user study (how easy they find things) but would tend towards a more controlled version. But those are details we can figure out later. In the workshop description we could put something along the lines of

Participants explore the practical principles of reproducible papers by reproducing a provided real-world publication. Together with the instructors they create a reproducible document from public text/code/data and publish it in a data repository (and in a code repository/GitHub?).

We (@nuest @MarkusKonk) could do that for R. I do fear that if we use only R we might lose participants, so it would be great if we could offer the same (either with a Jupyter Notebook, or with Markdown + Python) for Python users. The Python example might include ArcGIS/QGIS, but we would for now not mention that.

@cgranell Totally agree on the "mind shift" aspect. We should try to teach a different way of working rather than teaching many features of tool X.

@foost What do you think? We might not appeal to "regular GIS users", because you should have some experience in either R or Python, but having that as a prerequisite could solve your concerns wrt time.

We should eat our own dogfood and make the teaching material part of a GitHub repo and license it CC-BY, so of course the AGILE PhD School is a reasonable future step to teach this.

MarkusKonk commented 6 years ago

To me, it sounds interesting.

hoferb commented 6 years ago

I agree completely with changing to reproducing existing work - or looking into how to make a 'research package' from existing material. Seems to me that both sides of the coin can be discussed.

(I just tried to reproduce the R script Daniel prepared but failed early on, when installing a package did not work. Very annoying. I would enjoy learning more about reproduction using R Markdown ;)

nuest commented 6 years ago

Description updated, feedback welcome (especially on the newly added "Objectives" section).

MarkusKonk commented 6 years ago

Hi, The proposal reads well. Just a few comments on the description:

nuest commented 6 years ago

I have updated the workshop proposal because we were well above the 1000 character limit for the workshop abstract, which I forgot to put in my template... anyway, we do have a longer text ready for the website now :-)

I also uploaded the PDF for submission, just in case somebody wants to do a final check: https://github.com/o2r-project/agile-2017/blob/master/public/files/agile2018_workshop_proposal_reproducible-research-publications-at-AGILE.pdf

nuest commented 6 years ago

Workshop was submitted and accepted, see https://agile-online.org/programme-2018/agile-workshops-2018
