ReScience / ReScience-article-2

Repository for the upcoming ReScience article

Changes after review #50

Closed khinsen closed 6 years ago

pdebuyl commented 6 years ago

I have a (maybe) stronger view on the "controlled environments" part of the review. What I want to convey is that I believe our process is stronger than a controlled environment. Often, the reviewers will have a different OS or a different release of the programming environment (Python, C compiler, etc.), and thus we stand a better chance of catching hidden dependencies, platform-specific code, etc.

A better solution than containers, in my opinion, would be to run the code on a "matrix of environments": (Mac, Windows, Linux, *BSD, Cray), (Python 3.5, 3.6), (32-bit, 64-bit), (SciPy 0.19, 1.0), etc., in a combinatorial fashion (Python is just an example; substitute versions of R, gcc, etc. as needed). Debian's multi-architecture build system also comes to mind. This is, however, impractical and would drive contributors away... I see with the SciPy lecture notes how annoying it is to manage even a few environments (we do Python 2 and 3; there is Travis, my desktop, and my laptop).
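To give a sense of how quickly such a matrix grows, here is a minimal sketch that enumerates the combinations for the axes listed above. The axis names, the `AXES` dictionary, and the `environment_matrix` helper are illustrative assumptions, not part of any ReScience tooling.

```python
from itertools import product

# Hypothetical axes, copied from the examples in the comment above.
AXES = {
    "os": ["Mac", "Windows", "Linux", "*BSD", "Cray"],
    "python": ["3.5", "3.6"],
    "word_size": ["32bit", "64bit"],
    "scipy": ["0.19", "1.0"],
}


def environment_matrix(axes):
    """Return every combination of axis values as a list of dicts."""
    names = list(axes)
    return [
        dict(zip(names, combo))
        for combo in product(*(axes[name] for name in names))
    ]


matrix = environment_matrix(AXES)
print(len(matrix))  # 5 * 2 * 2 * 2 = 40 environments
```

Even these four small axes yield 40 environments, and each additional axis multiplies the count, which illustrates why the approach is impractical for volunteer reviewers.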

However, I don't know of any scientific evidence in this direction.

khinsen commented 6 years ago

@pdebuyl I wouldn't say better than controlled environments, but different, testing something else: the robustness of the code under variations of the environment. Which is indeed an important point in practice, although it is hard to formalize. I will add a sentence or two to the article.

heplesser commented 6 years ago

@khinsen @pdebuyl :+1: for the changes to the manuscript. Concerning controlled environments or a matrix of environments, I think we should be realistic about what (a) we can provide and (b) authors are willing to put up with. The number of possible combinations is staggering, especially if (which I very much hope!) ReScience in the future attracts contributions from a wider range of fields using a wider range of programming languages, libraries, and tools.

But I think this is not as bad as it may seem: A key advantage of a ReScience publication is that it provides a definite implementation of the model that has been confirmed to work by at least two independent reviewers. If the code later fails to work (or produce the exact same results), one at least has a definite implementation of record on which to start searching for the cause (perform "model archeology"). This is a significant step forward compared to the situation today where you do not even know where to begin. Maybe we could add a sentence about this near the end of the "Short-term and long-term reproducibility" section?

From a technical point of view, we may want to require that authors and reviewers submit a log of their reproduction runs containing as much version information as possible on software, libraries, and hardware. This information will most likely never be complete, but all information provided will constrain the search space when code fails at a later point.
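As a rough illustration of what such a log could contain for a Python-based submission, the sketch below collects interpreter, OS, and package versions into a JSON record. The function name `collect_environment`, the default package list, and the field names are assumptions made for this example; no log format has been agreed on in this thread. It relies on `importlib.metadata`, which requires Python 3.8 or later.

```python
import json
import platform
import sys
from importlib import metadata


def collect_environment(packages=("numpy", "scipy")):
    """Gather version information for a reproduction-run log.

    `packages` is a hypothetical list; each submission would name its own.
    Packages that are not installed are recorded as None.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "word_size": platform.architecture()[0],
        "executable": sys.executable,
        "packages": versions,
    }


# Print the record so it can be pasted into a review comment.
print(json.dumps(collect_environment(), indent=2, sort_keys=True))
```

A record like this is necessarily incomplete (it says nothing about compilers, BLAS builds, or hardware details beyond the machine type), but it narrows the search when a later run disagrees.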

khinsen commented 6 years ago

@heplesser I very much agree with what you say, but there is no need to add it to the manuscript, as it is already there! Reviewer #1 considers this a defeatist attitude, which I think is not true and that's what I wanted to make clearer.

Your last suggestion (a log of reproduction runs) is very interesting, and I think we should try it out in practice. I'd start with "suggesting" rather than "requiring", then see how it works out. Overall, reproduction runs currently play almost no role in our reviews. Reviewers simply confirm that they have succeeded, but do not provide any details.

pdebuyl commented 6 years ago

@khinsen The update and the reply are fine; my comment came across as stronger than I intended.

rougier commented 6 years ago

@khinsen Thanks for the update and reply. Can you also add the PDF before I merge?

khinsen commented 6 years ago

@rougier Done. I also added a latex-diff PDF to make the changes more evident.

Everyone: I would appreciate someone looking at the replies to the reviewers to check whether the tone and level of detail are appropriate.

rougier commented 6 years ago

Thanks. Looks good to me, apart from a sentence about continuous integration (I added a comment).