acmsigsoft / artifact-evaluation

Replicated, Reproduced and functional? #7

Open jon-bell opened 4 years ago

jon-bell commented 4 years ago

Currently, there is some confusion about the "Results Replicated" and "Results Reproduced" badges. The process for reviewing, reporting, and retroactively awarding these badges needs to be clarified. FSE 2018 and 2019 and ICSE 2019 and 2020 do not award the "Functional" badge. Is this a good practice? Should we generally move away from this badge?

sbaltes commented 4 years ago

Are there any cases where "Results Replicated" was actually awarded? I think reproduction is possible during the review process (but may be time-consuming). However, replication would require reviewers to execute an analysis, for example, using a new dataset. I don't think that is possible during artifact evaluation.

jon-bell commented 4 years ago

I'm not sure. I think that a related question is: what exactly are the minimum criteria for meeting the reproduced or functional bars?

wagnerst commented 4 years ago

I saw that happen once. If the paper was available as a preprint, it could already have been replicated. But maybe we need a way to apply for that badge later on.

jkrinke commented 4 years ago

I just had an email from Tim Menzies about the ICSE2020 track, which made me think about the Replication / Reproduction badges. Good replications and reproductions need a lot of work, and we as a community should support the effort, for example, through R&R tracks. If such tracks become more widespread, then awarding the badges becomes straightforward: the authors of a replicated paper receive the badge when the replicating paper appears (in an R&R track).

jon-bell commented 4 years ago

@jkrinke What about the model where: Author A writes an ICSE2020 paper. Author A publishes an artifact that allows for replication of this paper. Artifact evaluation committee independently replicates the paper result using the provided artifact, awards a badge?

jkrinke commented 4 years ago

A proper replication is too much work for an assigned replicator. A good replication will often lead to a paper of its own, and it is the replication that should be honoured, not the replicated work. A committee will only be able to do the simplest work: does the provided artefact run? But that is not the point of a replication, as the provided artefact may not be what the paper describes. Thus, a proper replication would have to re-implement the approach from the description given in the paper. I believe the act of replication is of great value to the community, and we usually learn something else during the replication. See, for example, the programming language dispute at the moment.

I had my fair share of artefacts that were not what the paper promised, but figuring that out usually took weeks of investigation.

jon-bell commented 4 years ago

@jkrinke In my interpretation, you are describing a reproduction (re-implementing the approach), based on ACM's definitions:

Replicated: Available + the main results of the paper have been obtained in a subsequent study by a person or team other than the authors, using, in part, artifacts provided by the author.

Reproduced: Available + the main results of the paper have been independently obtained in a subsequent study by a person or team other than the authors, without the use of author-supplied artifacts.
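
As quoted here, the two definitions differ only in whether the subsequent study used author-supplied artifacts. A minimal sketch of that distinction follows; the `Study` fields and the `badge` function are hypothetical names for illustration and are not part of any ACM tooling or process:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Study:
    """Hypothetical record of a subsequent study of a paper."""
    artifact_available: bool      # the "Available" precondition in both definitions
    main_results_obtained: bool   # the paper's main results were obtained again
    by_independent_team: bool     # a person or team other than the authors
    used_author_artifacts: bool   # author-supplied artifacts were used, at least in part


def badge(study: Study) -> Optional[str]:
    """Map a study to a badge name under the definitions quoted above."""
    if not (study.artifact_available
            and study.main_results_obtained
            and study.by_independent_team):
        return None
    if study.used_author_artifacts:
        return "Results Replicated"
    return "Results Reproduced"
```

Under this reading, re-implementing an approach from the paper's description alone falls under "Results Reproduced", while re-running an analysis with the authors' artifact falls under "Results Replicated".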

How would you distinguish between a replication and a reproduction?

jkrinke commented 4 years ago

True, I was under the impression that Claerbout's terminology was being used.

I have now read ACM's definition, and it requires a peer-reviewed publication reporting on the replication or reproduction.