minkull commented 4 years ago

https://github.com/researchart/rose6icse/tree/master/submissions/replicated/tuning-hyperparameters

Authors of original paper:
Andrea Arcuri, Email: andrea.arcuri@kristiania.no, Github ID: @arcuri82
Gordon Fraser, Email: gordon.fraser@uni-passau.de, Github ID: @gofraser

Authors of replication:
Shayan Zamani, Email: shayan.zamani1@ucalgary.ca, Github ID: @shayanzamani
Hadi Hemmati, Email: hadi.hemmati@ucalgary.ca

applying for "replicated"

AnAnonymousReviewer commented 4 years ago

Dear authors,

I am currently reading and trying to understand your replication. I have one question at this point: The original study also varies the search budget between 10,000 function call executions, 100,000 function call executions, and 1,000,000 function call executions. In your paper, you report a fixed search budget of 2 minutes.

Do you know what "function call executions" means exactly, and if so, what does it mean?

How many function call executions were done in your 2 minutes budget?

Thanks!

Best regards AnAnonymousReviewer

shayanzamani commented 4 years ago

Hi,

Thanks for your careful review. The 2-minute search budget is mentioned in the second case study of the original paper. We didn't use the number of function calls in our paper; we just used the budget from their second case study, where they wanted to tune 609 classes.

Feel free to ask if you have any other questions,

Shayan

timm commented 4 years ago

@AnAnonymousReviewer : what is your current call on "replicated"?

@anonymousICSEartifacts : looking forward to your review

AnAnonymousReviewer commented 4 years ago

Dear @timm, dear @shayanzamani,

I am currently struggling with the meaning of the "replicated"-badge.

First (and directed more at @timm), I wonder whether the requirement for "replicated" is that the artifact should also be "available". The criteria on https://conf.researchr.org/track/icse-2020/icse-2020-Artifact-Evaluation#Call-for-Submissions state: "Replicated: Available + main results of the paper have been obtained in a subsequent study...". In that case, I am missing information on the availability of the artifact. Has the artifact been awarded an "available" badge before? If so, where? If not, I'll need more information to assess the availability (as described in the submission information under "available").

Second, the paper is neither a strict replication nor a reproduction, I guess.

The ACM webpage on the badges states:

"Replicability (Different team, same experimental setup): The measurement can be obtained with stated precision by a different team using the same measurement procedure, the same measuring system, under the same operating conditions, in the same or a different location on multiple trials. For computational experiments, this means that an independent group can obtain the same result using the author’s own artifacts." --> Here, different classes are used as input, so it is not a strict replication of the original study. At the same time, a strict replication is probably not too useful in our field.

"Reproducibility (Different team, different experimental setup): The measurement can be obtained with stated precision by a different team, a different measuring system, in a different location on multiple trials. For computational experiments, this means that an independent group can obtain the same result using artifacts which they develop completely independently." --> This is not the case. We might refine this for ICSE to mean that the artifact (i.e., a tool like EvoSuite) is reused but tried with new data (on new projects in our case). Here, however, the replication considers the same dataset, although the authors select more classes from that dataset.

At the same time, I think it is also valuable to try some variations, as the authors did. So I tend to be positive, although I have not yet completely understood the criteria.

Please feel free to let me know your interpretations!

Best regards, AnAnonymousReviewer.

JonDoe-ArtifactsTrack commented 4 years ago

Dear all, thanks for your submission. From what I understand, you are requesting a "Replicated" badge for your paper, which requires that the material be "Available" and that the main results of the paper have been obtained in a subsequent study by a person or team other than the authors, using in part the artifacts provided by the authors.

At the moment I am unable to:

verify the Availability of the material, because you have only uploaded the link to the paper and a short abstract. There is no actual material, nor am I able to verify where I can access it.
access either of the papers. Your abstract contains a web link that takes me to the Springer pages for the original EMSE paper and for the replication paper in the Springer proceedings. Unfortunately, my university does not have free access to Springer publications, so I am not able to access either source.

Please provide the information so I can complete my review.

To @timm and @minkull: I have the same doubt as the other reviewer. Do we also have to check for Availability? If that is the case, then in the current state no material has been uploaded that would allow checking it.

thanks.

shayanzamani commented 4 years ago

Dear @AnAnonymousReviewer,

Thanks for your careful review. I totally understand your comment. I agree that the current definition of replication could be changed a little. Replications are here to cast doubt on an already accepted statement. We may not address the question in exactly the same way, but the question itself is exactly the same. In our case, we have stated that it is a partial replication, and we agree that there are some differences from the original paper.

shayanzamani commented 4 years ago

Dear @JonDoe-icse20Artifacts,

My paper is also available for free here. In terms of availability, there is no artifact that I can share with you. I just ran some experiments with the EvoSuite tool. The only thing that I can share is the summary file of my results, which is already provided in the paper as well.

timm commented 4 years ago

I'm going to make an executive decision here:

replicated=yes (indeed, it is only partial, but strict replication is ... well ... too strict a requirement)
reproduced=no (for the reasons stated above so clearly by @AnAnonymousReviewer)

AnAnonymousReviewer commented 4 years ago

Dear @timm,

the other issue raised by us, the reviewers, is that the replicated badge currently also requires availability of the artifact. We might want to change the definition of replicated and remove the requirement that the artifact be available, so that the paper already fits. But as it stands, this requirement does not seem to be fulfilled. Alternatively, the authors could make a fork of EvoSuite available on, e.g., Zenodo to fulfill the availability requirement.

The second question I have is which artifact the "replicated" badge is awarded to. Strictly speaking, this paper is the replication, and it is the original paper about the tool, EvoSuite, that has now been replicated. Maybe this is something to define more carefully for the next ICSE artifact track.

Best regards, AnAnonymousReviewer

researchart / rose6icse

tuning-hyperparameters #142

applying for "replicated"