amazon-science / Repoformer

Repoformer: Selective Retrieval for Repository-Level Code Completion (ICML 2024)
https://repoformer.github.io
Apache License 2.0

Which ES function is used in the RepoEval dataset? #4

Open sxthunder opened 1 month ago

sxthunder commented 1 month ago

Your work is a significant contribution to the field, and I am currently working with the RepoEval dataset to build on your research.

While exploring the codebase associated with the paper, I came across two separate functions for calculating ES scores. In my experiments, the two functions yield different ES scores, which caused some confusion. To make sure my results align with the methodology of your study, may I ask which function was used to report the results in your publication?

Function 1:

# Assumes `fuzz` comes from the fuzzywuzzy (or thefuzz) package.
from fuzzywuzzy import fuzz

def cal_edit_sim(references, hypotheses):
    """Average fuzz.ratio between each hypothesis and its reference, on a 0-100 scale."""
    total = len(references)
    edit_sim = 0.0
    for pred, gt in zip(hypotheses, references):
        pred = pred.strip()
        gt = gt.strip()
        edit_sim += fuzz.ratio(pred, gt)  # percentage similarity in [0, 100]
    return edit_sim / total

Function 2:

import editdistance  # Levenshtein distance from the `editdistance` package

def cal_edit_sim_repoeval(references, hypotheses):
    """Average normalized edit similarity on a 0-1 scale (RepoCoder-style)."""
    total = len(references)
    edit_sim = 0.0
    for pred, gt in zip(hypotheses, references):
        pred = pred.strip()
        gt = gt.strip()
        if max(len(pred), len(gt)) == 0:
            continue  # both strings empty: skip the pair (it still counts toward `total`)
        edit_sim += 1 - editdistance.eval(pred, gt) / max(len(pred), len(gt))
    return edit_sim / total
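For context, a quick check with a made-up pair (assuming `fuzz` comes from the fuzzywuzzy package and `editdistance` is the editdistance package) suggests the two functions differ both in scale and in the exact similarity formula:

from fuzzywuzzy import fuzz
import editdistance

pred, gt = "return a+b", "return a + b"  # hypothetical prediction / ground-truth pair
print(fuzz.ratio(pred, gt))                                       # Function 1: percentage in [0, 100]
print(1 - editdistance.eval(pred, gt) / max(len(pred), len(gt)))  # Function 2: fraction in [0, 1]
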
xiaowu0162 commented 1 month ago

Hi @sxthunder,

For the RepoEval scores reported in the paper, we used the second implementation, which is taken from https://github.com/microsoft/CodeT/blob/35f54d60b152cc31d134b788e702878ad613d9f7/RepoCoder/compute_score.py#L23-L33.
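
As a quick illustration, using cal_edit_sim_repoeval as defined above with made-up reference/hypothesis pairs, the reported ES is just the average of the per-pair normalized similarity:

import editdistance  # required by cal_edit_sim_repoeval above

# Hypothetical pairs for illustration only; the real evaluation compares model
# completions against the RepoEval ground-truth targets.
references = ["return self.cache[key]", "for item in items:"]
hypotheses = ["return self.cache[key]", "for it in items:"]

print(cal_edit_sim_repoeval(references, hypotheses))  # average similarity on a 0-1 scale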