Vision 1: Evaluation Image Generation

ExpressAI / eaas_client

https://expressai.github.io/autoeval/

Apache License 2.0


Vision 1: Evaluation Image Generation #10

Closed pfliu-nlp closed 2 years ago

pfliu-nlp commented 3 years ago

For each evaluation a user runs, we can define the following information:

1. Define a JSON structure (the configuration). It could look like this:

  { "metrics": [],
    "analysis": "",
    "confidence_interval": ""
  }
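
As a minimal sketch of how such a configuration could be handled in Python, the snippet below mirrors the three proposed keys in a dataclass and round-trips it through JSON. The class name `EvalConfig`, the helper names, and the example values (`"bleu"`, `"rouge1"`, `"bootstrap"`) are hypothetical and only for illustration; nothing here is part of eaas_client.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class EvalConfig:
    """Hypothetical container mirroring the proposed configuration keys."""
    metrics: list = field(default_factory=list)
    analysis: str = ""
    confidence_interval: str = ""


def dump_config(config: EvalConfig) -> str:
    # Serialize through json.dumps so commas and quoting are always valid.
    return json.dumps(asdict(config), indent=2)


def load_config(text: str) -> EvalConfig:
    # Unknown keys raise TypeError, so typos in a config are caught early.
    return EvalConfig(**json.loads(text))


# Illustrative values only; the issue does not fix the allowed metric names.
config = EvalConfig(metrics=["bleu", "rouge1"], confidence_interval="bootstrap")
round_tripped = load_config(dump_config(config))
```

Generating the JSON from a typed object rather than writing it by hand avoids small syntax slips such as a missing comma between keys.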

sacrebleu has a similar JSON output that we can use as a reference:

https://github.com/mjpost/sacrebleu#json-output

{
 "name": "BLEU",
 "score": 20.8,
 "signature": "nrefs:1|case:mixed|eff:no|tok:13a|smooth:exp|version:2.0.0",
 "verbose_score": "54.4/26.6/14.9/8.7 (BP = 1.000 ratio = 1.026 hyp_len = 62880 ref_len = 61287)",
 "nrefs": "1",
 "case": "mixed",
 "eff": "no",
 "tok": "13a",
 "smooth": "exp",
 "version": "2.0.0"
}
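
In the example above, the `signature` string is simply the individual setting fields joined together. The sketch below rebuilds it from those fields to show the idea; `build_signature` and `SIGNATURE_KEYS` are hypothetical names, and this is not sacrebleu's own implementation.

```python
# Setting fields, in the order they appear in the example signature.
SIGNATURE_KEYS = ["nrefs", "case", "eff", "tok", "smooth", "version"]


def build_signature(result: dict, keys=SIGNATURE_KEYS) -> str:
    # Join "key:value" pairs with "|" to form a compact, reproducible signature.
    return "|".join(f"{key}:{result[key]}" for key in keys)


# Values copied from the sacrebleu JSON example.
result = {
    "nrefs": "1",
    "case": "mixed",
    "eff": "no",
    "tok": "13a",
    "smooth": "exp",
    "version": "2.0.0",
}
signature = build_signature(result)
```

A signature of this kind lets two evaluation runs be compared at a glance: if the strings differ, the settings differ.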

2. We automatically generate a description of the evaluation setting, for example:

Our systems are evaluated by SacreROUGE version 2.0, with XX, YY, ZZ. We use XX to calculate the confidence interval for each system.

See also sacrebleu's version signatures: https://github.com/mjpost/sacrebleu#version-signatures
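
One way to generate such a description is to fill a fixed sentence template from the evaluation settings. The sketch below assumes a hypothetical helper `describe_setting`; the placeholders `XX`, `YY`, `ZZ` are kept exactly as in the example sentence because the issue does not specify them.

```python
def describe_setting(tool: str, version: str, options: list, ci_method: str) -> str:
    # Fill the sentence template from the issue with the given settings.
    joined = ", ".join(options)
    return (
        f"Our systems are evaluated by {tool} version {version}, with {joined}. "
        f"We use {ci_method} to calculate the confidence interval for each system."
    )


# Placeholders are left unresolved, as in the original example sentence.
description = describe_setting("SacreROUGE", "2.0", ["XX", "YY", "ZZ"], "XX")
print(description)
```

Because the text is derived from the same settings used to run the evaluation, the description cannot drift out of sync with the actual configuration.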