davidkim205 / translation

Verify BLEU scores by model and by src on the komt-1810k-test dataset #7

Closed sudog1 closed 7 months ago

sudog1 commented 7 months ago
| Model | BLEU | Self-BLEU | Null | Wrong length | Duplicate |
|---|---|---|---|---|---|
| cloud google api | 0.400 | 0.491 | 0 | 3 | 0 |
| cloud deepl api | 0.394 | 0.451 | 0 | 1 | 0 |
| cloud azure api | 0.400 | 0.486 | 0 | 2 | 0 |
| huggingface iris-7b (ours) | 0.398 | 0.435 | 0 | 3 | 0 |
| huggingface TowerInstruct | 0.319 | 0.350 | 0 | 7 | 1 |
| huggingface madlad400 | 0.295 | 0.377 | 0 | 6 | 3 |
| huggingface gugugo | 0.312 | 0.364 | 1 | 7 | 0 |
| huggingface mbart50 | 0.056 | 0.052 | 139 | 139 | 0 |
| huggingface nllb200 | 0.262 | 0.297 | 0 | 3 | 3 |
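
Going by the headings later in this issue, two comparisons are scored: a model's output (`trans`) against the reference (`label`), and the source `text` against its round-trip re-translation (`re_trans`). The issue does not say which BLEU implementation, tokenizer, or smoothing was used, so the following is only a minimal sketch of the textbook formula with whitespace tokenization and no smoothing. The function `sentence_bleu` and the example row are hypothetical and are not taken from the repository or from komt-1810k-test.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, hypothesis, max_n=4):
    """Unsmoothed sentence-level BLEU with whitespace tokenization.

    Assumption: the issue does not state the BLEU variant actually used;
    this is the textbook formula, not the project's implementation.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not hyp:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = ngram_counts(hyp, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(hyp_counts.values())
        # Clipped matches: each hypothesis n-gram counts at most as often
        # as it occurs in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # no smoothing: one empty precision zeroes the score
        log_precision_sum += math.log(matched / total)
    # Brevity penalty for hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(log_precision_sum / max_n)


# Hypothetical example row (placeholder text, not from the test set).
row = {
    "label": "the weather is very nice today",
    "trans": "the weather is very nice today",
    "text": "오늘 날씨가 매우 좋습니다",
    "re_trans": "오늘 날씨가 매우 좋습니다",
}
bleu = sentence_bleu(row["label"], row["trans"])         # trans vs label
self_bleu = sentence_bleu(row["text"], row["re_trans"])  # text vs re_trans
print(bleu, self_bleu)
```

Because this sketch is unsmoothed, any sentence without a matching 4-gram scores zero. A library implementation with smoothing would give different absolute numbers, so the values in the table above should not be expected to reproduce from this code.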

Ranking of iris_7b and its checkpoints

translation

  1. 105000 - 0.401
  2. 120000 - 0.400
  3. 115000 - 0.395
  4. 95000 - 0.395
  5. 55000 - 0.386
  6. 70000 - 0.380

translation_self

  1. 95000 - 0.437
  2. 105000 - 0.436
  3. 55000 - 0.436
  4. 115000 - 0.435
  5. 120000 - 0.433
  6. 70000 - 0.432

Average by model

translation (comparing text with re_trans)

| Model | avg_bleu_score |
|---|---|
| TowerInstruct | 0.35 |
| google | 0.49 |
| gugugo | 0.37 |
| mistral | 0.34 |

translation2 (comparing trans with label)

| Model | avg_bleu_score |
|---|---|
| TowerInstruct | 0.32 |
| iris_qwen_7b | 0.24 |
| google | 0.4 |
| madlad400 | 0.3 |
| iris_qwen_4b | 0.07 |
| gugugo | 0.31 |
| mbart50 | 0.06 |
| nllb200 | 0.27 |
| iris_mistral | 0.32 |

Average by src

translation (comparing text with re_trans)

| Dataset | Score |
|---|---|
| aihub-MTPE | 0.39 |
| aihub-techsci2 | 0.37 |
| aihub-expertise | 0.27 |
| aihub-humanities | 0.33 |
| sharegpt-deepl-ko-translation | 0.51 |
| aihub-MT-new-corpus | 0.38 |
| aihub-socialsci | 0.39 |
| korean-parallel-corpora | 0.34 |
| aihub-parallel-translation | 0.33 |
| aihub-food | 0.44 |
| aihub-techsci | 0.45 |
| para_pat | 0.4 |
| aihub-speechtype-based-machine-translation | 0.44 |
| koopus100 | 0.44 |
| aihub-basicsci | 0.33 |
| aihub-broadcast-content | 0.34 |
| aihub-patent | 0.31 |
| aihub-colloquial | 0.48 |

translation2 (comparing trans with label)

| Dataset | Score |
|---|---|
| aihub-MTPE | 0.4 |
| aihub-techsci2 | 0.26 |
| aihub-expertise | 0.21 |
| aihub-humanities | 0.22 |
| sharegpt-deepl-ko-translation | 0.39 |
| aihub-MT-new-corpus | 0.3 |
| aihub-socialsci | 0.27 |
| korean-parallel-corpora | 0.09 |
| aihub-parallel-translation | 0.23 |
| aihub-food | 0.35 |
| aihub-techsci | 0.28 |
| para_pat | 0.23 |
| aihub-speechtype-based-machine-translation | 0.32 |
| koopus100 | 0.17 |
| aihub-basicsci | 0.18 |
| aihub-broadcast-content | 0.28 |
| aihub-patent | 0.15 |
| aihub-colloquial | 0.24 |

Average by model and src

translation (comparing text with re_trans)

| Dataset | google_translation | gugugo | iris_mistral | TowerInstruct |
|---|---|---|---|---|
| average_bleu_score | 0.49 | 0.37 | 0.34 | 0.35 |
| aihub-MTPE | 0.47 | 0.39 | 0.34 | 0.37 |
| aihub-techsci2 | 0.45 | 0.29 | 0.39 | 0.35 |
| aihub-expertise | 0.41 | 0.21 | 0.2 | 0.27 |
| aihub-humanities | 0.44 | 0.31 | 0.3 | 0.27 |
| sharegpt-deepl-ko-translation | 0.58 | 0.62 | 0.52 | 0.34 |
| aihub-MT-new-corpus | 0.49 | 0.34 | 0.38 | 0.3 |
| aihub-socialsci | 0.53 | 0.33 | 0.34 | 0.36 |
| korean-parallel-corpora | 0.41 | 0.33 | 0.23 | 0.41 |
| aihub-parallel-translation | 0.43 | 0.27 | 0.27 | 0.34 |
| aihub-food | 0.55 | 0.43 | 0.42 | 0.36 |
| aihub-techsci | 0.48 | 0.4 | 0.49 | 0.45 |
| para_pat | 0.51 | 0.32 | 0.33 | 0.45 |
| aihub-speechtype-based-machine-translation | 0.61 | 0.43 | 0.32 | 0.41 |
| koopus100 | 0.52 | 0.39 | 0.42 | 0.44 |
| aihub-basicsci | 0.44 | 0.27 | 0.32 | 0.29 |
| aihub-broadcast-content | 0.45 | 0.3 | 0.28 | 0.31 |
| aihub-patent | 0.51 | 0.49 | 0.11 | 0.14 |
| aihub-colloquial | 0.57 | 0.46 | 0.43 | 0.44 |
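
As a sanity check, the per-src scores listed earlier for this comparison can be compared with the unweighted mean of the four model columns above. The issue does not state how the src averages were computed, so treating them as a plain mean over these four models is an assumption. The sketch below covers only five rows, with values copied from the table; `per_model_scores` and `src_average` are illustrative names.

```python
from statistics import mean

# Column order: google_translation, gugugo, iris_mistral, TowerInstruct.
# Values copied from the model-by-src table above (five rows only).
per_model_scores = {
    "aihub-MTPE": [0.47, 0.39, 0.34, 0.37],
    "aihub-techsci2": [0.45, 0.29, 0.39, 0.35],
    "aihub-expertise": [0.41, 0.21, 0.2, 0.27],
    "aihub-food": [0.55, 0.43, 0.42, 0.36],
    "aihub-patent": [0.51, 0.49, 0.11, 0.14],
}

# Assumption: an unweighted mean over the four models, rounded to 2 digits.
src_average = {
    src: round(mean(scores), 2) for src, scores in per_model_scores.items()
}

for src, score in src_average.items():
    print(f"{src}: {score:.2f}")
```

For these five rows the result agrees with the per-src table (0.39, 0.37, 0.27, 0.44, 0.31), which is consistent with, but does not prove, a simple mean across models.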

translation2 (comparing trans with label)

| Dataset | google_translation | gugugo | iris_mistral | TowerInstruct |
|---|---|---|---|---|
| average_bleu_score | 0.4 | 0.31 | 0.32 | 0.32 |
| aihub-MTPE | 0.62 | 0.45 | 0.41 | 0.46 |
| aihub-techsci2 | 0.4 | 0.28 | 0.33 | 0.3 |
| aihub-expertise | 0.32 | 0.25 | 0.24 | 0.27 |
| aihub-humanities | 0.32 | 0.23 | 0.24 | 0.26 |
| sharegpt-deepl-ko-translation | 0.59 | 0.67 | 0.52 | 0.41 |
| aihub-MT-new-corpus | 0.44 | 0.32 | 0.44 | 0.36 |
| aihub-socialsci | 0.46 | 0.35 | 0.29 | 0.36 |
| korean-parallel-corpora | 0.14 | 0.11 | 0.11 | 0.14 |
| aihub-parallel-translation | 0.39 | 0.29 | 0.31 | 0.34 |
| aihub-food | 0.59 | 0.43 | 0.37 | 0.47 |
| aihub-techsci | 0.43 | 0.34 | 0.39 | 0.42 |
| para_pat | 0.35 | 0.24 | 0.35 | 0.3 |
| aihub-speechtype-based-machine-translation | 0.46 | 0.38 | 0.47 | 0.43 |
| koopus100 | 0.23 | 0.23 | 0.2 | 0.21 |
| aihub-basicsci | 0.28 | 0.25 | 0.24 | 0.23 |
| aihub-broadcast-content | 0.47 | 0.41 | 0.38 | 0.32 |
| aihub-patent | 0.4 | 0.16 | 0.21 | 0.16 |
| aihub-colloquial | 0.36 | 0.26 | 0.32 | 0.33 |