lilt / alignment-scripts

Scripts to preprocess training and test data and to run fast_align and giza
MIT License

alignment-scripts

Scripts to preprocess training and test data for alignment experiments and to run and evaluate FastAlign and Mgiza.

Dependencies

Usage Instructions

Results

All results are in percent in the format: AlignmentErrorRate (Precision/Recall)
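
As a reading aid for the numbers below, here is a minimal sketch of how the three metrics are commonly computed from a hypothesis alignment and a gold standard with sure and possible links. It follows the widely used definition of Alignment Error Rate from the word alignment literature. That this matches the evaluation code in this repository is an assumption, and the function name `alignment_scores` and the toy data are hypothetical.

```python
def alignment_scores(hypothesis, sure, possible):
    """Return (aer, precision, recall) as fractions in [0, 1].

    Each argument is an iterable of (source_index, target_index) links.
    Hypothetical helper for illustration; not taken from this repository.
    """
    a = set(hypothesis)
    s = set(sure)
    # Sure links are conventionally also counted as possible links.
    p = set(possible) | s
    if not a or not s:
        raise ValueError("hypothesis and sure alignments must be non-empty")

    precision = len(a & p) / len(a)
    recall = len(a & s) / len(s)
    aer = 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))
    return aer, precision, recall


# Toy example: three predicted links, two sure gold links, one possible link.
hyp = {(0, 0), (1, 1), (2, 3)}
sure = {(0, 0), (1, 1)}
possible = {(2, 2)}

aer, precision, recall = alignment_scores(hyp, sure, possible)
# Same layout as the tables: AlignmentErrorRate (Precision/Recall)
print(f"{aer:.1%} ({precision:.1%}/{recall:.1%})")  # prints 20.0% (66.7%/100.0%)
```

Lower AER is better, while higher precision and recall are better. This explains why a symmetrized alignment can trade recall for precision and still improve AER.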

German to English

| Method | DeEn | EnDe | Grow-Diag | Grow-Diag-Final |
| --- | --- | --- | --- | --- |
| FastAlign | 28.4% (71.3%/71.8%) | 32.0% (69.7%/66.4%) | 27.0% (84.6%/64.1%) | 27.7% (80.7%/65.5%) |
| Mgiza | 21.0% (86.2%/72.8%) | 23.1% (86.6%/69.0%) | 21.4% (94.3%/67.2%) | 20.6% (91.3%/70.2%) |

Romanian to English

| Method | RoEn | EnRo | Grow-Diag | Grow-Diag-Final |
| --- | --- | --- | --- | --- |
| FastAlign | 33.8% (71.8%/61.3%) | 35.5% (70.6%/59.4%) | 32.1% (85.1%/56.5%) | 32.2% (81.4%/58.1%) |
| Mgiza | 28.7% (82.7%/62.6%) | 32.2% (79.5%/59.1%) | 27.9% (94.0%/58.5%) | 26.4% (90.9%/61.8%) |

English to French

| Method | EnFr | FrEn | Grow-Diag | Grow-Diag-Final |
| --- | --- | --- | --- | --- |
| FastAlign | 16.4% (80.0%/90.1%) | 15.9% (81.3%/88.7%) | 10.5% (90.8%/87.8%) | 12.1% (87.7%/88.3%) |
| Mgiza | 8.0% (91.4%/92.9%) | 9.8% (91.6%/88.3%) | 5.9% (97.5%/89.7%) | 6.2% (95.5%/91.6%) |

Known Issues