neulab / awesome-align

A neural word aligner based on multilingual BERT
https://arxiv.org/abs/2101.08231
BSD 3-Clause "New" or "Revised" License

How to get test results? #43

Closed QzzIsCoding closed 2 years ago

QzzIsCoding commented 2 years ago

Hi, thanks for your work. I downloaded the model you released (Ours, multilingually fine-tuned w/o --train_co, softmax), but the test results for the Chinese–English direction differ from those reported in the paper.

I used the following commands:

DATA_FILE=/share/awesome-align/examples/zhen.src-tgt
MODEL_NAME_OR_PATH=/share/awesome-align/model_without_co
OUTPUT_FILE=/share/awesome-align/out/output.src-tgt
OUTPUT_WORD_FILE=/share/awesome-align/out/output.words.src-tgt

CUDA_VISIBLE_DEVICES=0 awesome-align \
    --output_file=$OUTPUT_FILE \
    --model_name_or_path=$MODEL_NAME_OR_PATH \
    --data_file=$DATA_FILE \
    --extraction 'softmax' \
    --output_word_file=$OUTPUT_WORD_FILE \
    --batch_size 32

python tools/aer.py examples/zhen.gold out/output.src-tgt

The test results were as follows:

out/output.src-tgt: 69.3% (30.8%/30.6%/11385)
F-Measure: 0.307
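For context, the 69.3% figure here is the Alignment Error Rate (AER), and the two percentages in parentheses are precision and recall. Under the standard definition, AER = 1 − (|A∩S| + |A∩P|) / (|A| + |S|), where A is the hypothesis, S the sure links, and P the possible links. A minimal sketch of that formula (the exact computation in tools/aer.py may differ in details):

```python
def aer(sure, possible, hypothesis):
    """Alignment Error Rate over sets of (src_idx, tgt_idx) pairs.

    Assumes 'possible' includes the sure links, as in the usual
    gold-alignment convention.
    """
    a, s, p = set(hypothesis), set(sure), set(possible)
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))
```

With a perfect hypothesis (A = S and S ⊆ P) this yields 0.0; a hypothesis sharing no links with the gold yields 1.0, so a value near 0.693 indicates most predicted links miss the reference.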

zdou0830 commented 2 years ago

Hi, the references are one-indexed. You can try the command:

python tools/aer.py ${reference_path} ${file_path} --oneRef
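The mismatch comes from indexing: the gold references number tokens from 1, while the aligner's output numbers them from 0, so scoring without --oneRef compares misaligned index pairs. A sketch of the offset handling (the actual parsing in tools/aer.py may differ; parse_alignment is a hypothetical helper, not a function from the repo):

```python
def parse_alignment(line, one_indexed=False):
    """Parse a line like '1-2 3-4' into a set of 0-based (src, tgt) pairs.

    If one_indexed is True (as for 1-based gold references), shift both
    indices down by one so they match 0-based aligner output.
    """
    offset = 1 if one_indexed else 0
    pairs = set()
    for token in line.split():
        src, tgt = token.split('-')
        pairs.add((int(src) - offset, int(tgt) - offset))
    return pairs
```

With this normalization, a gold line "1-1 2-3" and a system line "0-0 1-2" describe the same links and score as a perfect match.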

QzzIsCoding commented 2 years ago

@zdou0830 Hi, thanks for your reply. I got the same results as those described in the paper with this command.