cisnlp / simalign

Obtain Word Alignments using Pretrained Language Models (e.g., mBERT)
MIT License

Is there a way to adjust the threshold τ? #28

Closed (creolio closed this issue 2 years ago)

creolio commented 2 years ago


I'd like to reduce the threshold so that fewer words are aligned, since words are sometimes aligned incorrectly. Is there a way to adjust this threshold?

masoudjs commented 2 years ago

Hi,

If you use the script "scripts/align_files.py" to align files (rather than individual sentences), you can adjust the threshold with the "--null-align" argument. It takes a floating-point value in the range [0.0, 1.0] (default=1.0) that indicates the percentile of alignments that should be accepted, so with the default of 1.0 every alignment is kept and lower values keep fewer.
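
To illustrate what "the percentile of alignments that should be accepted" could mean in practice, here is a minimal Python sketch. It is not simalign's actual implementation: the function name `keep_top_fraction`, the candidate pairs, and their scores are all hypothetical, and the sketch simply assumes that each candidate alignment has a similarity score and that the lowest-scoring ones are dropped first.

```python
import math


def keep_top_fraction(scored_pairs, fraction=1.0):
    """Keep the highest-scoring `fraction` of candidate alignment pairs.

    Hypothetical helper for illustration only; it is not part of simalign.
    `scored_pairs` is a list of ((src_index, tgt_index), score) tuples.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0.0, 1.0]")
    # Number of pairs to keep, rounded down.
    n_keep = math.floor(len(scored_pairs) * fraction)
    # Rank candidates from most to least similar.
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)
    return [pair for pair, _score in ranked[:n_keep]]


# Made-up candidate alignments with made-up similarity scores.
candidates = [((0, 0), 0.91), ((1, 2), 0.42), ((2, 1), 0.77), ((3, 3), 0.15)]

print(keep_top_fraction(candidates, 1.0))  # keeps all four pairs
print(keep_top_fraction(candidates, 0.5))  # keeps only the two strongest pairs
```

Under this reading, a value such as 0.5 would leave the weakest half of the candidate word pairs unaligned, which matches the goal of aligning fewer words.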

If you have any other questions, please ask.