cisnlp / simalign

Obtain Word Alignments using Pretrained Language Models (e.g., mBERT)
MIT License
347 stars 47 forks

paragraph alignments #35

Closed yhifny closed 1 year ago

yhifny commented 1 year ago

I have paragraphs in German and English, and I want to find out which sentences in the source language map to which sentences in the target language. Can you advise me on how to do that?

pdufter commented 1 year ago

Unfortunately, SimAlign cannot do this: it aligns words within a sentence pair, not sentences within paragraphs. Searching for keywords like "bitext mining" or "sentence alignment" might be useful. As one example, see this paper on crosslingual bitext mining: https://arxiv.org/pdf/1911.04944.pdf
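
To make the "sentence alignment" idea concrete, here is a minimal sketch of one common approach: embed each sentence with a multilingual sentence encoder, then pair sentences that are each other's nearest neighbour by cosine similarity. This is not part of SimAlign. The function names `cosine` and `align_sentences`, the `threshold` parameter, and the toy vectors are all hypothetical, and the embedding step itself is assumed to happen elsewhere with whatever encoder you choose.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def align_sentences(src_vecs, tgt_vecs, threshold=0.5):
    """Return (src_index, tgt_index, score) for mutual best matches.

    src_vecs / tgt_vecs are sentence embeddings, assumed to come from a
    multilingual encoder that places both languages in one vector space.
    The threshold value is an illustrative assumption, not a tuned setting.
    """
    if not src_vecs or not tgt_vecs:
        return []
    # Full similarity matrix: rows are source sentences, columns are target sentences.
    sim = [[cosine(s, t) for t in tgt_vecs] for s in src_vecs]
    # Best target for every source sentence, and best source for every target sentence.
    best_tgt = [max(range(len(tgt_vecs)), key=row.__getitem__) for row in sim]
    best_src = [
        max(range(len(src_vecs)), key=lambda i: sim[i][j])
        for j in range(len(tgt_vecs))
    ]
    pairs = []
    for i, j in enumerate(best_tgt):
        # Keep a pair only if the match is mutual and similar enough.
        if best_src[j] == i and sim[i][j] >= threshold:
            pairs.append((i, j, sim[i][j]))
    return pairs


# Toy vectors standing in for real sentence embeddings (made up for illustration).
src = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
tgt = [[0.0, 0.9, 0.1], [0.9, 0.1, 0.0], [0.1, 0.0, 0.9]]
for i, j, score in align_sentences(src, tgt):
    print(f"source sentence {i} <-> target sentence {j} (cosine {score:.2f})")
```

Mutual nearest neighbours only recover one-to-one matches. Sentences that are split or merged in translation, and sentences with no counterpart, need a more careful method such as the margin-based scoring used in the bitext mining literature.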