Is it possible to incorporate POS tag info to aid alignment?

neulab / awesome-align

A neural word aligner based on multilingual BERT

BSD 3-Clause "New" or "Revised" License

325 stars, 47 forks

Hello, and many thanks for sharing the project.

I have an open question for discussion: would it be possible to incorporate the POS tag of each token during training? For example, an additional loss term could penalize POS tag mismatches between aligned source and target tokens. The idea is that if a token is a noun in the source language, its aligned token will most likely also be a noun in the target language. The same would hold for verbs and other coarse-grained POS tags. What are your thoughts on this?
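To make the idea concrete, here is a rough sketch of the kind of penalty I have in mind. It is only an illustration under several assumptions: the function name `pos_mismatch_penalty` is hypothetical and not part of awesome-align, both sides are assumed to be tagged with a shared coarse tag set (for example Universal POS tags), and the aligner is assumed to expose a soft alignment matrix of probabilities. It is written in plain Python for readability; in actual training the same quantity would have to be computed on tensors so that it stays differentiable.

```python
def pos_mismatch_penalty(align_probs, src_tags, tgt_tags):
    """Fraction of alignment mass placed on pairs whose POS tags differ.

    Hypothetical helper, not part of awesome-align.
    align_probs[i][j] is an assumed soft alignment score between
    source token i and target token j.
    """
    total = 0.0
    mismatched = 0.0
    for i, row in enumerate(align_probs):
        for j, p in enumerate(row):
            total += p
            if src_tags[i] != tgt_tags[j]:
                mismatched += p
    # Guard against an all-zero matrix.
    return mismatched / total if total > 0 else 0.0


# Toy example: three source and three target tokens with coarse tags.
src_tags = ["DET", "NOUN", "VERB"]
tgt_tags = ["DET", "NOUN", "VERB"]
align_probs = [
    [0.9, 0.1, 0.0],
    [0.0, 0.8, 0.2],
    [0.0, 0.0, 1.0],
]

penalty = pos_mismatch_penalty(align_probs, src_tags, tgt_tags)
print(penalty)  # about 0.1: 0.3 of the 3.0 total mass sits on mismatched pairs
```

Such a term could be added to the existing training objective with a small weight, so that it nudges alignments toward tag-consistent pairs without forbidding legitimate cross-category alignments.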

Thank you

Is it possible to incorporate POS tag info to aid alignment? #53