Closed. jiangweiatgithub closed this issue 1 year ago.
We have not yet investigated confidence scores for alignments. I guess a straightforward approach would be to consider the (average) similarity of the aligned words. I will not have time to investigate this in the next week, but feel free to create a pull request.
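For what it's worth, the averaging idea could look roughly like the sketch below. It is not part of SimAlign: the function name `alignment_confidence`, the similarity matrix, and the alignment pairs are all made up for illustration, and it assumes you already have a source-by-target similarity matrix and a list of aligned index pairs.

```python
import numpy as np

def alignment_confidence(sim, alignment):
    """Return (per-pair similarities, their mean) for aligned index pairs.

    sim:       2-D array, sim[i, j] = similarity of source word i and target word j
    alignment: iterable of (source_index, target_index) pairs
    Hypothetical helper, not a SimAlign API.
    """
    alignment = list(alignment)
    if not alignment:
        # No aligned words: there is nothing to average over.
        return [], 0.0
    scores = [float(sim[i, j]) for i, j in alignment]
    return scores, float(np.mean(scores))

# Made-up 2x3 similarity matrix (rows: source words, columns: target words).
sim = np.array([[0.9, 0.2, 0.1],
                [0.3, 0.4, 0.8]])
alignment = [(0, 0), (1, 2)]

scores, confidence = alignment_confidence(sim, alignment)
print(scores, confidence)
```

The per-pair scores may be the more useful part if the goal is to spot individual misaligned words: pairs whose similarity is far below the sentence average would be the first candidates to check.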
If it is of any help to you, @jiangweiatgithub, there is already a pull request that does this (#4).
Do you have any recommendations or tool suggestions on how to calculate the similarity of one specific word in one language to a specific word in a different language?
Here is an interesting article that provides complete Python code: https://www.tensorflow.org/hub/tutorials/cross_lingual_similarity_with_tf_hub_multilingual_universal_encoder
@creolio I'm not sure I understand your question correctly. SimAlign creates alignments based on these similarities. Feeding just two words, e.g. "cat" and "Katze", to SimAlign and looking at the output of the get_similarity method might suit your purpose.
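To illustrate what a word-to-word similarity boils down to, here is a minimal sketch using plain cosine similarity on two embedding vectors. The vectors are invented placeholders; in practice they would come from a multilingual embedding model. This does not claim to reproduce what get_similarity does internally.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity of two 1-D vectors; returns 0.0 if either is all zeros."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    if denom == 0.0:
        return 0.0
    return float(np.dot(u, v) / denom)

# Made-up 3-dimensional "embeddings", for illustration only.
cat = [0.8, 0.1, 0.3]
katze = [0.7, 0.2, 0.3]
table = [0.1, 0.9, 0.0]

print(cosine_similarity(cat, katze))  # higher: vectors point in similar directions
print(cosine_similarity(cat, table))  # lower: vectors point in different directions
```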
Thank you. I had to put this aspect of the project on pause, but when I get back to it, I'll try this out. Sounds about right, tho.
I need this feature to find possibly misaligned words. Thanks!