Unfortunately, using `iconv` makes it more likely that the converted text is shorter or longer than the original. For example, converting the euro symbol (€) artificially increases the length of the texts we are comparing.
This is problematic, as it results in incorrectly aligned `<mark>` elements. While this can be mitigated by carefully calculating corrections for the offsets, doing so quickly makes this functionality harder to maintain, especially when more of these exceptions need to be handled.
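To make the misalignment concrete, here is a minimal, language-neutral sketch in Python. It is not the code from this PR: `expand_translit` and `mark` are hypothetical helpers, and the `€` to `EUR` mapping is an assumption standing in for what some `iconv` transliteration setups produce.

```python
# Hypothetical stand-in for an iconv-style transliteration that expands
# one character into several (assumption: "€" becomes "EUR").
def expand_translit(text: str) -> str:
    return text.replace("€", "EUR")


def mark(original: str, needle: str) -> str:
    """Wrap `needle` in <mark>, searching in the converted text."""
    converted = expand_translit(original)
    start = converted.find(needle)
    if start == -1:
        return original
    end = start + len(needle)
    # The problem being illustrated: offsets found in `converted` are
    # applied to `original`, which is two characters shorter here.
    return original[:start] + "<mark>" + original[start:end] + "</mark>" + original[end:]


original = "Fee: 5€ for members"
print(len(original), len(expand_translit(original)))  # 19 21
print(mark(original, "members"))  # Fee: 5€ for me<mark>mbers</mark>
```

The highlight lands two characters too late because every offset after the `€` is shifted by the extra characters. A conversion that keeps the length unchanged avoids the need for any such correction.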
Using only the transliterator with `Any-Latin; Latin-ASCII` seems to preserve the length of the compared texts while still allowing searches for accented and special characters. Some characters have no equivalent in Latin-ASCII; however, these are probably never used in the context of the association.
This is a bug fix for GH-1764.