I wrote the following few lines of code to test StripAccents, but it did not work:
"""
from tokenizers import normalizers
from tokenizers.normalizers import Strip, StripAccents
normalizer = normalizers.Sequence([Strip(), StripAccents()])
print(normalizer.normalize_str("Héllò hôw are ü? "))
"""
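I can indeed reproduce this, but I think it is expected: StripAccents only removes combining marks, so precomposed characters such as "é" have to be decomposed first. This works: normalizer = normalizers.Sequence([normalizers.NFKD(), StripAccents()]). It prints: Hello how are u?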
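To see why the decomposition step matters, here is a minimal sketch using only the standard library's unicodedata module. It is an approximation of the idea, not the tokenizers implementation, and strip_combining_marks is a hypothetical helper written just for this illustration.

```python
import unicodedata


def strip_combining_marks(text: str) -> str:
    # Hypothetical helper (not part of tokenizers): drop every code point
    # whose Unicode general category is a mark (Mn, Mc, Me).
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("M")
    )


# Same string as in the question, written with escapes so the accented
# letters are guaranteed to be single precomposed code points.
text = "H\u00e9ll\u00f2 h\u00f4w are \u00fc? "

# Without decomposition there are no combining marks to remove.
without_decomposition = strip_combining_marks(text)

# NFKD splits each accented letter into a base letter plus a combining mark,
# which the filter can then drop.
with_decomposition = strip_combining_marks(unicodedata.normalize("NFKD", text))

print(repr(without_decomposition))
print(repr(with_decomposition))
```

The first result is the unchanged input, and the second has the accents removed, which mirrors the difference between the two normalizer sequences above.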