Open lkurlandski opened 2 weeks ago
Hello. I think there are some problems with `NormalizedString` (tokenizers 0.15.2).
In the following example, `append()` works as expected.

    from tokenizers import NormalizedString

    s = NormalizedString("Hi.")  # NormalizedString(original="Hi.", normalized="Hi.")
    s.append("Hello.")  # NormalizedString(original="Hi.", normalized="Hi. Hello.")

After using `clear()`, `append()` no longer modifies the `normalized` attribute.

    from tokenizers import NormalizedString

    s = NormalizedString("Hi.")  # NormalizedString(original="Hi.", normalized="Hi.")
    s.clear()  # NormalizedString(original="Hi.", normalized="")
    s.append("Hello.")  # NormalizedString(original="Hi.", normalized="")

This is also a problem with `prepend`.
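
For reference, the `prepend` case can be checked the same way. This is only a minimal sketch: it assumes `prepend()` takes a single string like `append()` does, and the helper name `normalized_after_clear_then_prepend` is made up for illustration. The comments describe what the report above claims for 0.15.2, not verified output.

```python
from tokenizers import NormalizedString


def normalized_after_clear_then_prepend(original: str, prefix: str) -> str:
    """Clear a NormalizedString, prepend to it, and return the normalized text.

    Hypothetical helper for illustration; not part of tokenizers.
    """
    s = NormalizedString(original)
    s.clear()          # normalized becomes "", original is kept
    s.prepend(prefix)  # per the report, normalized stays "" on 0.15.2
    return s.normalized


result = normalized_after_clear_then_prepend("Hi.", "Hello.")
# An empty string reproduces the reported behaviour;
# "Hello." would mean prepend() worked after clear().
print(repr(result))
```

If the printed value is an empty string, the issue reproduces for `prepend` on the installed version.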
Indeed, would you like to have a go at it and open a PR? 🤗