Open ghost opened 3 years ago
Thanks for the report - yes, this kind of thing is always possible with arbitrary byte patterns. Not sure what the policy here should be.
I think we need gsub_bytes
- there's already gsub_bytes_with
.
Although what the format of the second replacement argument is - not yet sure!
I think that any function that handles strings should just be a wrapper around the one which handles bytes but then calls vec.into()
. such as
gsub(&mut self , text: &str , repl: &str) -> String{
self.gsub_bytes(text.as_bytes() , repl.as_bytes).into()
}
this is a zero copy operation and only needs to scan for invalid bytes
also does the functions passed to gsub_*_with really need to return an owned type could they not return a &[u8] or &str
also does the functions passed to gsub_*_with really need to return an owned type could they not return a &[u8] or &str
We can do what the regex
crate does, and return an appropriate Cow
to optimize the case where no substitution took place.
I think that any function that handles strings should just be a wrapper around the one which handles bytes but then calls
vec.into()
. such as
This is pretty much how gsub
is implemented now - only really have to do the bytes case!
There is also gsub_checked
which returns a Result
, to avoid the panicking problem.
The ⚠ is made up of the bytes [226 , 154 , 160] removing the 160 causes the string to become invalid unicode