Open Nugine opened 1 year ago
That would be effectively the same as simdutf8::compat::from_utf8(value).map(|s| s.to_owned()), yes?
Note that there was some discussion in the past about putting it in the standard library directly: https://www.reddit.com/r/rust/comments/mvc6o5/incredibly_fast_utf8_validation/
Thanks for the answer!
Ah I forgot the original problem.
String::from_utf8 converts Vec<u8> to String with validation. However, simdutf8 can check a slice but not a Vec. You have to use String::from_utf8_unchecked to avoid an extra copy, so there's still no safe replacement for that.
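To make the trade-off concrete, here is a minimal sketch of the two routes described above. It assumes std::str::from_utf8 as a stand-in for a slice-only validator such as simdutf8's, so that it builds without dependencies; the function names validate, to_string_with_copy, and into_string_no_copy are hypothetical and not part of any crate.

```rust
use std::str::Utf8Error;

// Stand-in for a slice-only validator (assumption: in real code this
// would be the SIMD validator; std is used here to avoid dependencies).
fn validate(bytes: &[u8]) -> Result<(), Utf8Error> {
    std::str::from_utf8(bytes).map(|_| ())
}

/// Safe route: validate the slice, then copy it into a new String.
fn to_string_with_copy(bytes: &[u8]) -> Result<String, Utf8Error> {
    std::str::from_utf8(bytes).map(|s| s.to_owned())
}

/// Zero-copy route: validate, then reuse the Vec's allocation.
/// The caller-side `unsafe` is what the thread wants to avoid.
fn into_string_no_copy(bytes: Vec<u8>) -> Result<String, Utf8Error> {
    validate(&bytes)?;
    // SAFETY: `validate` just confirmed that `bytes` is valid UTF-8.
    Ok(unsafe { String::from_utf8_unchecked(bytes) })
}
```

The second function keeps the original heap buffer, which is why it needs String::from_utf8_unchecked; the first is fully safe but allocates and copies.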
Looking into the implementation of from_utf8, this should be quite easy to add:
    #[inline]
    pub fn from_utf8(input: &[u8]) -> Result<&str, Utf8Error> {
        unsafe {
            validate_utf8_basic(input)?;
            Ok(from_utf8_unchecked(input))
        }
    }
and we just add:
    pub mod string {
        pub use super::*;
        #[inline]
        pub fn from_utf8(input: Vec<u8>) -> Result<String, Utf8Error> {
            unsafe {
                validate_utf8_basic(&input)?;
                Ok(String::from_utf8_unchecked(input))
            }
        }
    }
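One design point worth weighing: the proposed function returns Utf8Error and drops the input Vec on failure, whereas std's String::from_utf8 returns a FromUtf8Error that hands the bytes back via into_bytes(). The following is only a sketch of a variant that preserves the buffer. The error type and the name string_from_utf8 are hypothetical, and std::str::from_utf8 again stands in for the SIMD validator.

```rust
use std::str::Utf8Error;

/// Hypothetical error type, modelled on std's FromUtf8Error.
#[derive(Debug)]
pub struct FromUtf8Error {
    bytes: Vec<u8>,
    error: Utf8Error,
}

impl FromUtf8Error {
    /// Recover the original buffer without reallocating.
    pub fn into_bytes(self) -> Vec<u8> {
        self.bytes
    }

    /// Inspect the underlying validation error.
    pub fn utf8_error(&self) -> &Utf8Error {
        &self.error
    }
}

pub fn string_from_utf8(input: Vec<u8>) -> Result<String, FromUtf8Error> {
    // Assumption: std's validator is a placeholder for the SIMD one.
    match std::str::from_utf8(&input) {
        // SAFETY: validation succeeded, so `input` is valid UTF-8.
        Ok(_) => Ok(unsafe { String::from_utf8_unchecked(input) }),
        Err(error) => Err(FromUtf8Error { bytes: input, error }),
    }
}
```

Whether the extra error type is worth it depends on whether callers need to recover the buffer after a failed conversion.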
Currently there is no safe replacement for String::from_utf8 in simdutf8. I think it would be easy to add a function for this.