Open sffc opened 1 year ago
In general, in the context of the current discussion, arguments of the form "X is better than Y so we should choose X and not Y"
Yes, strong agree. I'd like us to be talking about pros and cons, not comparing at this stage, because a comparison is not in and of itself a blocking argument.
I think a factor we haven't explicitly considered is whether this is a Rust-specific abstraction or something we also use over FFI.
You address this later, but to explicitly talk about this: We already use DiplomatStr over FFI, and that translates to something natively meaningful on the other side. Which means it's unlikely this abstraction will ever "escape" over FFI.
I think there are basically two options:
1. We agree as a group that the fully spelled out `PotentiallyInvalidUtf8` is fine
After thinking about this more, I'm OK with this. We have IDE autocomplete, etc., to deal with the identifier length. `PotentialUtf8` works, too.
After a look at the `bstr` issues, it's probably better to keep this in ICU4X and not try to change `bstr` to fit ICU4X.
Summary of discussion with @Manishearth @echeran @sffc:
Just to post this somewhere:
I think it would not be completely unreasonable or inconsistent with Rust style to introduce the following type
    pub struct MaybeUtf8(pub [u8]);
    impl MaybeUtf8 {
        pub unsafe fn assume_utf8(&self) -> &str { ... }
        pub fn try_to_utf8(&self) -> Result<&str, Utf8Error> { ... }
    }
The parallelism of `MaybeUninit::assume_init` to `MaybeUtf8::assume_utf8` is just not something I could close this thread without addressing first.
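To make the proposal concrete, here is a minimal sketch of how the elided bodies might be filled in using only the standard library. The `#[repr(transparent)]` attribute and the `from_bytes` constructor are assumptions added for illustration; they are not part of the snippet above, and none of this is an existing ICU4X or zerovec API.

```rust
use std::mem::MaybeUninit;
use std::str::Utf8Error;

/// Hypothetical type from the proposal above (not an existing API).
/// `repr(transparent)` is an added assumption so that a `&[u8]` can be
/// reinterpreted as a `&MaybeUtf8`.
#[repr(transparent)]
pub struct MaybeUtf8(pub [u8]);

impl MaybeUtf8 {
    /// Hypothetical constructor: wraps bytes without validating them.
    pub fn from_bytes(bytes: &[u8]) -> &Self {
        // SAFETY: `MaybeUtf8` is `repr(transparent)` over `[u8]`, so both
        // reference types share the same layout and slice metadata.
        unsafe { &*(bytes as *const [u8] as *const MaybeUtf8) }
    }

    /// # Safety
    /// The caller must guarantee that the bytes are valid UTF-8.
    pub unsafe fn assume_utf8(&self) -> &str {
        unsafe { std::str::from_utf8_unchecked(&self.0) }
    }

    /// Validates the bytes and returns them as a `&str` on success.
    pub fn try_to_utf8(&self) -> Result<&str, Utf8Error> {
        std::str::from_utf8(&self.0)
    }
}

fn main() {
    let good = MaybeUtf8::from_bytes("héllo".as_bytes());
    assert_eq!(good.try_to_utf8(), Ok("héllo"));

    // 0xFF can never appear in well-formed UTF-8.
    let bad = MaybeUtf8::from_bytes(&[0xff, 0xfe]);
    assert!(bad.try_to_utf8().is_err());

    // The stdlib pattern being compared against: write first, then assume.
    let mut slot = MaybeUninit::<u32>::uninit();
    slot.write(7);
    // SAFETY: `slot` was initialized by the `write` call above.
    let n = unsafe { slot.assume_init() };
    assert_eq!(n, 7);

    // SAFETY: `good` was built from a `&str`, so it is valid UTF-8.
    let s = unsafe { good.assume_utf8() };
    assert_eq!(s, "héllo");
}
```

In this sketch both `assume_*` calls place the burden of proof on the caller, which is the parallel the naming would draw.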
I think my main gripe with that is still the usage pattern: the point of this type is that you can often use it without ever validating or assuming UTF-8, and it's quite useful without those two operations. Because of that, it feels very different from the stdlib unsafe helpers, which are more enum-like and whose usage pattern is extremely stateful.
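As a rough illustration of that usage pattern, the sketch below looks up an untrusted key in a sorted table by comparing raw bytes, without ever validating or assuming UTF-8. The `TABLE` contents and the `lookup` function are hypothetical and use plain `&[u8]` in place of the type under discussion.

```rust
/// Hypothetical table, sorted by raw bytes. `str` ordering in Rust is also
/// byte-wise, so sorting by bytes matches sorting the keys as strings.
const TABLE: &[(&[u8], u32)] = &[
    ("de".as_bytes(), 1),
    ("en".as_bytes(), 2),
    ("fr".as_bytes(), 3),
];

/// Hypothetical lookup: the key may be arbitrary bytes. It is only ever
/// compared, never validated as UTF-8 or converted to `&str`.
fn lookup(key: &[u8]) -> Option<u32> {
    TABLE
        .binary_search_by(|(k, _)| (*k).cmp(key))
        .ok()
        .map(|i| TABLE[i].1)
}

fn main() {
    assert_eq!(lookup("en".as_bytes()), Some(2));
    // Ill-formed UTF-8 is simply "not found"; no validation error path exists.
    assert_eq!(lookup(&[0xff, 0xfe]), None);
    assert_eq!(lookup(b"zz"), None);
}
```

Under these assumptions the validating and assuming accessors never come into play, which is the contrast with `MaybeUninit`, where `assume_init` is the only way to get a value out.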
Currently we have zerovec::ule::UnvalidatedStr and zerovec::ule::UnvalidatedChar. For a while, we've been meaning to discuss a more final home and name for these types.
There's nothing really zerovec-specific about these types other than zerovec putting their use case more front and center. They are almost as useful for serde as they are for zerovec.
I'm not a huge fan of the "unvalidated" prefix; I would rather we avoid negations.
How about `schrodinger::SchrödingerStr`? (also re-exported without the diacritic)