Open sffc opened 5 months ago
I think `FromStr::from_str` should dictate naming. It is not an ideal name, but it's the name Rust uses, and we should follow convention.
I like the `TinyAsciiStr` model: it has a `FromStr::from_str`, a `pub const fn from_str(s: &str) -> Result<Self>`, which shadows the trait method, and an analogous `from_bytes`. Having an inherent `from_str` shadowing the trait is good for two reasons: it can be `const`, which the trait method cannot be, and it can be called without importing `FromStr` (which is not in the prelude). I think this method should not be called `try_from_str`, because it is `FromStr::from_str`. Giving it a second name is confusing, and it removes the shadow, so callers would have to either use the different name, or import `core::str::FromStr`.
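A minimal sketch of the shadowing pattern described above, not the actual `TinyAsciiStr` code. The type `Code` and the error `ParseError` are hypothetical names invented for illustration; only `FromStr` and the inherent-over-trait resolution rule come from Rust itself.

```rust
use std::str::FromStr;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct ParseError;

/// Hypothetical two-letter lowercase ASCII code, used only for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Code([u8; 2]);

impl Code {
    /// Inherent constructor from bytes; it can be `const`.
    pub const fn from_bytes(bytes: &[u8]) -> Result<Self, ParseError> {
        if bytes.len() != 2 {
            return Err(ParseError);
        }
        if !bytes[0].is_ascii_lowercase() || !bytes[1].is_ascii_lowercase() {
            return Err(ParseError);
        }
        Ok(Self([bytes[0], bytes[1]]))
    }

    /// Same name and shape as `FromStr::from_str`, so `Code::from_str`
    /// resolves here whether or not the trait is imported.
    pub const fn from_str(s: &str) -> Result<Self, ParseError> {
        Self::from_bytes(s.as_bytes())
    }
}

impl FromStr for Code {
    type Err = ParseError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Inherent associated functions take priority over trait methods
        // in path resolution, so this call does not recurse.
        Code::from_str(s)
    }
}

/// Evaluated at compile time, which the trait method alone would not allow.
pub const EN: Result<Code, ParseError> = Code::from_str("en");
```

With this shape, `"en".parse::<Code>()` and `Code::from_str("en")` return the same value, and only the latter is usable in a `const` item.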
So my suggestion is:
- `(const) fn from_bytes(&[u8]) -> Result<Self, E>` and `FromStr`
- `const fn from_str`
- `fn from_str` anyway, for calling convenience
- `(const) fn from_str(&str) -> Self`, `(const) fn from_bytes(&[u8]) -> Self`
- `FromStr` with `Err = Infallible`
These methods would have to be renamed to follow that scheme:
- `icu::locid::{lots of types}::try_from_bytes` (2.0-breaking)
- `icu::timezone::{CustomTimeZone, GmtOffset}::try_from_bytes` (2.0-breaking)
- `zerovec::ZeroSlice::try_from_bytes` (util)
These methods should be `FromStr::from_str` anyway, will send a PR:
- `icu::experimental::relativetime::provider::SingularSubPattern::try_from_str` (experimental, provider)
- `icu_pattern::Pattern::try_from_str` (util, unreleased)
I roughly agree with Robert's stance with the caveat that I'm not a huge fan of having inherent methods named the same as trait methods that are implemented by the type (so the "name something the same to get it const" is not that great, for me).
I strongly prefer using `try_` for fallible non-trait methods.
> It is not an ideal name, but it's the name Rust uses, and we should follow convention.
I consider `FromStr` to be a bit legacy not only because of the lack of `try_` but also because its error type is called `Err` instead of `Error` as used in newer traits like `TryFrom`, whose convention I prefer to follow.
Also, people don't usually call `FromStr::from_str` directly because (1) it's not in the prelude and (2) people use `.parse()`. I see it as a means to an end to be more consistent with the Rust ecosystem, but we should define our own constructors on top of `impl FromStr`.
Anecdotally, I have found it misleading when the only constructor for various ICU4X types is `impl FromStr`. I would rather look at the docs and see a `try_from_...` and then I instantly know that's how to construct the thing.
`const fn from_str` is interesting; I hadn't thought of that case in the OP. I think I prefer `try_from_str` because the function shows up in our docs as a constructor.
> with the caveat that I'm not a huge fan of having inherent methods named the same as trait methods that are implemented by the type
@Manishearth If you hold this position, should we reconsider https://github.com/unicode-org/icu4x/issues/4590?
> `(const) fn from_str(&str) -> Self`
This is problematic because the signature is different than the trait function `FromStr::from_str`. I think if a function shadows, it should have the same signature.
How about this revised proposal:
- `impl FromStr`
- `pub fn try_from_bytes` -- `const` if possible
- `pub fn try_from_str` for discoverability -- `const` if possible.
- `impl FromStr` with `type Err = Infallible`
- `pub fn from_bytes` -- `const` if possible
- [...] `from_str` because the signature is different than the trait function. May optionally consider a name such as `new_from_str` or `from_id` or something.
> `(const) fn from_str(&str) -> Self`
> This is problematic because the signature is different than the trait function `FromStr::from_str`. I think if a function shadows, it should have the same signature.
I'm fine not implementing `FromStr` in the infallible case.
Actually I'm wholly against `FromStr` for infallible. `from_str` should be an inherent method, and it should not implement the trait method because the signature is different than the inherent function, we don't have infallible unwrapping (so this will be annoying to use), and there is usually no "parsing", just rewrapping (cf `UnvalidatedStr`), so this being available through `str::parse` is weird.
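To make the objection concrete, here is a rough sketch of what an infallible `FromStr` would look like next to an inherent `from_str`. The `Label` type and the `via_parse` helper are hypothetical and exist only for this illustration; they are not ICU4X APIs.

```rust
use std::convert::Infallible;
use std::str::FromStr;

/// Hypothetical wrapper that accepts any string without validation.
#[derive(Debug, PartialEq, Eq)]
pub struct Label(String);

impl Label {
    /// Inherent infallible constructor: returns `Self` directly.
    pub fn from_str(s: &str) -> Self {
        Self(s.to_owned())
    }
}

/// The trait impl under debate: same name, but the trait forces a `Result`,
/// so the two `from_str` signatures differ.
impl FromStr for Label {
    type Err = Infallible;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Resolves to the inherent function above, which returns `Self`.
        Ok(Label::from_str(s))
    }
}

/// Going through `parse` means the caller still has to discharge an error
/// that can never occur.
pub fn via_parse(s: &str) -> Label {
    match s.parse::<Label>() {
        Ok(label) => label,
        Err(never) => match never {},
    }
}
```

The inherent call is `Label::from_str("x")`, while the `parse` route needs the extra `match` (or an `unwrap`) even though nothing is being validated.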
> Also, people don't usually call `FromStr::from_str` directly because (1) it's not in the prelude and (2) people use `.parse()`. I see it as a means to an end to be more consistent with the Rust ecosystem, but we should define our own constructors on top of `impl FromStr`.
We actually use `from_str` in our docs a lot. From these three declarations, I vastly prefer the first:
let foo = Foo::from_str("foo")?;
let foo = "foo".parse::<Foo>()?;
let foo: Foo = "foo".parse()?;
I think `try_from_utf8` is slightly better than `try_from_bytes` as it makes a normalized API space for utf8/utf16 (which is what we'll want in some hot paths like Locale parsing) and aligns with https://doc.rust-lang.org/std/str/fn.from_utf8.html .
Suggestion:
- `try_from_utf8` / `try_from_utf16` + `TryFrom<&[u8]>` / `TryFrom<&[u16]>`
- `from_utf8` / `from_utf16` + `From<&[u8]>` / `From<&[u16]>`
- `FromStr` + `TryFrom<&str>/From<&str>`
We don't need to implement all of them for all constructors - just normalize the naming scheme so that we can introduce the ones we find useful incrementally.
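One possible shape for the fallible row of that suggestion, as a sketch only. `Lang`, `ParseError`, and the private helper `from_units` are hypothetical names; the validation rule (2 or 3 lowercase ASCII letters) is an assumption chosen to keep the example short, not the real locale subtag grammar.

```rust
use std::convert::TryFrom;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct ParseError;

/// Hypothetical language subtag: 2 or 3 lowercase ASCII letters, zero-padded.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Lang([u8; 3]);

impl Lang {
    /// Shared validation over code units widened to `u16`.
    fn from_units(units: impl Iterator<Item = u16>, len: usize) -> Result<Self, ParseError> {
        if !(2..=3).contains(&len) {
            return Err(ParseError);
        }
        let mut bytes = [0u8; 3];
        for (slot, unit) in bytes.iter_mut().zip(units) {
            match u8::try_from(unit) {
                Ok(b) if b.is_ascii_lowercase() => *slot = b,
                _ => return Err(ParseError),
            }
        }
        Ok(Self(bytes))
    }

    pub fn try_from_utf8(code_units: &[u8]) -> Result<Self, ParseError> {
        Self::from_units(code_units.iter().map(|&b| u16::from(b)), code_units.len())
    }

    pub fn try_from_utf16(code_units: &[u16]) -> Result<Self, ParseError> {
        Self::from_units(code_units.iter().copied(), code_units.len())
    }
}

/// The trait impls only delegate to the named constructors.
impl TryFrom<&[u8]> for Lang {
    type Error = ParseError;

    fn try_from(code_units: &[u8]) -> Result<Self, Self::Error> {
        Self::try_from_utf8(code_units)
    }
}

impl TryFrom<&[u16]> for Lang {
    type Error = ParseError;

    fn try_from(code_units: &[u16]) -> Result<Self, Self::Error> {
        Self::try_from_utf16(code_units)
    }
}
```

Keeping the logic in the named constructors means the trait impls can be added or left out per type without changing the naming scheme.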
+1 on using `_utf8` instead of `_bytes`.
> @Manishearth If you hold this position, should we reconsider #4590?
It's a very weakly held position, I think #4590 is okay. I'm also fine with the proposal here for similar reasons.
I'm working on locid now in context of icu_preferences, and once I'm done I'd like to unify and improve docs on handling of to/from u8/u16/str for all subtags for preparation for locid &[u16] handling.
Any thoughts on my proposal several comments above?
- `try` prefix makes sense when fallible. I think `utf8` makes sense for consistency with the standard library and pairing with `utf16`.
- [...] `utf8`, but it seems like it would make it less discoverable.
- `FromStr` is nice to have. I don't actually like it much since it's not in the prelude and it doesn't have the `try` prefix.
- `from_str` is bad. We should use it as little as possible, and this includes not using it in examples. I would be happy with this if we always used `.parse` or `try_from_str` but never `from_str`.
- [...] `utf16` a requirement; it seems like it would be an implementation challenge that isn't justified all the time.
- [...] `TryFrom<[u8]>` as @zbraniecki suggested?
- `try_into` is not a very ergonomic method, so if it's called as `Foo::try_from(&[u8])`, they should just call `try_from_utf8`.
- `TryFrom<[u8]>` implies that there is one way to build this from a `[u8]`, but we have multiple ways, such as Postcard deserialization.
- `FromStr` gives us `parse`, which some users might expect; what does `TryFrom<[u8]>` give us? I don't see a use case for using it generically.
- [...] `TryFrom<PotentialUtf8>`. We can add this later, perhaps.
Concrete proposal:
All types that are fallibly created from a string have the following functions:
- `try_from_str -> Result<Self>`
- `try_from_utf8 -> Result<Self>`
- `FromStr::from_str -> Result<Self>` (only ever called through `parse` in documentation, but only if it can be done without turbofishes, otherwise `Foo::try_from_str`)
- `try_from_utf16 -> Result<Self>` (only if the impl would benefit from such a function)
All types that are infallibly created from a string have the following functions:
- `from_str -> Self`
- `from_utf8 -> Self`
- `from_utf16 -> Self` (only if the impl would benefit from such a function)
LGTM: @hsivonen @sffc @Manishearth @robertbastian
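The proposal above could look roughly like this in practice. This is a sketch under assumptions: `Percent`, `RawId`, `ParseError`, and `examples` are hypothetical names made up for illustration, and the optional `utf16` constructors are left out.

```rust
use std::str::FromStr;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct ParseError;

/// Hypothetical fallibly constructed type: a percentage in 0..=100.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Percent(u8);

impl Percent {
    pub fn try_from_utf8(code_units: &[u8]) -> Result<Self, ParseError> {
        if code_units.is_empty() || code_units.len() > 3 {
            return Err(ParseError);
        }
        let mut value: u16 = 0;
        for &b in code_units {
            if !b.is_ascii_digit() {
                return Err(ParseError);
            }
            value = value * 10 + u16::from(b - b'0');
        }
        if value > 100 {
            return Err(ParseError);
        }
        Ok(Self(value as u8))
    }

    pub fn try_from_str(s: &str) -> Result<Self, ParseError> {
        Self::try_from_utf8(s.as_bytes())
    }
}

/// Implemented so that `str::parse` works; it only delegates.
impl FromStr for Percent {
    type Err = ParseError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Self::try_from_str(s)
    }
}

/// Hypothetical infallibly constructed type: bytes stored without validation.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RawId(Vec<u8>);

impl RawId {
    pub fn from_utf8(code_units: &[u8]) -> Self {
        Self(code_units.to_vec())
    }

    pub fn from_str(s: &str) -> Self {
        Self::from_utf8(s.as_bytes())
    }
}

/// Documentation-style call sites for the fallible type.
pub fn examples() -> Result<(Percent, Percent), ParseError> {
    // `parse` without a turbofish: the annotation supplies the type.
    let a: Percent = "42".parse()?;
    // Named constructor where no annotation is available.
    let b = Percent::try_from_str("7")?;
    Ok((a, b))
}
```

In this sketch `FromStr::from_str` is never named at a call site; callers use either `parse` or the `try_from_*` constructors.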
I didn't realize I signed off on the "(only ever called through parse in documentation, but only if it can be done without turbofishes, otherwise Foo::try_from_str)" -- I agree with the "only ever called through parse in documentation" but not necessarily the "only if it can be done without turbofishes"
LGTM with the same alteration that @sffc pointed out in the last comment. I'm not convinced that a turbofish specifier is bad enough to have a blanket rule against it in docs.
What about this pattern:
pub struct Era(pub TinyStr16);
impl From<TinyStr16> for Era {
    fn from(x: TinyStr16) -> Self {
        Self(x)
    }
}
impl FromStr for Era {
    type Err = <TinyStr16 as FromStr>::Err;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        s.parse().map(Self)
    }
}
Both the `impl From<TinyStr>` as well as the `impl FromStr` are kind of pointless, as `TinyStr` already has `try_from_utf8`/`try_from_utf16`/`try_from_str`/`from_str`, and the field is public.
This is done for all stable components and utils, except for the datetime and plurals reference and runtime modules.
@zbraniecki WDYT about doing the facelift to those modules?
Also in 2.0 (can be stretch), apply this style to all stringy APIs such as canonicalize, normalize, ...
I think we should align them with the decision here.
- `try_from` functions
- [...] `normalize` accept `AsRef<[u8]>` to be generic over `&[u8]` and `&str`, but that's not great, both documentation-wise, and because the lifetime gets lost through `AsRef`
- [...] `plurals::reference` and `datetime::reference` modules?
- [...] `doc(hidden)` currently, so let's hold off until we have a decision on #5181
- [...] `normalize` or `normalize_str`? Maybe we say that we do `_str` only for `from` functions or functions that could take other types of arguments
Conclusion:
- `foo(&str)`, `foo_utf8(&[u8])`, `foo_utf16(&[u16])`
- [...] `doc(hidden)` modules at this time
LGTM: @sffc (@robertbastian) @Manishearth
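As an illustration of the `foo(&str)`, `foo_utf8(&[u8])`, `foo_utf16(&[u16])` convention from the conclusion, here is a small sketch with a made-up operation. The functions `trim`, `trim_utf8`, `trim_utf16` and the helper `trim_spaces` are hypothetical and are not ICU4X APIs. Each function takes a concrete borrowed type, so the returned slice stays tied to the input's lifetime, which a by-value `T: AsRef<[u8]>` parameter could not express.

```rust
/// Shared helper: strip leading and trailing occurrences of `space`.
fn trim_spaces<T: Copy + PartialEq>(units: &[T], space: T) -> &[T] {
    let start = units.iter().position(|&u| u != space).unwrap_or(units.len());
    let end = units.iter().rposition(|&u| u != space).map_or(start, |i| i + 1);
    &units[start..end]
}

/// `foo(&str)`: the unsuffixed name takes a string slice.
pub fn trim(s: &str) -> &str {
    s.trim_matches(' ')
}

/// `foo_utf8(&[u8])`: same operation over UTF-8 code units.
pub fn trim_utf8(code_units: &[u8]) -> &[u8] {
    trim_spaces(code_units, b' ')
}

/// `foo_utf16(&[u16])`: same operation over UTF-16 code units.
pub fn trim_utf16(code_units: &[u16]) -> &[u16] {
    trim_spaces(code_units, u16::from(b' '))
}
```

Each variant shows up in the docs under a predictable name, and the argument type in the signature says exactly which encoding is expected.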
In our docs there are 95 and 63 hits for `from_bytes` and `try_from_bytes` (the former includes counts from the latter), and there's also a mix of `from_str` and `try_from_str`.
We should try to fix this in 2.0. I suggest: `try_from_bytes` and `FromStr::from_str`. In the cases where they are not fallible, `from_bytes` is fine.
Alternatively, we could rename everything to `try_from_utf8` instead of `try_from_bytes`.
EDIT: I think my preference has changed to also include `try_from_str` constructors. See discussion.
Feedback needed from: