Closed Manishearth closed 3 days ago
I wish we could avoid making PatternPlurals
a ULE.
This part of the datetime provider may be changing as part of #1317. In that issue, the skeletons will no longer be an unlimited map; there may be a limited number of possible skeletons, such that we can just make a big struct and let Postcard do the work instead of ZeroVec.
As a general rule, if we start having very deeply nested ZeroMaps, we should rethink the architecture of the data payload. Putting more into the Postcard part of the deserialization is also better for CrabBake, since CrabBake can bypass Postcard but it can't bypass ZeroVec.
As a general rule, if we start having very deeply nested ZeroMaps, we should rethink the architecture of the data payload. Putting more into the Postcard part of the deserialization is also better for CrabBake, since CrabBake can bypass Postcard but it can't bypass ZeroVec.
I strongly agree with this. Though one hazard with making it a big struct is that adding fields in the future is a breaking change. We could still make that work with an array but we're back to nesting zerovecs and needing ULEs.
I accidentally posted this issue before I was done but I was considering advocating for a case where we do more of the parsing at formatter construction time instead of mandating it be zero-copy. But if this is changing in https://github.com/unicode-org/icu4x/issues/1317, that works too.
If we're going for a big struct, we can block this on #1317 and move on. @gregtatum , do you have an idea of how long #1317 will take? Moving everything to zero-copy does block CrabBake, and while we can have components opt out of CrabBake I think it would be nice if DTF didn't need to.
Moving everything to zero-copy does block CrabBake, and while we can have components opt out of CrabBake I think it would be nice if DTF didn't need to.
My preference for a short-term solution, if we need it, would be to put skeletons behind a feature flag, so that DTF with default features is zero-copy.
Yeah I think I can just make it so that CrabBake's provider is unable to provide certain keys
Another option is to replace Skeleton with the unparsed skeleton string.
I'm not sure with Zero copy if this adds anything, as you would then have to parse the skeleton, and do similar validation work. I'm not as familiar with the implementation details of zero copy to really speak with expertise on it though.
I'm not sure with Zero copy if this adds anything, as you would then have to parse the skeleton, and do similar validation work. I'm not as familiar with the implementation details of zero copy to really speak with expertise on it though.
Yep; we don't get any zc perf benefit, it just enables crab-bake (which requires everything to be zerocopy)
I would also prefer to not go down the "data zero-copy but we allocate afterwards" route though
that looks good to me. I'd suggest making sure that the PatternPlurals
encapsulation is really good so that even internal consumers never think of it as an array and never have to reason about which plural category 3rd element is.
it just enables crab-bake
In that case that seems fine to me.
LGTM after reading the comments. I don't think I have anything else to add.
This needs to be removed
The data has been removed, and I'm removing the marker in https://github.com/unicode-org/icu4x/pull/5612
To track a longer-term home for these things, I created https://github.com/unicode-org/icu4x/issues/5611
part of https://github.com/unicode-org/icu4x/issues/876, part of https://github.com/unicode-org/icu4x/issues/856
DateSkeletonPatternsV1
is one of our final remaining non-zero-copy types. It's a rather complicated tree:There are two parts to this: firstly, the skeleton needs to be made zero-copy, and then PatternPlurals. Both need to be VarULE or ULE to work inside a ZeroMap.
Skeleton
Skeleton seems easy, we replace
Skeleton
with#[make_varule] struct Skeleton(ZeroVec<'data, Fields>)
. I'm a bit worried that this will make lookup rather slow (since Ord will be slower): but also as far as I can tell, we never use theLiteMap
as an actual map at use time, we only iterate the map in order.Another option is to replace
Skeleton
with the unparsed skeleton string.PatternPlurals
I'd rather not write a custom ULE type here, but ultimately, we can. I do think, however, we can get most of the benefits by restructuring this type a bit.
Basically, this type can be structured as
We will need an AsULE implementation on
Option<T: AsULE>
as well as a VarULE implementation onOption<T: VarULE>
in cases where we can guarantee thatT
never has length zero (this can be done with an additional trait).Then,
PatternPlurals::patterns
becomes a vector that must have at least one non-None element in the beginning, and the rest of the elements can be nonexistent or None (or we can guarantee that it either has length 1 or length 6).Thoughts?
Feedback requested: