unicode-org / icu4x

Solving i18n for client-side and resource-constrained environments.
https://icu4x.unicode.org

Proc macros for AsULE and AsVarULE #1079

Closed Manishearth closed 2 years ago

Manishearth commented 3 years ago

Part of https://github.com/unicode-org/icu4x/issues/1082

Depends on https://github.com/unicode-org/icu4x/issues/1078

It would be nice to be able to write:

#[derive(AsULE)]
struct Simple {
    a: u32,
    b: u32,
}

#[derive(AsVarULE)]
struct Relation {
    something: u8,
    something_else: u32,
    data: VarZeroVec<String>,
}

and have it generate

#[repr(packed)]
struct SimpleULE {
    a: <u32 as AsULE>::ULE,
    b: <u32 as AsULE>::ULE,
}

#[repr(packed)]
struct RelationULE {
    something: <u8 as AsULE>::ULE,
    something_else: <u32 as AsULE>::ULE,
    data: [u8],
}
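
For reference, here is a minimal hand-written sketch of what the derive would save you from writing for `Simple`. It is not zerovec's actual code: the `AsULE` trait below is a local stand-in trimmed to two conversion methods, `U32ULE` is a hypothetical alignment-1 little-endian wrapper, and `repr(C, packed)` is used instead of plain `repr(packed)` on the assumption that a fixed field order is wanted.

```rust
/// Local stand-in for an `AsULE`-style trait (an assumption, not the zerovec definition).
pub trait AsULE: Copy {
    type ULE: Copy;
    fn to_unaligned(self) -> Self::ULE;
    fn from_unaligned(unaligned: Self::ULE) -> Self;
}

/// Hypothetical unaligned little-endian `u32`: four bytes, alignment 1.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
#[repr(transparent)]
pub struct U32ULE([u8; 4]);

impl AsULE for u32 {
    type ULE = U32ULE;
    fn to_unaligned(self) -> U32ULE {
        U32ULE(self.to_le_bytes())
    }
    fn from_unaligned(unaligned: U32ULE) -> u32 {
        u32::from_le_bytes(unaligned.0)
    }
}

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct Simple {
    pub a: u32,
    pub b: u32,
}

/// What the macro would emit: same fields, each swapped for its ULE type.
#[derive(Copy, Clone)]
#[repr(C, packed)]
pub struct SimpleULE {
    a: <u32 as AsULE>::ULE,
    b: <u32 as AsULE>::ULE,
}

impl AsULE for Simple {
    type ULE = SimpleULE;
    fn to_unaligned(self) -> SimpleULE {
        SimpleULE {
            a: self.a.to_unaligned(),
            b: self.b.to_unaligned(),
        }
    }
    fn from_unaligned(unaligned: SimpleULE) -> Simple {
        // Fields are copied out by value, so no reference into the packed struct is taken.
        Simple {
            a: u32::from_unaligned(unaligned.a),
            b: u32::from_unaligned(unaligned.b),
        }
    }
}
```

The conversion is purely field-by-field, which is why it looks like a good fit for code generation.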

Manishearth commented 3 years ago

For enums, I can imagine derive(ULE) applying repr(u8) and simply doing transmutes. This fails for enums with more than 256 variants, which would need derive(AsULE) instead, but enums that large are not common anyway.
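
A rough sketch of the repr(u8) idea, assuming a fieldless enum with contiguous discriminants starting at 0. The `Direction` enum and the `parse_bytes` helper are made up for illustration and are not part of zerovec; the point is only that validation reduces to a range check on each byte before the cast.

```rust
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum Direction {
    North = 0,
    East = 1,
    South = 2,
    West = 3,
}

impl Direction {
    /// Largest valid discriminant; valid bytes are 0..=MAX_DISCRIMINANT.
    const MAX_DISCRIMINANT: u8 = Direction::West as u8;

    /// Reinterprets a byte slice as a slice of `Direction` after validating every byte.
    pub fn parse_bytes(bytes: &[u8]) -> Option<&[Direction]> {
        if bytes.iter().any(|&b| b > Self::MAX_DISCRIMINANT) {
            return None;
        }
        // SAFETY: `Direction` is `repr(u8)`, so it has size 1 and alignment 1,
        // and every byte was just checked to be a valid discriminant.
        Some(unsafe {
            core::slice::from_raw_parts(bytes.as_ptr() as *const Direction, bytes.len())
        })
    }
}
```

A derive could generate the range check from the variant list, which is what makes the transmute sound.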

Manishearth commented 2 years ago

This will likely be a proc macro attribute, not a custom derive, with the proc macro attribute generating the ULE type. E.g. #[gen_var_ule(FooULE)] or something. Writing the ULE type correctly can be tricky so we probably need the proc macro to generate that part too, not just the ULE implementation.
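
To illustrate why the ULE type itself is tricky to write by hand, here is a sketch of an unsized `RelationULE` with a byte-slice tail. Everything here is an assumption for illustration: the field encoding, the `parse_bytes` name, and the choice of `repr(C, packed)` to pin the field order. The subtle part is that the fat pointer's length metadata must be the tail length, not the length of the whole buffer.

```rust
/// Hypothetical unsized ULE type: a 5-byte header followed by a variable-length tail.
#[repr(C, packed)]
pub struct RelationULE {
    something: u8,
    something_else: [u8; 4], // little-endian u32
    data: [u8],
}

impl RelationULE {
    /// Size in bytes of the fixed-size fields that precede `data`.
    const HEADER_LEN: usize = 5;

    /// Reinterprets `bytes` as a `RelationULE`, or returns `None` if the buffer
    /// is too short to hold the header.
    pub fn parse_bytes(bytes: &[u8]) -> Option<&RelationULE> {
        let tail_len = bytes.len().checked_sub(Self::HEADER_LEN)?;
        // The slice length stored in the fat pointer describes only the `data` tail.
        let ptr = core::ptr::slice_from_raw_parts(bytes.as_ptr(), tail_len)
            as *const RelationULE;
        // SAFETY: every field has alignment 1 and accepts any bit pattern, and the
        // buffer holds exactly HEADER_LEN + tail_len bytes.
        Some(unsafe { &*ptr })
    }

    pub fn something(&self) -> u8 {
        self.something
    }

    pub fn something_else(&self) -> u32 {
        u32::from_le_bytes(self.something_else)
    }

    pub fn data(&self) -> &[u8] {
        &self.data
    }
}
```

Getting the pointer metadata or the header length wrong compiles fine and silently reads out of bounds, which is a good argument for having the macro own this part.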

Splitting this into different tasks it may have to do:

For ULE types it would generate:

For VarULE types it would generate:

Manishearth commented 2 years ago

We may also need #[derive(ULE)] that allows wrapping ULE types with more ULE types, in case you wish to implement a custom conversion. Ideally you can also #[generate_serde(target)] on it to generate appropriate ser/de impls.

Manishearth commented 2 years ago

As for bit packing and enums, I think the way to do this is:

Dataful enums can then be done by mandating that all fields are marked with zerovec::bits and doing their own discriminantful bitpacking. VarULE for these will involve generating a more complex type, but it's still doable.
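
As a rough illustration of discriminantful bitpacking, the sketch below packs a small dataful enum into one byte: the top 2 bits hold the discriminant and the low 6 bits hold the payload. The `Token` enum, the bit widths, and all method names are invented for this example; this is not the zerovec::bits design, only one way such a layout could work.

```rust
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum Token {
    Empty,
    Small(u8), // payload must fit in 6 bits (0..=63)
    Flag(bool),
}

/// Hypothetical packed form: `tag (2 bits) | payload (6 bits)`.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
#[repr(transparent)]
pub struct TokenULE(u8);

const PAYLOAD_BITS: u32 = 6;
const PAYLOAD_MASK: u8 = 0b0011_1111;

impl Token {
    /// Packs the enum into one byte; `None` if the payload does not fit.
    pub fn pack(self) -> Option<TokenULE> {
        let (tag, payload) = match self {
            Token::Empty => (0u8, 0u8),
            Token::Small(n) if n <= PAYLOAD_MASK => (1, n),
            Token::Small(_) => return None,
            Token::Flag(f) => (2, f as u8),
        };
        Some(TokenULE((tag << PAYLOAD_BITS) | payload))
    }
}

impl TokenULE {
    /// Validates a raw byte: it must decode to some `Token`.
    pub fn from_byte(byte: u8) -> Option<TokenULE> {
        TokenULE(byte).unpack().map(|_| TokenULE(byte))
    }

    pub fn to_byte(self) -> u8 {
        self.0
    }

    /// Decodes the byte; unused tags and out-of-range payloads are rejected.
    pub fn unpack(self) -> Option<Token> {
        let payload = self.0 & PAYLOAD_MASK;
        match (self.0 >> PAYLOAD_BITS, payload) {
            (0, 0) => Some(Token::Empty),
            (1, n) => Some(Token::Small(n)),
            (2, 0) => Some(Token::Flag(false)),
            (2, 1) => Some(Token::Flag(true)),
            _ => None,
        }
    }
}
```

Validation has to reject both the unused tag value and payload bits that a variant does not use, so that each value has exactly one byte representation.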

Manishearth commented 2 years ago

The design can be found in the design doc: https://github.com/unicode-org/icu4x/blob/main/utils/zerovec/design_doc.md#proc-macros