sharksforarms / deku

Declarative binary reading and writing: bit-level, symmetric, serialization/deserialization
Apache License 2.0
1.14k stars 55 forks source link

Sub type endian handling is counter-intuitive #501

Open Jimmy-Z opened 2 weeks ago

Jimmy-Z commented 2 weeks ago

First of all, thank you for this work.

Example A: if we specify endian only on the Main type, doesn't compile

use deku::prelude::*;

#[derive(DekuRead, DekuWrite)]
struct Sub {
}

#[derive(DekuRead, DekuWrite)]
#[deku(endian = "big")]
struct Main {
    data: Sub,
}

Example B: if we specify endian on both the Main type and Sub type, also doesn't compile

use deku::prelude::*;

#[derive(DekuRead, DekuWrite)]
#[deku(endian = "big")]
struct Sub {
}

#[derive(DekuRead, DekuWrite)]
#[deku(endian = "big")]
struct Main {
    data: Sub,
}

the doc example of endian about pass endian via ctx works, but what if we don't need this fanciness and be less verbose and just want to specify it manually?

there's no reason endian doesn't propagate in example A design wise.

there's no reason example B doesn't compile design wise.

Sorry for the criticizing tone.

wcampbell0x2a commented 2 weeks ago

Each DekuRead invocation only has context of it's struct/enum. It has no idea what the other types, and what they have implemented (such as ctx, endian and the like).

Now, I think a "global" ctx has been talked about for this project, but I don't remember where.

And also, with some work, a endian_inherit could be created, to do the endian = "endian", ctx = "endian: deku::ctx::Endian" automatically...

Jimmy-Z commented 2 weeks ago

Doesn't explain why example b doesn't compile? it's more like a bug than a design choice for me.

wcampbell0x2a commented 2 weeks ago

the endian = "big" top-level on a struct tells deku that every field is "big", Thus, data: Sub will be told that it has the ctx of Endian::Big.

The other struct Sub, needs to know to use that ctx as endian.

impl ::deku::DekuWriter<deku::ctx::Endian> for Sub vs impl ::deku::DekuWriter<()> for Sub {

You can use https://github.com/dtolnay/cargo-expand to see the difference in the emitted code.

Jimmy-Z commented 2 weeks ago

I'm not following, but I guess you're explaining why example a doesn't work instead of example b?

wcampbell0x2a commented 2 weeks ago

Nope, I was explaining how Example B works currently. You are instructing Sub to be Endian::Big, but deku sends in it's own ctx of Endian::Big.

Maybe this shows better? This way we ignore the ctx that deku sends Main.data, and it still uses the Endian::Big you instructed it to use.

use deku::prelude::*;

#[derive(DekuRead, DekuWrite)]
#[deku(endian = "big", ctx = "_endian: deku::ctx::Endian")]
struct Sub {}

#[derive(DekuRead, DekuWrite)]
#[deku(endian = "big")]
struct Main {
    data: Sub,
}

I'm open to suggestions on how to make this better, I've really just accepted that this is how deku works for a long time.

Jimmy-Z commented 2 weeks ago

Thanks for the explanation.

But do you agree with me on the title of this issue? I think at least the document should be updated with some note to reflect this.

As of "how to make this better", I have no idea, I'm basically oblivious to the black magic of rust macro.

sharksforarms commented 2 weeks ago

Thanks for bringing this up, I also agree it seems counter-intuitive, until you understand how it's implemented. Is there a better way? Maybe...? I'm not sure either.

@wcampbell0x2a has explained how the current design works, this is a common pattern for similar crates such as binrw

You can also see the original PR implementing this here: https://github.com/sharksforarms/deku/pull/61

This comes down to working within the constraints of the type system and macros. I'm open to ideas on how to make this better.

@Jimmy-Z do you have specific recommendations to help make the documentation better in the meantime?

wcampbell0x2a commented 2 weeks ago

What could be done, if an endian isn't added, then a __deku_endian is use. This default to Endian::default() for the types without ctx, such as from_bytes (to retain the current functionality). Then for all others such as from_reader_with_ctx, the default becomes Endian instead of (). I think this way we by default we use system endian unless specified with the endian attribute.

Jimmy-Z commented 2 weeks ago

I also agree it seems counter-intuitive, until you understand how it's implemented.

I don't agree with this, I think you're confusing "reasonable" with "intuitive"

For example 1, user would think, intuitively, when using Sub, it's default endian, when using Main, the Sub in it would inherit big endian, instead of doesn't compile.

For example 2, user would think, intuitively, it should compile.

Maybe you need to see this from the view of a user of the library.

Again, sorry for the tone, and thank you for your work, it's awesome despite this quirk.

sharksforarms commented 2 weeks ago

I also agree it seems counter-intuitive, until you understand how it's implemented.

I don't agree with this, I think you're confusing "reasonable" with "intuitive"

Right, intuitive is the wrong word.

Thanks for the feedback.

Jimmy-Z commented 2 weeks ago

As for how to document this, since this is not really an easy thing to explain, maybe just say "there're some quirks around member endian handling" with a link to this issue?

sharksforarms commented 2 weeks ago

As for how to document this, since this is not really an easy thing to explain, maybe just say "there're some quirks around member endian handling" with a link to this issue?

This issue doesn't really explain much to help a user in your situation; I'd prefer we update the crate documentation with more examples, or a section explaining how the mechanism to pass information from Parent -> Child works