jefffhaynes / BinarySerializer

A declarative serialization framework for controlling formatting of data at the byte and bit level using field bindings, converters, and code.
MIT License
290 stars 62 forks source link

[FEATURE REQUEST] FieldAlignment to not just be stream based (but Parent / Field based) #208

Open bevanweiss opened 1 year ago

bevanweiss commented 1 year ago

I have a situation in this CIP driver that I'm putting together where it wants to have a field padded to an even number of bytes. But this field might start at an odd offset in the stream, however the padding should still be only within the Field itself.

i.e.

public class Parent
{
    [FieldOrder(0)]
    public byte Length;

    [FieldOrder(1)]
    [FieldLength(nameof(Length))]
    [FieldAlignment(2, FieldAlignmentMode.Right)]
    public byte[] Data;
}

If Data has 25 bytes, then it should be padded to be 26 bytes in total space (but the Length field should remain at 25). In the current implementation, since the byte for Length provides an offset of 1 byte in the stream, the Data alignment is 'off'. It's padding when it shouldn't.

I would propose to extend the FieldAlignmentAttribute with a FieldAlignmentScope which would have options of 'Stream' (default and current behaviour), 'Parent' which would align with regards to the offset inside the parent, 'Field' which would only align/pad internally to this Field (how I'd like it to operate).

Is there perhaps something that I'm missing that could already handle what I'm after? (without falling back to entirely custom serialization/deserialization).

It looks like the current Align(...) method only gets the BoundedStream, which has the stream.RelativePosition member. The simplest approach I can think of would be to follow along with a .FieldRelativePosition, and then a .ParentRelativePosition along with a stack to hold previous .ParentRelativePositions (pushed each time descending into a class, and popped when coming back out). This approach seems like it would be relatively expensive in performance though (3x increase in position tracking calculations).

jefffhaynes commented 1 year ago

I might be thinking about this wrong, but that feels a bit more like field "quantization" or something vs. alignment, no? Alignment to me is more like where you want the data to show up in the stream vs rules that the length should follow. Can you do this just using a value converter on the length binding that does a mod operation? I haven't actually tried it but it feels like that should work...

bevanweiss commented 1 year ago

Can you do this just using a value converter on the length binding that does a mod operation?

I'll give it a try. My concern however is that if the Length says 25, the field content (ASCII'ish characters) is only 25 elements long. But because that's an odd number, it's padded with an additional byte. So it's definitely a padding / right alignment problem. But only within that particular field. It ignores the current offset into the stream.

So my concerns with using FieldLength is: a. Deserialization and +1 on odd FieldLengths would result in an additional byte being assigned to the field (string), which 'should' be 0x00 (padding byte), but in bad implementations it might be the last residual byte in that memory location... since as per the spec it should be ignored anyway. b. Serialization seems like it's 'too late', the Field data will have already been determined before the FieldLength converter is called, at which point it's too late to do a +1 on the serialized length of the Field for the padding byte. Maybe it's not too late and this is my misunderstanding though.

bevanweiss commented 1 year ago

The Length tweaking didn't appear to work. But perhaps I was going about it wrong. When the Converter function is called on the FieldLength / FieldAlignment for the 'data field' neither appear to be able to add any additional padding character to the 'data field' itself. They can only alter the FieldLength aspect, and for the FieldAlignment it requires assumptions of the serialized representation for the 'data field', and then still has challenges on what value to set the alignment to (given it's stream driven) to effect the appropriate field padding.

It was easy enough to just create a custom IBinarySerializable implementation for it (just two fields). Although I'm not sure that this would be viable if it were a bigger field in question (i.e. not just a string, but some complex object) and it had subordinate serialization required.

jefffhaynes commented 1 year ago

Yeah, you're right of course. It doesn't update the field.

Back to your original point, doesn't the RelativePosition tell the position w.r.t. the parent?

bevanweiss commented 1 year ago

Yes. With respect to the Parent. But not within its own field. So if the field is 3 bytes long, then the Length should be reported as 3, but the field should be written out as 4 bytes in the serialized stream (it should be padded to even length, but only with respect to the current field, not it's parents, or the stream origin). The current alignment only appears to look at the stream position, I'd need it to serialize with padding, but only padding with respect to where this field started, not where the stream currently is.

jefffhaynes commented 1 year ago

Yeah I probably need to write some test code. I'm having trouble picturing it and why it's different.

bevanweiss commented 1 year ago

Yeah I probably need to write some test code. I'm having trouble picturing it and why it's different.

It's ok :) It's my challenge, so I'm the one that really should put together test cases to better illustrate the problem ;) I might push that off for a couple of weeks though. The custom Deserialization/Serialization pathway was easy enough to deal with. The only aspect that might have made it nicer would be if there were some helper functions on the stream that did the inherited endian handling.. maybe like a ReadEndiannedBytes(ref byte[], int length) / WriteEndiannedBytes(byte[], int length) thing.

I did this by just reading the bytes, then mapping to Int16/32/64 as needed, and then using the BinaryPrimitives library to swap the endian if required. Not difficult, just adds a few extra lines of code which might have been clearer if just as read/written to stream.