dgibson / dtc

Device Tree Compiler

0xFFFFFFFF00000000 parsed as 32 bit #74

Closed CTerasa closed 1 year ago

CTerasa commented 1 year ago

Perhaps I am missing something, but uint64_t values in the range 0xFFFFFFFF00000000-0xFFFFFFFFFFFFFFFF are treated as uint32_t for property array data.

Example:

/dts-v1/;
/ { 
    /* 0xFFFFFFFF00000000 is not a uint32_t, or is it? */
    property1 = <0xFFFFFFFF00000000>;
    /* 0xFFFFFFFFFFFFFFFF is also not a uint32_t, or is it? */
    property2 = <0xFFFFFFFFFFFFFFFF>;
    /* 0xFFFFFFFEFFFFFFFF is definitely not a uint32_t! */
    /* property3 = <(0xFFFFFFFF00000000-1)>; */
};

The dtc -O dts output of this snippet is:

/dts-v1/;

/ {
    property1 = <0x00>;
    property2 = <0xffffffff>;
};

My expectation would be that all uint64_t values above 0xFFFFFFFF are illegal as 32-bit data values. This of course includes the 64-bit values with all ones in the upper 32 bits. At least 0xFFFFFFFEFFFFFFFF is invalid. Is there a rationale behind this, or is it a bug?

We could then think of it as a "full" 32-bit signed integer, with the sign bit being the entire upper 32 bits, either all set or all unset? ;-)

dgibson commented 1 year ago

So, the current behaviour is pretty straightforward, if perhaps not entirely obvious:

All integer expressions (including literals) are evaluated assuming 64-bit unsigned, wrapping, arithmetic. The final result is then truncated to the size of the element it's going into. That's usually 32 bits but can be 8, 16 or 64 bits when using the /bits/ directive.
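
A minimal C sketch of that model as I read it (my illustration, not dtc's actual code): evaluate everything in wrapping uint64_t arithmetic, then mask down to the element width at the very end.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper mirroring the behaviour described above:
 * truncate a 64-bit intermediate result to an element `bits` wide. */
static uint64_t truncate_to(uint64_t val, unsigned bits)
{
    return (bits >= 64) ? val : (val & ((1ULL << bits) - 1));
}

int main(void)
{
    uint64_t a = 0xFFFFFFFF00000000ULL;  /* the literals from the issue, */
    uint64_t b = 0xFFFFFFFFFFFFFFFFULL;  /* evaluated as uint64_t ...    */

    /* ... then truncated to the default 32-bit cell size: */
    printf("0x%llx\n", (unsigned long long)truncate_to(a, 32)); /* 0x0 */
    printf("0x%llx\n", (unsigned long long)truncate_to(b, 32)); /* 0xffffffff */
    return 0;
}

This reproduces the 0x00 and 0xffffffff cells shown in the -O dts output above.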

I'm not really clear on what behaviour you're suggesting instead, but the alternative options I can think of involve some severe practical drawbacks.

CTerasa commented 1 year ago

For me an approach like the following would be less unexpected:

  • Literals that are not in bounds of the array field type are invalid. So 0xFFFFFFFFFFFFFFFF is invalid for 32 bits
  • Arithmetic is truncated to the array field type length after calculation. (-1) is valid for 32 bits.

This would mean that for 32 bits the expression (0x100000000) is valid, while the literal 0x100000000 is not.

Some random fun with the current arithmetic:

/dts-v1/;
/ {
        property = <
        (-0xFFFFFFFF-1) /* is 0 */
        (-0xFFFFFFFF-2) /* is invalid */
        ((-0xFFFFFFFF-1)*2) /* seems to be 0*2 but is obviously not */
        >;
};
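
Spelling out the 64-bit intermediate values behind those three expressions (a small C sketch of the wrapping arithmetic; my own illustration, not dtc output):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* The three expressions above, evaluated in wrapping uint64_t
     * arithmetic before any truncation to the 32-bit cell: */
    uint64_t a = -0xFFFFFFFFULL - 1;        /* 0xffffffff00000000 */
    uint64_t b = -0xFFFFFFFFULL - 2;        /* 0xfffffffeffffffff */
    uint64_t c = (-0xFFFFFFFFULL - 1) * 2;  /* 0xfffffffe00000000 */
    printf("0x%016llx 0x%016llx 0x%016llx\n",
           (unsigned long long)a, (unsigned long long)b,
           (unsigned long long)c);
    return 0;
}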
dgibson commented 1 year ago

> For me an approach like the following would be less unexpected:

>   • Literals that are not in bounds of the array field type are invalid. So 0xFFFFFFFFFFFFFFFF is invalid for 32 bits

Treating literals differently from other arithmetic also seems oddly inconsistent.

>   • Arithmetic is truncated to the array field type length after calculation. (-1) is valid for 32 bits.

That's what we do now.

> This would mean that for 32 bits the expression (0x100000000) is valid, while the literal 0x100000000 is not.

Until I read this I misunderstood your first suggestion above. This really highlights the inconsistency to me.

> Some random fun with the current arithmetic:
>
> /dts-v1/;
> / {
>         property = <
>         (-0xFFFFFFFF-1) /* is 0 */
>         (-0xFFFFFFFF-2) /* is invalid */
>         ((-0xFFFFFFFF-1)*2) /* seems to be 0*2 but is obviously not */
>         >;
> };

When I tried this, I realized I was mistaken in my previous comment. I'd forgotten that when we truncate down to the cell size, we give an error if the truncated bits are anything other than all 0s or all 1s. I guess that's a tradeoff: we want to allow both all 0s and all 1s to allow for both signed and unsigned values (both common cases). Not allowing anything else is an attempt to catch cases where we're truncating important information. In the last two examples here the top bits are 0xfffffffe, hence the errors.
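
A minimal sketch of that check (my reading of the rule, in C; not dtc's actual source): after 64-bit evaluation, the bits that truncation would discard must be all 0s or all 1s.

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: does `val` survive truncation to `bits` under
 * the "all 0s or all 1s" rule described above? */
static bool fits_cell(uint64_t val, unsigned bits)
{
    if (bits >= 64)
        return true;
    uint64_t discarded = val >> bits; /* the bits truncation throws away */
    return discarded == 0 || discarded == ((1ULL << (64 - bits)) - 1);
}

/* fits_cell(0xFFFFFFFF00000000, 32) -> true:  top bits all 1s      */
/* fits_cell(0xFFFFFFFEFFFFFFFF, 32) -> false: top bits 0xFFFFFFFE  */
/* fits_cell(0xFFFFFFFE00000000, 32) -> false: hence the two errors */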

I am now wondering if simply truncating without errors might be a better idea, given the confusion it's caused you.

CTerasa commented 1 year ago

> we give an error if the truncated bits are anything other than all 0s or all 1s.

Yes, exactly what I see and wondered about. I call this the "all 0s or all 1s" rule now.

> I am now wondering if simply truncating without errors might be a better idea, given the confusion it's caused you.

Yes, this would be another good solution. My expectation was that there were good reasons to throw an error rather than silently truncate at cell size, so I did not want to propose this myself.

Perhaps we could always truncate but throw an out-of-bounds warning if necessary. (One can argue, and I am undecided, whether negative values are out of bounds. I agree that negative values are valid use cases, but they are still not valid unsigned ints.) Or perhaps we could warn only when the "all 0s or all 1s" rule does not apply, i.e. in case of wrap-around effects during arithmetic or "truncating important information". But would that be of any help?

At least always truncating would better match my expectations.

So some options:

  1. I deal with it, reconsider my expectations, and honor the "all 0s or all 1s" rule; no changes necessary.
  2. Always silently truncate.
  3. Always truncate and throw an out-of-bounds warning for all values that are out of bounds. This would include "negative" values.
  4. Always truncate and throw an out-of-bounds warning when the "all 0s or all 1s" rule does not apply, instead of erroring out.
  5. Un-favored: Handle arithmetic and literals differently (internally).

I am totally fine with 1., but I'd be in favor of 4., as it keeps current use cases valid without warnings, leaves the option to spot possible errors, and seems a bit more consistent to me.

dgibson commented 1 year ago

Sorry I've taken so long to reply again. I think your option (4) is a good one, at least in the short term. Basically we keep the same logic, but demote the error to a warning.

I just pushed a change which should implement this.
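
For illustration, a hedged sketch of what option (4) implies (not the actual change pushed; the function names are mine): the same "all 0s or all 1s" check runs, but a failure now warns and the value is truncated anyway.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Same hypothetical check as in the earlier sketch. */
static bool fits_cell(uint64_t val, unsigned bits)
{
    if (bits >= 64)
        return true;
    uint64_t discarded = val >> bits;
    return discarded == 0 || discarded == ((1ULL << (64 - bits)) - 1);
}

/* Option (4): warn instead of erroring, then truncate regardless. */
static uint64_t emit_cell(uint64_t val, unsigned bits)
{
    if (!fits_cell(val, bits))
        fprintf(stderr, "warning: 0x%llx truncated to %u bits\n",
                (unsigned long long)val, bits);
    return (bits >= 64) ? val : (val & ((1ULL << bits) - 1));
}

int main(void)
{
    /* 0xFFFFFFFEFFFFFFFF now yields a warning plus 0xffffffff,
     * instead of a hard error: */
    printf("0x%llx\n", (unsigned long long)emit_cell(0xFFFFFFFEFFFFFFFFULL, 32));
    return 0;
}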