edwinb / Idris2-boot

A dependently typed programming language, a successor to Idris
https://idris-lang.org/
Other
903 stars 58 forks source link

Fix unmarshalling of Integer values in TTC (Fixes: #345) #367

Closed cypheon closed 4 years ago

cypheon commented 4 years ago

In fromLimbs, the read values arrive as a list of (32-bit) Ints. If any of the limbs has its MSB set, it is interpreted as a negative number and corrupts the result.

This commit fixes this by explicitly casting the incoming value to a Bits32 (i.e. unsigned) and then to an Integer. This fixes #345.

The stored/loaded Integer is always non-negative, because the sign is stored separately from the limbs.

An alternative solution would be to only store 31 bits per limb (no explicit casting necessary), but this seems rather inefficient.

This PR also also updates the expected test results, as the previous values where not arithmetically correct. The updated results are hand-checked with both the Chez and Python REPL.

diakopter commented 4 years ago

cool!

gallais commented 4 years ago

Thanks for looking into this.

This fix feels wrong. If the input is not a List Int then we should give it a different type (e.g. List UInt) and have the cast instance for that type be the appropriate cast.

So basically the same underlying fix but with different types.

cypheon commented 4 years ago

@gallais, thanks for the feedback!

I agree, that this explicit type casting makes the code hard to read and also makes assumptions about the underlying storage, that are not visible in the types.

So the solutions I can think of (in order of my preference):

  1. Store the Integer as a List Bits8 instead of a List Int -> small change and readable code, (maybe performance issues though, as we do more calculations during {to,from}Limbs)
  2. Store the Integer as a List Bits32, but since there is no setBits32 in Buffer I'd need to either 2a. Implement TTC Bits32 as a store/load of 4 separate Bits8 (probably with the same performance implications as "1.") 2b. Implement TTC Bits32 in terms of explicit casts to/from Int which would basically do what it's doing now but a tiny bit more explicit. Though this feels like just hiding/moving the awkward casts to a different place.

Which of those do you think fits best? Maybe I'm overthinking the performance issues, but a assume (de)serialization of Integers happens quite often.

gallais commented 4 years ago

Implement TTC Bits32 in terms of explicit casts to/from Int which would basically do what it's doing now but a tiny bit more explicit. Though this feels like just hiding/moving the awkward casts to a different place.

I think it's better to have this kind of magic in low level core libraries that expose a type-safe interface so I would go with this option. Let's maybe wait for @edwinb's take on this?

edwinb commented 4 years ago

The performance issues are probably not a huge worry, because they'd only come into play when serialising an integer bigger than 32 bits. This function doesn't show up prominently in the profile.

To be honest, I wouldn't overthink this one anyway, since it'll need to be a bit different when we port everything to Idris 2, and that is getting closer to being possible with the recent improvements in the runtime.

One thing we definitely need, for this and other bit twiddling related things, is support in Idris 2 for integer types with specific bit widths. I think @evertedsphere is on the case here.

cypheon commented 4 years ago

I changed toLimbs and fromLimbs to produce/consume a List Bits8 and rebased onto current master. This should work correctly and I think the code is quite understandable as well.

edwinb commented 4 years ago

Interestingly, the only test failure I have in Idris2-SH is because of exactly the bug this is fixing! I don't know what's up with the CI because the failure is unrelated. I'll merge this shortly anyway. Thanks!

edwinb commented 4 years ago

Oh yes, of course, the failure is the test where I accidentally left my home directory hard coded. There's a fix for that - will tidy up. Thanks!