idris-lang / Idris2

A purely functional programming language with first class types
https://idris-lang.org/
Other
2.52k stars 375 forks source link

Partial evaluator anomaly #1965

Open danielkroeni opened 3 years ago

danielkroeni commented 3 years ago

The %spec annotation leads to surprising effects.

Steps to Reproduce

Compile and run the given program: https://gist.github.com/danielkroeni/6d5386f8a8ad7be15713725f4f9bf01a Observe that the output is State: 6.

Then in the same program enable the %spec ma,mb annotation on (>>). Compile and run again. Now the output is State: 4.

Expected Behavior

I would expect that the %spec annotation to have no observable effect. (But I may be wrong.)

Observed Behavior

There is an observable effect, the computed results differ and seem to be wrong when %spec ma,mb is enabled.

Some more anomalies

  1. Let %spec ma,mb enabled but add an additional (from my beginner point of view) totally unrelated import Data.List to the top of the file and it produces again the correct result. This may have something to do with "things behave differently if name needs to be disambiguated" (e.g. https://github.com/idris-lang/Idris2/issues/1896).

  2. And here is an even stranger effect: Given the code here in two file: https://gist.github.com/danielkroeni/d082fd3795345f77de740b08902f0e61 Compile and run problem.idr and it produces the expected result. Then change the name of the function xxxx to xxx in both files and it generates a wrong result. That's confusing.

Version

> idris2 --version
Idris 2, version 0.5.1-a9ccf4db4
ziman commented 3 years ago

I compiled the example with and without %spec, using --dumpcases:

$ diff dumpcases-spec.idr dumpcases-nospec.idr 
0a1
> {csegen:17} = []: (%lam {eta:0} (Main.add [1, !{eta:0}]))
6,7c7,9
< Main.comp = [{ext:0}]: (_PE.PE_>>_0 [!{ext:0}])
< _PE.PE_>>_0 = [{ext:0}]: (%con [cons] Builtin.MkPair Just 0 [(+Int [(+Int [(+Int [(+Int [!{ext:0}, 1]), 1]), 1]), 1]), 0])
---
> Main.comp = [{ext:0}]: (Main.>> [({csegen:17} []), (%lam {eta:0} (Main.>> [({csegen:17} []), (%lam {eta:1} (Main.>> [({csegen:17} []), (%lam {eta:2} (Main.>> [({csegen:17} []), (%lam {eta:3} (Main.>> [({csegen:17} []), ({csegen:17} []), !{eta:3}])), !{eta:2}])), !{eta:1}])), !{eta:0}])), !{ext:0}])
> Main.add = [{arg:0}, {ext:0}]: (%con [cons] Builtin.MkPair Just 0 [(+Int [!{ext:0}, !{arg:0}]), 0])
> Main.>> = [{arg:1}, {arg:2}, {ext:0}]: (%case (!{arg:1} [!{ext:0}]) [(%concase [cons] Builtin.MkPair Just 0 [{e:2}, {e:3}] (!{arg:2} [!{e:2}]))] Nothing)

Here's what the xxx vs xxxx example looks like.

$ diff dumpcases_xxxx.idr dumpcases_xxx.idr
5,6c5,6
< Main.comp = [{ext:0}]: (_PE.PE_>>_41f9c421432e45e8 [!{ext:0}])
< _PE.PE_>>_41f9c421432e45e8 = [{ext:0}]: (%con [cons] Builtin.MkPair Just 0 [(+Int [(+Int [(+Int [(+Int [(+Int [(+Int [!{ext:0}, 1]), 1]), 1]), 1]), 1]), 1]), 0])
---
> Main.comp = [{ext:0}]: (_PE.PE_>>_0 [!{ext:0}])
> _PE.PE_>>_0 = [{ext:0}]: (%con [cons] Builtin.MkPair Just 0 [(+Int [(+Int [(+Int [(+Int [!{ext:0}, 1]), 1]), 1]), 1]), 0])

I find the zero hash quite suspicious.

danielkroeni commented 3 years ago

Is it the hash computed here? https://github.com/idris-lang/Idris2/blob/3536f8dab58e1a49745f6129ef16a5db29140c0b/src/TTImp/PartialEval.idr#L364

danielkroeni commented 3 years ago

I just looked at Core.Hash, read the comment:

-- As you can probably tell, I know almost nothing about writing good hash -- functions, so better implementations will be very welcome...

and further down at https://github.com/idris-lang/Idris2/blob/main/src/Core/Hash.idr#L27 I thought this 33 looks suspicious - this needs to be a prime number.

So I changed it to 31. Then I found that 33 is the "original" number from Daniel J. Bernstein's popular times 33 hash function (https://gist.github.com/hmic/1676398). I even know less about hash functions :)

When I use 31 then the result of executing with xxx is now State: 3 With the original 33 it was State: 4

So this hash function clearly influences what is going on here.

danielkroeni commented 3 years ago

I changed the hash function to the java standard:

  hash = hashWithSalt 7
  hashWithSalt h i = h * 31 + hash i

and replaced some other 33 factors with 31 and it solves the problems in the gist. BUT when more add 1 cases are added, the problem reappears. In my original code with much more invocations of >> the problem persists.

I guess we need a better hash function. @ziman thanks for spotting the problem!

Z-snails commented 3 years ago

I've just implemented SipHash24, which is rust's default and has good collision resistence. It's only 3 times slower than the current implementation. Should I replace every hashable instance with this, or only in the partial evaluator.

danielkroeni commented 3 years ago

Thank you for working on it! This sounds like a great improvement. I am not sure about the process. Who can decide about this? Maybe you should ask on discord?

danielkroeni commented 3 years ago

I think we should also either change the algorithm to handle hash collisions or use a collision resistant hash function. (Which would not result in an Int but something larger.)

danielkroeni commented 3 years ago

For the record: The obvious bug was fixed in https://github.com/idris-lang/Idris2/pull/2022. Nevertheless, a proper fix (as noted by @gallais) would need to handle hash collisions.