Open paulstansifer opened 9 years ago
Implement an alpha-respecting hash function; figuring out where to use it comes later. Look at equal-hash-code
(and look at the second one)
@paulstansifer: Have there been any progress on this issue? I need this for my own work and I am happy to fix this bug if there isn't a fix already.
It seems that alpha-respecting hash function already exists in binding-forms.rkt
: the α-equal-hash-code
function. Replacing snip-cache
hash with custom hash based on the mentioned function should do the trick, I think.
Please do! Making α-equal-hash-code
was easy, but I never figured out what to do with it.
I am slightly confused: I have defined a language L
that contains lambda calculus. I have tested α-equal?
function:
> (α-equal? (compiled-lang-binding-table L) match-pattern (term (λ x x)) (term (λ y y)))
#t
On the other hand, α-equal-hash-code
returns different values for alpha-equivalent terms:
> (α-equal-hash-code (compiled-lang-binding-table L) match-pattern (term (λ x x)))
122630835892628
> (α-equal-hash-code (compiled-lang-binding-table L) match-pattern (term (λ y y)))
122635447880559
Because of that, custom hash is not working as expected:
> (define h (make-custom-hash (lambda (x y) (α-equal? (compiled-lang-binding-table L) match-pattern x y))))
> (dict-count h)
0
> (dict-set! h (term (λ x x)) 1)
> (dict-count h)
1
> (dict-ref h (term (λ x x)))
1
> (dict-set! h (term (λ y y)) 2)
> (dict-count h)
1
> (dict-ref h (term (λ x x)))
2
> (define h (make-custom-hash (lambda (x y) (α-equal? (compiled-lang-binding-table L) match-pattern x y)) (lambda (x) (α-equal-hash-code (compiled-lang-binding-table L) match-pattern x))))
> (dict-set! h (term (λ x x)) 1)
> (dict-count h)
1
> (dict-set! h (term (λ y y)) 2)
> (dict-count h)
2 ; should not happen!
Is this intended? Am I doing something wrong? For the reference, definition of the L
language can be found in my public repository: https://bitbucket.org/sznurek/lambda-mu-redex/src/e8d14810119e147199947cfc9cab34015b505c08/lambda-mu.rkt?fileviewer=file-view-default#lambda-mu.rkt-4
That sounds like a bug in my hashing code. I'll take a look at it.
I tried to reproduce your situation exactly, but I got the same hash code for both: https://gist.github.com/paulstansifer/00c1a0e3de08a16dbe93
When I run it, I get
#t
117428840964850
117428840964850
Could you send me a complete file to look at?
Another thing that might be helpful would be to modify binding-forms.rkt
to export canonicalize
and report what the result of calling canonicalize
instead of α-equal-hash-code
is (it takes the same arguments in the same order).
I have played with this code inside DrRacket's REPL. I have run (CTRL-R) the code with language definition, then executed the following:
> (require redex/private/matcher)
> (require redex/private/binding-forms-compiler)
> (require redex/private/term-fn)
> (require redex/private/lang-struct)
> (require redex/private/binding-forms)
> (α-equal-hash-code (compiled-lang-binding-table L) match-pattern (term (λ x x)))
119312564214743
> (α-equal-hash-code (compiled-lang-binding-table L) match-pattern (term (λ y y)))
119365168470767
Anyway, when I run the code you have attached from the shell (racket ~/Code/reproduction-attempt.rkt
) I get correct results. Thank you!
This is so weird. In DrRacket (with binding-forms.rkt modified to export canonicalize
and to print the canonicalization when calculating hash codes):
> (α-equal-hash-code (compiled-lang-binding-table L) match-pattern (term (λ x x)))
==(λ |1| |1|)==
120218973210905
> (α-equal-hash-code (compiled-lang-binding-table L) match-pattern (term (λ y y)))
==(λ |1| |1|)==
120242422939934
...so it looks like equal-hash-code
is looking at something "invisible" in those canonicalizations...
> (define cx (canonicalize (compiled-lang-binding-table L) match-pattern (term (λ x x))))
> (define cy (canonicalize (compiled-lang-binding-table L) match-pattern (term (λ y y))))
> cx
'(λ |1| |1|)
> cy
'(λ |1| |1|)
> (equal? cx cy)
#t
> (equal-hash-code cx)
120248800697966
> (equal-hash-code cy)
120248800697966
...but storing the canonicalization seems to make the problem go away (!?). This makes no sense, given that α-equal-hash-code
does nothing but call equal-hash-code
on the output of canonicalize
.
Maybe there's a bug in equal-hash-code
?
Okay, this is definitely a problem in equal-hash-code
:
> (equal-hash-code (string->symbol "1"))
1165559
> (equal-hash-code (string->symbol "1"))
1165693
Reported a bug against Racket: http://bugs.racket-lang.org/query/?cmd=view&pr=15262
Looks like whether this is a bug in Racket is debatable. I found an unrelated flaw in canonicalization, so I fixed it and this problem with one stroke; should be visible in the latest master (156bbd1).
I have created pull request with my proposed solution - happy to get feedback on this!
Robby says: "One thing to also look into for Redex: in addition to an alpha-sensitive equality, we also need an alpha-sensitive hashing. I think it's straightforward: just normalize and then call the equal-based hash function that's built into Redex.
We'll need this for traces (the way it collapses different paths to the same term is based on hashing)."