APrioriInvestments / typed_python

An llvm-based framework for generating and calling into high-performance native code from Python.
Apache License 2.0

TP's hash function is not the same as python's #406

Open braxtonmckee opened 1 year ago

braxtonmckee commented 1 year ago

Right now, the hash function we use inside of Dict and Set is not the same as the Python one. This is a problem because if you make a Dict(object, T), it'll use a different hash for strings and other builtins than if you use Dict(str, T), which can lead to subtle issues.

Ideally, we'd exactly match python's hash function. To do this we would need to change our hash function to return int64 (or maybe it's bigger? need to check) instead of int32, and ensure that we implement the same hashing logic as python itself in both the interpreter and the compiler.
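On the "need to check" point, here is a quick sketch for inspecting how wide Python's hash values are on a given build. It only uses the standard `sys.hash_info` structure and the builtin `hash`; the helper `fits_in_signed` and the sample value `big` are made up for illustration and are not part of TP.

```python
import sys

# Width in bits of the values hash() returns on this interpreter build
# (64 on a typical 64-bit CPython, 32 on a 32-bit build).
hash_bits = sys.hash_info.width

# Prime modulus used by Python's numeric hashing scheme.
modulus = sys.hash_info.modulus


def fits_in_signed(value, bits):
    """Return True if value is representable as a signed integer of `bits` bits."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


# Hypothetical sample key, chosen to be larger than any int32.
big = 2**40 + 7

print("hash width in bits:", hash_bits)
print("hash(big) fits in hash width:", fits_in_signed(hash(big), hash_bits))
print("hash(big) fits in int32:", fits_in_signed(hash(big), 32))
```

If the last line prints `False` on your build, then a hash function returning int32 cannot reproduce `hash(big)` exactly there, whatever logic it uses internally.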

That way, if the compiler only knows a value's type as 'object', it can just call the builtin 'hash' and know it will get results that are consistent with the hash function being used inside of TP.
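The consistency property described here could be checked with something like the sketch below. It is plain Python and does not call into TP: `find_hash_mismatches` and `int32_hash` are hypothetical names, and `int32_hash` is only a stand-in for "some 32-bit hash", not TP's actual implementation.

```python
def find_hash_mismatches(candidate_hash, values):
    """Return the values for which candidate_hash disagrees with the builtin hash."""
    return [v for v in values if candidate_hash(v) != hash(v)]


def int32_hash(value):
    """Hypothetical stand-in: the builtin hash reduced to a signed 32-bit value."""
    h = hash(value) & 0xFFFFFFFF
    return h - 2**32 if h >= 2**31 else h


samples = [0, 1, -1, 2**40 + 7, "hello", 3.5, (1, 2)]

# A hash that matches Python's exactly produces no mismatches.
print(find_hash_mismatches(hash, samples))

# A 32-bit hash agrees only where the builtin hash already fits in int32.
print(find_hash_mismatches(int32_hash, samples))
```

A test along these lines, run against the real hash in both the interpreter and compiled code, would be one way to catch any divergence between the two paths.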