runtimeverification / k

K Framework Tools 7.0
BSD 3-Clause "New" or "Revised" License
430 stars 141 forks source link

[KIP] - Add a new hook for hashing K terms #2568

Open ChristianoBraga opened 2 years ago

ChristianoBraga commented 2 years ago

Motivation

KPlutus - https://github.com/runtimeverification/plutus-core-semantics

Not that I am aware of.

It will remove a potential bootleneck at KPlutus that now uses #unparseKORE from K reflection and Sha3_256 functions to hash KPlutus values.

This feature is already present in the LLVM backend. It would only need to be exposed to the user, as far as I know.

Example K Code

Function hashK for instance would receive any K term and return an integer. It could perhaps be declared as

syntax {Sort} Int ::= #hashK(Sort)

I am not sure about the declaration. It was inspired by the declaration of #unparseKORE.

Documentation

Function #hashK produces an integer for any given K term. It may help, for instance, on storing terms on a hash table to represent a heap in a programming language.

Potential Alternatives/Work-arounds

In the KPlutus project we use #unparseKORE and Sha3_256 to implement such a function. Another alternative could be to implement a function that transforms any given KPlutus term into a string and to implement a hash function, all in K.

Testing Approach

This function would could be easily tested in KPlutus simply by rewriting #uplcHash in module uplc-hash.md accordingly or simply replacing the call to #uplcHash by #hashK.

radumereuta commented 2 years ago

Here is the start of the discussion on slack. It has a lot more details about what is needed: https://runtimeverification.slack.com/archives/C7E701MFG/p1651087319515199 Probably needed for both Haskell and LLVM backends.

radumereuta commented 2 years ago

Please investigate further what @dwightguth is suggesting in the slack channel. There might be issues ensuring we have the same implementation in the Haskell and LLVM backends.

SchmErik commented 2 years ago

This issue is an optimization to a solution for creating an environment that maps variable names to terms.

If we create a map from variable names to the terms themselves as @dwightguth suggested, the heap could look like this. Note n, m, and p are variable names and v_1 is a term.

<heap>
  n |-> v_1
  m |-> v_1
  p |-> v_1
</heap>

This blows up the space consumption during program execution because v_1 can be repeated in the heap and some of these terms are massive.

To address this issue, our solution is to hash the term and create a map between the hash and the term itself and map the variable to the hash so the heap above looks like this:

<heap>
  n |-> hash(v_1)
  m |-> hash(v_1)
  p |-> hash(v_1)
</heap>

and there's a separate environment that maps hash(v_1) |-> v_1. Our current approach is to use #unparseKore and Sha3_256 to hash the terms however, this would be faster if there was a hook for this.

From our discussion today, we need to have this hash hook return the same values for haskell and llvm backends.

Urgency

We do not need this at the moment because we aren't experiencing any performance issues arising from our current hashing function. If this becomes an issue, we would like to prioritize this. If anyone has a better solution, please feel free to chime in!

dwightguth commented 1 year ago

@ChristianoBraga is this still useful to you?

ChristianoBraga commented 1 year ago

Yes it’s still useful.

Em qui., 20 de out. de 2022 às 12:11, Dwight Guth @.***> escreveu:

@ChristianoBraga https://github.com/ChristianoBraga is this still useful to you?

— Reply to this email directly, view it on GitHub https://github.com/runtimeverification/k/issues/2568#issuecomment-1285712069, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFNNCEKTBQKPB3B7OZS4UDTWEFOINANCNFSM5UT2NDJQ . You are receiving this because you were assigned.Message ID: @.***>