Automation of encoding as `list int`

qaphla commented 1 month ago

In several places, we encode various objects --- bytes, usages, labels --- as list int for the integer_encodable typeclass and the ordering it provides.

The structure of these encoding functions is quite formulaic and would be relatively easy to automate --- see, e.g., https://github.com/REPROSEC/dolev-yao-star-extrinsic/blob/452210f88ac1180710f93ebecbb6394e38b970fa/src/core/DY.Core.Bytes.Type.fst#L80-L103

A harder issue for automation is the needed injectivity lemma --- it seems like the proofs of these can easily get out of hand for the SMT solver if not done carefully. Maybe there's still some nice structure to use that is efficient. In particular, if can can generate these proofs automatically, then we can use proof structures that would be tedious to write out by hand if they improve the SMT efficiency.

See below for an example of injectivity. https://github.com/REPROSEC/dolev-yao-star-extrinsic/blob/452210f88ac1180710f93ebecbb6394e38b970fa/src/core/DY.Core.Bytes.Type.fst#L106-L156

TWal commented 1 month ago

Indeed, this could be automated, but I'm not sure it is worth spending too much efforts on this. We only need integer encodability of bytes (for cryptographic primitives that feature commutativity, such as Diffie-Hellman), and the other instances are only dependencies for that purpose (usage, label, pre_label). If we often need to write such instances, it would be useful to automate it (e.g. this is why Comparse has a tactic for message formats), but here I don't think we will need more instances.

However we will need to maintain the current instances, but I think it's not too bad: in most instances, I tried to use a lot of SMT automation to have proofs of constant size (with respect to the number of constructor of the type we encode), using for example introduce forall etc. For big types such as bytes and usage, I found out that the SMT would explode passed some number of constructors, and switched to another proof style which is of linear size, but mostly factored. I think this style is gentle on the SMT and easy to maintain (one line to add for each new constructor).

We also need to take into account that tactics also have to be maintained, as in Comparse I found out that the tactics API may change with new F* versions.

TWal commented 1 month ago

@qaphla what do you think about it? I will close the issue if you agree with me that the current state of things is not worth spending a lot of time on a meta-program.

qaphla commented 1 month ago

I would agree that it is probably not worth the time right now, but would argue that it makes sense to leave an open issue (maybe with a tag to mark it as low priority) in case it comes up again. Of course, one can still reference closed issues, but my experience is that they don't tend to get looked at again once closed.

Then again, I believe it's possible to reopen issues, so if it comes up again, that's always a possibility, in which case closing it as wontfix probably makes most sense.

TWal commented 1 month ago

Ok, let's close it for now and if we realize that the encoding proofs are difficult to maintain then we can reopen it.

REPROSEC / dolev-yao-star-extrinsic

Automation of encoding as `list int` #25