querycert / qcert

Compilation and Verification of Data-Centric Languages
https://querycert.github.io/
Apache License 2.0
56 stars 9 forks source link

InstanceOf operator #95

Open jeromesimeon opened 6 years ago

jeromesimeon commented 6 years ago

Wouldn't it be nice to have an instance of operator (for a given statically known type)? Something that might look like d instanceof t If I'm correct, this could be implemented as:

  1. infer the type t' of data d (for a given normalized data) -- see in TDataInfer
  2. check whether t' is a subtype of t -- see in RSubtype

I believe this is a sound procedure thanks to the lovely theorem in TDataInfer:

  Theorem infer_data_type_least {d τ₁ τ₂} : 
    infer_data_type d = Some τ₁ ->
    d ▹τ₂ ->
    τ₁ <: τ₂.
shinnar commented 6 years ago

I am fine with this, if someone (@jeromesimeon?) is interested in implementing it. In theory, it is reasonable, and can be implemented in Coq along the lines you suggest.

In practice, note that this complicates things like extraction and (de)serialization and such. I believe this is the first operator/language feature that embeds a type, other then tDNNRC. As a result, doing this may require some enhancement to (de)serialization and runtime support.

It will also require that operators be parameterized over a foreign type and a brand relation (since these are both needed in order to construct an rtype). This will ripple changes throughout the code for languages, the compiler frontend modules, ocaml...

Note that you will need to implement this type test in the runtimes for our current backends (java, javascript). This, btw, forces the runtime to discriminate/tag data with types. Currently, since the code is known to be well-typed, the runtime could in theory conflate" values of different types. Adding this operator neccesitates RTTI/tagging. I am ok with this, but want to be clear.

I assume that the instanceof operator returns an either, similar to the branding casting operator. In the success branch it will provide the given data, but with an enriched type (since we don't have flow-sensitive typing, returning a boolean is not nearly as useful).

Also, it would be nice to add optimizations for this (although that could become a separate issue). If we can statically determine that the check is true/false, we can optimize the check and replace with the appropriate constant. We should then be able to optimize the likely subsequent Either match against a constant (this should already work, but should be verified).

jeromesimeon commented 6 years ago

Thanks for all the feedback. Quite a bit more work than I naively thought. I'll keep it in mind, and may sign up for it at some point.