Parameterized modules redesign

yav commented 4 years ago

Overview

This ticket outlines our current thinking on the next iteration of Cryptol's parameterized modules system.

The main differences to the current design is:

~~eliminate the "backtick" imports~~ "backtick" imports become syntactic sugar for F { _ } where _ indicates that when instantiating a module, the corresponding parameters should be added to declartions, as in backtick imorts
change the way parameterized modules are instantiated
allow nested modules
add syntactic sugar for a common use pattern.
~~Make the experimental secrete newtype feature obsolete~~ After some deliberation we decided to make newtype part of the language instead, as it seem quite useful.

As a result, we end up with three flavours of modules: normal modules, parameterized modules, and module instantiations. The three types of modules may be arbitrarily nested with semantics described later.

Module Instantiation

A module instantiations is a module which is derived by instantiating a parameterized module. Here is an example:

module F where
  parameter
    type N : #
    x      : [N]

  y : [N]
  y = x + 1

module P where
  type N = 8

  x : [N]
  x = 2

module A = F P

Here we have a parameterized module F, a normal module, P, which provides definitions for F's parameters, and a module instantiation A which defines a new module A as an instance of F with parameters provided by P

Questions

Is it OK if P is a parameterized module? If yes, then perhaps the instantiation would be automatically parameterized by P's parameters.
Does P need to define all of Fs parameters or do we allow partial applications?

I think we should allow 1 and disallow 2, as it seems error prone.

Generative or Applicative

Module instantiation is generative which means that each module instantiation defines a new module, even if it is using the same parameterized module with the same parameters.

Cryptol only provides a way to name expressions and types, but there is no way to create new abstract definitions (except for the experimental newtype feature which we now consider obsolete). As a result, there is no significant difference between applicative or generative behavior.

The difference does show up when we resolve names though, as illustrated by this example:

module A = F P

module B = F P

module C where
  import A
  import B

  z = y -- error `y` is ambiguous because `A` and `B` are different modules

Nested Modules

Modules may be nested within each other, as illustrated by the following example:

module Ex1 where

  x = 0x16

  module B where
    y = x + 1

  import B

  z = y

Important things to note:

All names available in outer modules are also in scope in inner modules.
Inner module declarations may shadow names from outer modules.
Inner modules may be imported only from their immediate enclosing module
All recursion must be within a single module. In the above example, x may not depend on y as that would lead to cross module recursion.
import declarations may be anywhere in a module, not just at the top.
Inner modules shadow external modules, no matter where the import is located in the file.

Nested modules make it easy to instantiate parameterized modules. Here is an example:

module Ex2 where

  module Q where
    type N = 16

    x : [N]
    x = 2

  module B = F Q

  import B

  ex2 = y + y   -- this is `y` from the instantiation of `F`

Syntactic Sugar

We also introduce some syntactic sugar for a few common patterns.

Convenient import of parameterized modules:

  import F as Q where
    (decls)

is desugared to:

  module Anon1 where
    (decls)

  module Anon2 = F Anon1

  import Anon2 as Q

As usual, the as Q part is optional.

Convenient import of a local module:

  import module A where
    (decls)

is syntactic sugar for:

  module Anon where
    (decls)

  import Anon as A

yav commented 4 years ago

Just a note so we don't forget: we may want to also add support for some sort of signatures, which would allow us to name a group of parameters, and reuse the abstraction.

One way to do that might be to think of a module that declares only parameters and type synonyms as a "signature". To use a signature one could simple import it. We would also have to modify the instantiation of modules to allow for multiple named instantiations. Here is an example:

-- Signature
module Sig where parameter x : Integer

-- Functor parameterize by a signature
module Functor where
  import Sig as A
  import Sig as B

   z = A.x + B.x

module I1 where x = 2
module I2 where x = 3

-- Instantiating a functor
module Apply = Functor { A = I1, B = I2 }

brianhuffman commented 4 years ago

I want to make sure that the design we come up with actually addresses #585: We want to be able to have a top-level parameterized module that passes along a subset of its parameters to another parameterized module that it builds on. To make things concrete, let's say that we have a parameterized module F with parameters n and x0 that we want to import from G, which has a superset of those parameters.

Using the design described above, here's how that would look:

module F where
  parameter
    type n : #
    x0 : [n]

module G where
  parameter
    type n : #
    x0 : [n]
    f : [n] -> [n]

import F
  where
    type n = n
    x0 : [n]
    x0 = x0

<more declarations>

This seems mostly reasonable, but the name scoping in the where clause of the import F declaration is confusing. The type n = n declaration looks like it's recursive, but we actually want the n on the right-hand side to refer to the parameter n of module G, not the local n. Same for the declaration x0 = x0: We don't want a recursive definition, we want to refer to the parameter x0 of the outer module. As for the type signature x0 : [n], I'm not sure which n I really want this to refer to. They happen to be the same in this case, but imagine if G had two parameters m and n, and you wanted to import F with type m = n and type n = m; that would be very confusing.

As the names of parameters are a visible part of the API of a parameterized module, we won't always be able to avoid name clashes like this. So I think we need a way to distinguish module parameters from similarly named declarations in inner-modules.

weaversa commented 4 years ago

A solution that immediately comes to mind, and one that I'd actually be happy with is making the as clause on the import be necessary. That would stop naming collisions and force specs to be clearer. I was just reading some specs and noticed myself wondering "where did that function even come from?" Though, maybe this would seriously ugly up @yav's floating point module.

brianhuffman commented 4 years ago

Requiring as on all imports wouldn't fix the problem, as the name clash in my example doesn't involve any imported module; rather it is a clash between a module's parameters and an inner module's declarations. Perhaps it is easier to see if I expand the syntactic sugar:

module G where
  parameter
    type n : #
    x0 : [n]
    f : [n] -> [n]

module Anon where
  type n = n
  x0 : [n]
  x0 = x0

module Anon2 = F Anon

import Anon2

Qualifying the import of Anon2 doesn't help, because the name clash is between the parameters of the outer module G and the contents of the inner module Anon (which is used as the argument to parameterized module F).

The solution I had in mind was to have a module name for the argument to module G, so you could refer to the parameters with qualified names. Something like this:

module G(A) ... where

module B where
  type n = A::n
  x0 : [n]
  x0 = A::x0

import F(B) as C

yav commented 4 years ago

If we had signatures as in the comment I wrote above, we could do it like this, I think:

-- signature
module A where
  parameter
    type n : #
    x0 : [n]
    f : [n] -> [n]

module G where
  import A as A    -- this adds the parameters to G

  module B where
    type n = A::n
    x0 : [n]
    x0 = A::x0

  import F(B)

I am not too sure how I feel about the notation this uses, and if we do use this notation we probably do want to have syntactic sugar for defining and importing a signature at the same type as this seems quite common.

brianhuffman commented 4 years ago

Here's another syntax idea: We could specify a module name to qualify a parameterized module's arguments by attaching the name to the parameter block:

module G where
  parameter A where
    type n : #
    x0 : [n]
    f : [n] -> [n]

module B where
  type n = A::n
  x0 : [n]
  x0 = A::x0

import F(B) as C

In the body of G, the names of types and values declared in the parameter A block would be in scope with the A:: qualifier. We would also have the module identifier A in scope, which could itself be used directly as the argument to another imported parameterized module (e.g. import H(A) as D).

The form with an unnamed parameter block

  parameter
    <decls>

could be treated as syntactic sugar for

  parameter Anon where
    <decls>
  import Anon

With this syntax, I think we could get by just fine without needing named signatures.

brianhuffman commented 4 years ago

Something else I was thinking about: People might want to be able to chain together two or more parameterized modules in a single import declaration, e.g.

module A where <decls>

import G(F(A)) as B

This would use the instantiated module F(A) as the input parameters for the other parameterized module G. We could either allow compound module expressions like this, or alternatively we could require users to name each intermediate module like the following:

module A where <decls>

module FA = F A
import G FA as B

brianhuffman commented 4 years ago

This reminds me of another point: Is there (or does there need to be) any useful difference between the declarations import F(X) as A and module A = F(X)? If you're going to define a local module name A, it seems like it would never hurt to also bring qualified names A::ident into scope; conversely, if you're going to do a qualified import like import Foo as A, this might as well bring the module identifier A into scope. So maybe we don't need both syntactic forms?

weaversa commented 4 years ago

This reminds me of another point: Is there (or does there need to be) any useful difference between the declarations import F(X) as A and module A = F(X)? If you're going to define a local module name A, it seems like it would never hurt to also bring qualified names A::ident into scope; conversely, if you're going to do a qualified import like import Foo as A, this might as well bring the module identifier A into scope. So maybe we don't need both syntactic forms?

If possible, I suggest supporting import F(X) as A and removing module A = F(X).

msaaltink commented 4 years ago

While you are redesigning parameterized modules, a feature that might be useful is the ability to have a type constraint as a parameter. This comes up in examples like HMAC and HKDF that are parameterized by a hash function. We might like to write something like

module M where

    parameter
        type hash_len: #
        type constraint Hashable: # -> Bit
        hash: {n} (Hashable n) => [n][8] -> [hashlen][8]

    f: {n} (Hashable (n+8)) => [n][8] -> [hashlen][8]
    f x = hash ("abcdefgh" # x)

and only specify the definition of Hashable when we instantiate, as it will indeed depend on which hash function we pick. SHA256, SHA384, SHA512, and the various versions of SHA-3 all have different constraints.

weaversa commented 4 years ago

I absolutely second Mark’s idea. I’ve had trouble using hash functions as parameters for this same reason.

atomb commented 3 years ago

This is related to issues #546, #555, #556, #865, #998, #1042, #1054, #585, #689, #1237, and #377.

GaloisInc / cryptol