dafny-lang / dafny

Dafny is a verification-aware programming language
https://dafny.org
Other
2.92k stars 262 forks source link

Implement lazy sequence concatenation in remaining backends #2390

Open robin-aws opened 2 years ago

robin-aws commented 2 years ago

This is currently only implemented for C# and Java, but is a pretty critical optimization.

I believe it is feasible to implement this logic once in Dafny code instead of N times in each target language.

robin-aws commented 2 years ago

Implementation strategy is as described in #1775

robin-aws commented 2 years ago

Here are more details on the implementation strategy:

Sequences are currently implemented in each runtime via one or more target language classes or similar datatypes, such as ISequence, Sequence, ArraySequence, ConcatSequence etc. in C#. I have a similar common implementation in Dafny code, which will be compiled to each target language as externs with similar names. Then it is only a matter of tweaking code generation for each language to target the Dafny-generated types instead.

The remaining challenges that must be addressed are:

  1. Circular dependency: the Dafny Sequence<T> code requires a fully-built dafny tool, but the core DafnyPipeline project currently depends on the DafnyRuntime project. The solution will likely be to build any Dafny runtime code with a previously-released version of Dafny, and to check in the generated code (with a CI mechanism to regenerate it and verify it matches).
  2. Monomorphization for numeric types: the existing hand-rolled sequence implementations all at least provide versions of methods like Select and Count that accept and/or produce unbounded integer types, such as BigIntegers in C#. Several also provide multiple overloads or alternate versions that use more primitive types such as int or long. When compiling Dafny source code that uses newtype declarations that align with native types, these overloads help avoid the overhead of casting to and from such BigInteger types. The solution here will be for each compiler to track the set of native numeric types to use for indexing, and to generate these overloads for each such type.
  3. Covariance of trait type parameters: the seq<T> value type is implicitly covariant in its single type parameter: S <: T implies seq<S> <: seq<T>. Therefore the equivalent trait must be declared as Sequence<+T>, but variance symbols on traits are currently rejected by the resolver. This support will be at least partially added in order to support the Sequence<+T> type.

The features used to solve 2. and 3. will likely be enabled through CLI options marked as experimental, to avoid too much scope creep.

keyboardDrummer commented 2 years ago

Circular dependency

What's the long term setup you envision for this?

Would it be something like:

DafnyPipeline:, has the ability for runtimes to be injected, but does not reference DafnyRuntime DafnyRuntime: uses DafnyPipeline to compile .dfy files, without injecting any runtimes into DafnyPipeline DafnyDriver: references DafnyRuntime and uses it to inject runtimes into DafnyPipeline.

robin-aws commented 2 years ago

Yup, something like that, once we break the hard .csproj-level dependency of DafnyPipeline on DafnyRuntime.

We probably want to make the driver fetch the required runtime from nuget/maven/npm/etc rather than packaging them all inside Dafny itself, but have DafnyRuntime's build pass /useRuntimeLib to disable that behavior.

cpitclaudel commented 2 years ago

the core DafnyPipeline project currently depends on the DafnyRuntime project.

Isn't that dependency spurious, at the moment? Things build fine for me if I just remove it.

robin-aws commented 2 years ago

Semi-spurious, yes :) It's a packaging dependency rather than a true build dependency, because ultimately DafnyDriver needs to embed the source files so that it can emit them unless /useRuntimeLib is specified.

As I'm implementing I'm realizing it makes sense to clean up this dependency as a separate initial change first. We need to also break the assumption in some compilers that the runtime is a single source file like DafnyRuntime.cs or DafnyRuntime.js anyway, if I'm going to start generating some of that source code from Dafny target code spilling.