Wasm-DSL / spectec

Wasm SpecTec specification tools
https://wasm-dsl.github.io/spectec/
Other
26 stars 9 forks source link

Readability of generated sources #92

Closed tlively closed 5 months ago

tlively commented 5 months ago

Hi, everyone πŸ‘‹

I just finished reading https://people.mpi-sws.org/~rossberg/papers/spectec1.pdf and I have some questions about the readability of the reST and OCaml sources SpecTec generates. For context, I am trying to get a better understanding of what the proposal authoring and spec maintenance experience will be using SpecTec so I can have an informed opinion about adopting it for the upstream Wasm spec.

A strong argument in favor of adopting SpecTec is that it is a low-risk, reversible decision because we can just snapshot the generated sources and go back to manual development if SpecTec for some reason cannot handle our future needs. How easy would that be, though?

  1. Am I correct that the generated sources are not expected to be checked in to the upstream spec repo, but rather treated as ephemeral artifacts of the build process? Normally it would be good to avoid checking in generated sources, but in this case, having the generated sources in source control would help ensure that they are somewhat readable and would de-risk returning to manual editing if that ever becomes necessary. Are there other arguments for or against checking in the generated sources?

  2. The current spec uses the macros from macros.def to improve readability of the source. Does SpecTec use these macros as well in its generated source? If not, would it be possible and desirable to make it use the macros?

  3. I know that SpecTec generates a meta-interpreter, but the upstream spec currently has a non-meta interpreter. I imagine that if we ever have to go back to manual development, starting with the generated meta-interpreter would be more difficult than if we had a non-meta interpreter. Is that right? If so, would it be possible to make the SpecTec interpreter backend a meta-compiler that compiles the AL to a non-meta interpreter rather than emitting a meta-interpreter?

I'm also curious about how valuable other folks think the readability of the generated sources is.

rossberg commented 5 months ago

Hi Thomas, good questions.

  1. Am I correct that the generated sources are not expected to be checked in to the upstream spec repo, but rather treated as ephemeral artifacts of the build process? Normally it would be good to avoid checking in generated sources, but in this case, having the generated sources in source control would help ensure that they are somewhat readable and would de-risk returning to manual editing if that ever becomes necessary. Are there other arguments for or against checking in the generated sources?

Right, currently they are just ephemeral, but of course it would be easy to check them in. Not sure what advantage that would have, though. I don't think the ability to potentially snapshot them in the future depends on that?

  1. The current spec uses the macros from macros.def to improve readability of the source. Does SpecTec use these macros as well in its generated source? If not, would it be possible and desirable to make it use the macros?

SpecTec can generate output with or without invoking (externally defined) macros. To handle cross-references, we will use the existing macros. They are also important for space reasons (cf. the MathML version of the spec taking excessive space, one big reason for which is that it can't use macros and hence has to massively replicate all the cross-references).

That said, I'd also like to generate pieces of the macros file eventually, but we're not there yet.

  1. I know that SpecTec generates a meta-interpreter, but the upstream spec currently has a non-meta interpreter. I imagine that if we ever have to go back to manual development, starting with the generated meta-interpreter would be more difficult than if we had a non-meta interpreter. Is that right? If so, would it be possible to make the SpecTec interpreter backend a meta-compiler that compiles the AL to a non-meta interpreter rather than emitting a meta-interpreter?

Ah, the meta interpreter, by nature of being an interpreter and being meta, is not generated but manually written. Because it is meta, i.e., interprets AL not Wasm, it is generic in the actual spec rules (modulo various caveats I mentioned in the talk).

It complements not replaces the existing reference interpreter. In particular, the reference interpreter implements (to the approximation possible) the declarative formal rules, whereas the AL interpreter essentially runs the prose. Both is relevant.

We indeed have discussed the possibility of actually generating (parts of) the actual reference interpreter, but that would be way out at this point. And it is not even clear if it's a good idea to go there, e.g. for process workflow reasons (at least historically, proposals have been implemented and executed in the interpreter long before their spec was written out).

I'm also curious about how valuable other folks think the readability of the generated sources is.

The generated reST and Latex are not significantly worse than the handwritten. For example, here is what currently would be generated for execution of select with macros turned on:

:math:`\SELECT~{{{\mathit{t}}^\ast}^?}`
.......................................

1. Assert: Due to validation, a value of value type :math:`\I32` is on the top of the stack.

2. Pop the value :math:`(\I32.\CONST~{\mathit{c}})` from the stack.

3. Assert: Due to validation, a value is on the top of the stack.

4. Pop the value :math:`\val_{{2}}` from the stack.

5. Assert: Due to validation, a value is on the top of the stack.

6. Pop the value :math:`\val_{{1}}` from the stack.

7. If :math:`{\mathit{c}}` is not :math:`0`, then:

    a. Push the value :math:`\val_{{1}}` to the stack.

8. Else:

    a. Push the value :math:`\val_{{2}}` to the stack.

.. math::
   \begin{array}{@{}l@{}rcl@{}l@{}}
   & \val_{{1}}~\val_{{2}}~(\I32.\CONST~{\mathit{c}})~(\SELECT~{{{\mathit{t}}^\ast}^?}) &\hookrightarrow& {\val_{{1}}
     &\qquad \mbox{if}~{\mathit{c}} \neq 0 \\
   & \val_{{1}}~\val_{{2}}~(\I32.\CONST~{\mathit{c}})~(\SELECT~{{{\mathit{t}}^\ast}^?}) &\hookrightarrow& \val_{{2}}
     &\qquad \mbox{if}~{\mathit{c}} = 0 \\
   \end{array}

Some artefacts (like redundant uses of mathit and unnecessary braces) could still be improved by tweaking the renderer.

tlively commented 5 months ago
  1. Am I correct that the generated sources are not expected to be checked in to the upstream spec repo, but rather treated as ephemeral artifacts of the build process? Normally it would be good to avoid checking in generated sources, but in this case, having the generated sources in source control would help ensure that they are somewhat readable and would de-risk returning to manual editing if that ever becomes necessary. Are there other arguments for or against checking in the generated sources?

Right, currently they are just ephemeral, but of course it would be easy to check them in. Not sure what advantage that would have, though. I don't think the ability to potentially snapshot them in the future depends on that?

My thought was that having them checked in (and therefore visible in code review) would encourage us to keep them readable, although I guess this is not as useful for normal changes to the spec compared to changes to the renderer. I agree that we could make a snapshot either way.

It complements not replaces the existing reference interpreter. In particular, the reference interpreter implements (to the approximation possible) the declarative formal rules, whereas the AL interpreter essentially runs the prose. Both is relevant.

Makes sense. If the reference interpreter is not being replaced for now and will remain part of the proposal authoring process, then that resolves any concerns I had about the meta-interpreter.

Thanks!