Automated Snippet Checking

DrewFenwick commented 3 years ago

At some point we want to have our Haskell snippets automatically compiled and checked as part of the CI checks to reduce the likelihood of releasing erroneous code examples.

There are a number of ways to write documents that serve as both document markup and compilable Haskell source. It would be particularly handy to compile snippets embedded in markdown directly, since we're already using markdown.

As well as checking that Haskell file style snippets compile, it would be beneficial to check that our GHCi session snippets are also correct.

There are no guarantees that a GHCi snippet that is correct at the time of writing will be correct forever. Here are some of the ways a dependency update could break our GHCi snippets:

A term could be changed or removed, leading to a GHCi error.
A term could be added, leading to an ambiguity error or non-exhaustive pattern match error.
A term's meaning could change, leading to a different output and making the documentation out-of-date.
A new major compiler version could alter syntax, type checking or language extension behavior, leading to a GHCi error.
A compiler update may change error messages or make formerly erroneous code compile, making error example snippets out of date.

It would be ideal to be able to detect all of these situations in the CI checks.

DrewFenwick commented 3 years ago

These are the results of my own research into how we might automatically compile and check our documentation.

File Style Snippets

markdown-unlit acts as a drop-in replacement for GHC's literate preprocessor to allow GHC to extract the Haskell snippets in markdown files.
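To make the idea concrete, here is a minimal sketch of the kind of extraction such a preprocessor performs: it keeps only the bodies of code blocks fenced as haskell and drops everything else. This is an illustration only, not markdown-unlit's actual implementation; `haskellBlocks`, `fence` and `sample` are hypothetical names, and the real tool handles more cases than this does.

```haskell
module Main (main) where

import Data.List (isPrefixOf)

-- Three backticks, built with replicate so this example can itself sit
-- inside a markdown code fence.
fence :: String
fence = replicate 3 '`'

-- | Collect the body of every code block whose opening fence is tagged
-- haskell. Blocks with any other tag, and all prose, are skipped.
haskellBlocks :: String -> [String]
haskellBlocks = go . lines
  where
    go [] = []
    go (l : rest)
      | (fence ++ "haskell") `isPrefixOf` l =
          -- Everything up to the next fence line is the block body.
          let (body, after) = break (fence `isPrefixOf`) rest
           in unlines body : go (drop 1 after)
      | otherwise = go rest

-- A tiny hypothetical lesson: one haskell block and one non-haskell block.
sample :: String
sample =
  unlines
    [ "# Lesson"
    , fence ++ "haskell"
    , "double :: Int -> Int"
    , "double x = x * 2"
    , fence
    , "Some prose."
    , fence ++ "text"
    , "not code"
    , fence
    ]

main :: IO ()
main = mapM_ putStr (haskellBlocks sample)
```

Running it prints only the two lines of the `double` definition, which is the part GHC would then be asked to compile.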

markdown-unlit sounds ideal for compiling Haskell file style snippets, but it comes with some minor complications to the build process:

A contributor will have to install an extra tool to compile the Haskell snippets.
GHC only compiles .hs and .lhs files, so we have to set up symbolic links to .md files to fool it.

Since we won't want to worry about manually making sure there is a symbolic link for every .md file, we would probably want a custom build script that creates them. This procedure will be OS dependent, so we need to think about if/how we enable the project to be built on multiple platforms.
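One way to sidestep a shell-specific script would be to write the link-creating step in Haskell itself. The following is a rough sketch under stated assumptions: it uses `createFileLink` from the `directory` package and helpers from `filepath` (both ship with GHC), it only looks at a single directory, and `lhsLinkFor` and `linkMarkdown` are hypothetical names. On Windows, creating symbolic links may need extra privileges, so this does not remove the platform question entirely.

```haskell
module Main (main) where

import Control.Monad (forM_, unless)
import System.Directory (createFileLink, doesPathExist, listDirectory)
import System.Environment (getArgs)
import System.FilePath (replaceExtension, takeExtension, (</>))

-- | Name of the .lhs link for a markdown file, or Nothing for other files.
lhsLinkFor :: FilePath -> Maybe FilePath
lhsLinkFor path
  | takeExtension path == ".md" = Just (replaceExtension path ".lhs")
  | otherwise = Nothing

-- | Create a foo.lhs link next to every foo.md in the given directory,
-- skipping links that already exist.
linkMarkdown :: FilePath -> IO ()
linkMarkdown dir = do
  entries <- listDirectory dir
  forM_ entries $ \entry ->
    case lhsLinkFor entry of
      Nothing -> pure ()
      Just link -> do
        exists <- doesPathExist (dir </> link)
        -- The target is relative, so it resolves against the link's directory.
        unless exists $ createFileLink entry (dir </> link)

main :: IO ()
main = do
  args <- getArgs
  -- Default to the current directory when no argument is given.
  linkMarkdown (case args of { [dir] -> dir; _ -> "." })
```

A real build script would probably also want to recurse into subdirectories and clean up stale links; both are left out here to keep the sketch short.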

GHCi Style Snippets

A lot of our code snippets are going to be example GHCi sessions, which means markdown-unlit will be unhelpful for automatically checking them.

Auto-inserting GHCi results

pandoc-markdown-ghci-filter is a pandoc filter that pipes Haskell snippets into GHCi and inserts the results into the output document.

If we could incorporate this into our build process then we'd have a guarantee that the output included in our lessons is what GHCi actually outputs, even as we update our dependencies and compiler versions.

Poking around the source, it looks like it only uses stack ghci to start a GHCi session, which as a Cabal user I think is less than ideal! It's a small and simple program though, so forking and improving it may be feasible.

Testing GHCi results

At best pandoc-markdown-ghci-filter assures us that our GHCi output snippets contain what GHCi actually outputs. What it doesn't do is tell us whether GHCi outputs what we expect it to. I haven't yet come across any tools that can test this.
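As a rough sketch of what such a check might look like, the program below splits a transcript into input and expected-output pairs, evaluates each input with `ghc -e`, and compares the result with what the transcript claims. Several things here are assumptions: the `ghci> ` prompt string, the names `Interaction`, `parseTranscript` and `check`, and the use of `ghc -e`, which evaluates each expression independently and so cannot see definitions made earlier in a session. It needs `ghc` on the PATH and the `process` package, and `readProcess` throws if GHC exits with an error.

```haskell
module Main (main) where

import Data.List (stripPrefix)
import Data.Maybe (isJust)
import System.Process (readProcess)

-- | One GHCi interaction: the expression typed and the output shown after it.
data Interaction = Interaction
  { input :: String
  , expected :: [String]
  }
  deriving (Eq, Show)

-- | Split a transcript into interactions, given the prompt string.
-- Lines that appear before the first prompt are ignored.
parseTranscript :: String -> String -> [Interaction]
parseTranscript prompt = go . lines
  where
    go [] = []
    go (l : rest) = case stripPrefix prompt l of
      Nothing -> go rest
      Just expr ->
        -- Output runs until the next line that starts with the prompt.
        let (out, more) = break (isJust . stripPrefix prompt) rest
         in Interaction expr out : go more

-- | Evaluate the input with `ghc -e` and compare with the expected output.
check :: Interaction -> IO Bool
check (Interaction expr want) = do
  got <- readProcess "ghc" ["-e", expr] ""
  pure (lines got == want)

main :: IO ()
main = do
  let transcript =
        unlines ["ghci> 1 + 1", "2", "ghci> reverse \"abc\"", "\"cba\""]
  results <- mapM check (parseTranscript "ghci> " transcript)
  print results
```

The parsing half is pure and easy to test on its own; the evaluation half is the part that would need rethinking to support multi-line sessions, imports, or a Cabal-managed package environment.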

We could just add a test suite that tests assertions on Haskell expressions rather than GHCi outputs, but if we want to be sure the GHCi output is correct we would have to try to manually make sure those tests always reflect the GHCi snippets.

Maybe we could make a code review policy that whenever a PR adds or changes a GHCi snippet, the reviewer has to check that the tests would still fail if GHCi stopped returning the output the snippet shows?

A potential complication with this approach is that the test suite can only access terms from file-style snippets. Expressions that only appear in GHCi entries will have to be duplicated in the test, which may incentivize us to push more definitions into the Haskell file style snippets rather than GHCi snippets.
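For illustration, a test of this kind could be as small as the sketch below. Everything in it is hypothetical: `double` stands in for a definition taken from a file style snippet, and each case duplicates an expression from a GHCi snippet together with the output the lesson shows. Comparing against `show` mirrors how GHCi prints values with a `Show` instance; the program exits with a failure code when any case disagrees, which is what a Cabal `exitcode-stdio-1.0` test suite expects.

```haskell
module Main (main) where

import Control.Monad (unless)
import System.Exit (exitFailure)

-- Hypothetical definition, standing in for one from a file style snippet.
double :: Int -> Int
double x = x * 2

-- | Each case holds the GHCi input as text (for the report), the value it
-- actually produces, and the output the lesson claims.
cases :: [(String, String, String)]
cases =
  [ ("double 21", show (double 21), "42")
  , ("map double [1, 2, 3]", show (map double [1, 2, 3]), "[2,4,6]")
  ]

-- | Describe every case whose actual output differs from the claimed one.
failures :: [(String, String, String)] -> [String]
failures cs =
  [ expr ++ ": expected " ++ want ++ ", got " ++ got
  | (expr, got, want) <- cs
  , got /= want
  ]

main :: IO ()
main = do
  let failed = failures cases
  mapM_ putStrLn failed
  unless (null failed) exitFailure
```

Note that the expression appears twice in each case, once as text and once as code, which is exactly the duplication described above.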

DrewFenwick commented 3 years ago

An alternative option for file style snippets: We could write all Haskell example code in a separate plain Haskell file, then use pandoc-include-code to insert sections of the file into our documents.

Pros:

All standard Haskell tooling will work for writing, formatting and testing non-GHCi snippets.
Testing non-GHCi snippets will be easier.

Cons:

Writers will have to hop between prose and source files to define one lesson.
The document build process will become more complicated.
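A source file for this approach might look like the sketch below. It assumes the named-snippet convention described in the pandoc-include-code README, where a region is delimited by comments of the form `start snippet NAME` and `end snippet NAME`; the snippet name `double` and the definition itself are hypothetical. The lesson's markdown would then reference the region from a code block with attributes along the lines of `{.haskell include=Lesson.hs snippet=double}`, which should be checked against the filter's documentation before relying on it.

```haskell
module Main (main) where

-- start snippet double
double :: Int -> Int
double x = x * 2
-- end snippet double

-- Anything outside the markers stays out of the lesson, so the file can
-- carry its own entry point or tests.
main :: IO ()
main = print (double 21)
```

Because this is an ordinary Haskell file, it can be compiled, formatted and tested with the usual tooling, which is the main advantage listed above.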

A similar approach could be used to extract GHCi input snippets to a separate file, but the output would have to be written elsewhere, and I'm not sure whether the input and corresponding output could be placed in the same code block.

Kleidukos commented 3 years ago

I suggest that for the moment we try and implement the solution with pandoc-markdown-ghci-filter. Can you whip up a prototype @DrewFenwick ?

DrewFenwick commented 3 years ago

@Kleidukos I'll see what I can do.

haskellfoundation / HaskellSchool