Make transcripts idempotent

sellout commented 1 month ago

Overview

This contains a lot of improvements to transcripts. The tl;dr is that now transcript outputs can be used as transcripts, so bug reports can contain just the output (and the output can be edited down to just the relevant bits), rather than including or attaching separate transcript and output.

Some more detail

all serialization is now down via CMark (previously, all api and ucm blocks in the output were still being serialized directly, because of the interleaving of the input text and command/request output)
- previously some ucm blocks needed more backticks, but still only had three because of this, so that’s now fixed
transcript parser can now parse :added-by-ucm
transcript parser can now parse ucm and api outputs (api outputs are now indented similarly to ucm outputs to make this easier)
unified handling of :hide(:all), :error, and :added-by-ucm – previously each block time handled them with different semantics, including that for ucm, :hide and :hide:all had the same behavior[^1]
info tags (e.g., :hide) are now preserved in the output
failing transcripts (but not yet parse errors) result in valid Markdown (i.e., the error is wrapped in a fenced block to preserve its formatting)
updates the bug_report template to encourage putting the transcript directly into the issue, rather than fencing it in a ``` markdown block

[^1]: You’ll see in the diffs a bunch of “new” ``` ucm :hide blocks. These were there before, but not copied to the output because of the bug where :hide and :hide:all were treated the same for ucm.

Caveats:

failing transcripts are not yet idempotent (but :error is)
:hide:all is (intentionally) not idempotent

Fixes #5199.

Implementation notes

This changes a bunch of stuff in the transcript implementation. It tries to make some things clearer (e.g., breaking awaitInput into smaller functions, and removing the recursion from it), but there is still work to do there.

Interesting/controversial decisions

This PR currently makes most of the transcript tests into “idempotent tests” – where the output is written back to the input file, and there should be no changes when re-run. It makes a nice impact with 11 kloc removed, but I don’t know that we actually want to do this.

One reason not to is that the transcripts are much noisier, since all the output is included. But maybe this should be handled by using :hide more liberally. I added a commit at the end that does that for a few transcripts.

Test coverage

Every transcript is touched by this PR, I’m pretty sure.

Loose ends

There is related work here that doesn’t need to be in this PR

more simplification of the transcript runner
dealing with parse errors better
probably some other things I’m forgetting

sellout commented 1 month ago

Hrmm … the tests that are failing … they’re easy enough to fix (just need to update the output expectations), but it seems like there’s not an easy way to run these tests. I.e., they’re not part of ./scripts/checks.sh or anything else. Is that right? If so, we should probably add scripts for those jobs.

sellout commented 1 week ago

Can you suggest a way we can be confident that all of the transcripts made it into idempotent/ and not accidentally deleted like I'd apparently done in the past?

So I tried to split up the commits to make this more obvious, but I don’t think it’s foolproof:

https://github.com/unisonweb/unison/pull/5406/commits/cf0f3e25658153d6cafae7a6f513db5849cffe69 removes all of (and only) the outputs for now-idempotent transcripts (and moves the idempotent ones to a new subdir), then
https://github.com/unisonweb/unison/pull/5406/commits/73b2a9c8574aa1b4007093c471bb746ba36b31c6 runs everything, producing the “idempotent” versions of the transcripts.

So, basically, verify that only output files were removed in that first commit. And I think no other commits should remove any transcripts. If they do, they should be spot-checkable (and I apologize for not isolating it better).

aryairani commented 1 week ago

So I tried to split up the commits to make this more obvious, but I don’t think it’s foolproof:

Ah, perfect, thanks, I'll take another look.

aryairani commented 1 week ago

I'm about ready to merge this, but still just trying to wrap my head around and document the new organization. Let me know if I have this right.

It looks like:

basically all transcripts are grouped under either errors/ or idempotent/
there are a handful of transcripts in the root unison-src/transcripts that either we're not trying to convert right now because they use :hide:all or like in the case of fix-5402.md I assume slipped in from trunkafter the main work was already done.
there are still some transcripts under transcripts-using-base/ that we don't mess with yet
there's project-outputs/ which includes some github templates and also any markdown from ./docs/
transcripts-manual/ and transcripts-round-trip/ are done currently with a different workflow, no change there
transcripts-using-base/ we also don't change in this PR, but might want to look at later to give a similar treatment to.

I'm kinda thinking we can eventually make in-place the primary mode for running the transcripts, but we don't have to figure that out now.

And this PR mainly is to let us reduce how many files we've got, and also to reduce friction in submitting transcripts with bug reports.

sellout commented 1 week ago

I'm about ready to merge this, but still just trying to wrap my head around and document the new organization. Let me know if I have this right.

It looks like:

basically all transcripts are grouped under either errors/ or idempotent/

And the errors aren’t currently handled idempotently because parse errors don’t preserve the transcript. Runtime errors do, and so those could be idempotent, but I’d rather just get the parse errors handled better, then make them all that way.

there are a handful of transcripts in the root unison-src/transcripts that either we're not trying to convert right now because they use :hide:all or like in the case of fix-5402.md I assume slipped in from trunkafter the main work was already done.

Correct about :hide:all, but ooof, I thought I caught ones that were added in trunk. I can make sure those are moved.

there are still some transcripts under transcripts-using-base/ that we don't mess with yet

there's project-outputs/ which includes some github templates and also any markdown from ./docs/

Yeah, I didn’t have a great name for this directory, so feel free to suggest a different one, but these are meant to validate anything transcript-y that lives in the repo for some reason other than being a test. I.e., these test that those transcripts don’t change incompatibly rather than testing that the Transcript.hs implementation is correct.

transcripts-using-base/ we also don't change in this PR, but might want to look at later to give a similar treatment to.

Yep – maybe we split each type of transcript tests (“basic” (which currently doesn’t have an explicit directory), “using-base”, and “errors”) into “idempotent” and “non-idempotent” subdirs, or we can dynamically identify whether each transcript uses :hide:all (this could be returned by the runner, since we write the output file all at once after running).

I'm kinda thinking we can eventually make in-place the primary mode for running the transcripts, but we don't have to figure that out now.

And this PR mainly is to let us reduce how many files we've got, and also to reduce friction in submitting transcripts with bug reports.

There is one thing here that I want to call out:

While the transcripts are idempotent (modulo parse errors & :hide:all), the first run can still discard part of a transcript. E.g.,

``` ucm :hide
scratch/main> merge.builtins

This should fail to parse, but pretend it’s a typo rather than this arbitrary text.

scratch/main> add

results in this output
```` markdown
``` ucm :hide
scratch/main> builtins.merge

This should fail to parse, but pretend it’s a typo rather than this arbitrary text.

🛑

The transcript failed due to an error in the stanza above. The error is:

I found a ',' here, but I didn't see a list or tuple that it might be a separator for.

    1 | This should fail to parse, but pretend it’s a typo rather than this arbitrary text.


So, running it in place by default can be annoying when writing a new transcript. What I currently do is stage my transcript, then run it and if anything is lost, just discard the unstaged changes.

In future, transcript runs could be “non-destructive” – i.e., make sure that all of the original stanzas are included in the output, even if there was an error earlier in the run.

aryairani commented 1 week ago

There is one thing here that I want to call out: While the transcripts are idempotent (modulo parse errors & :hide:all), the first run can still discard part of a transcript.

Ooo, okay. A little dangerous. So I guess for now if you put it under idempotent/, you'll run that risk that an unexpected error could clobber the rest of the transcript. So instead you need to put it in transcripts/ or run it manually until you know it works, and then move it to idempotent later. Or stage it like you mentioned.

sellout commented 1 week ago

And this PR mainly is to let us reduce how many files we've got, and also to reduce friction in submitting transcripts with bug reports.

Yeah, I would say “better bug reports” are the main thing here.

What I like to do is run my transcript, paste the output.md into the issue description, then edit the output in the UCM blocks. There can be a lot of output and :hide (while definitely useful) can be a bit coarse. So I could have a bug report like (NB: This is not a real bug)

I can get the help for `alias.type`

``` ucm
scratch/main> help alias.type

  alias.type
  `alias.type Foo Bar` introduces `Bar` with the same definition as `Foo`.

but when I run just help, that command is missing

scratch/main> help

  ...
  alias.many (or copy)
  `alias.many <relative1> [relative2...] <namespace>` creates aliases `relative1`, `relative2`, ...
  in the namespace `namespace`.
  `alias.many foo.foo bar.bar .quux` creates aliases `.quux.foo.foo` and `.quux.bar.bar`.

  alias.term
  `alias.term foo bar` introduces `bar` with the same definition as `foo`.

  api
  `api` provides details about the API.

  auth.login
  Obtain an authentication session with Unison Share.
  `auth.login`authenticates ucm with Unison Share.
  ...


where I’ve trimmed 90+% of the `help` output to just focus on the problem, but without having to also include a separate input transcript, which breaks the flow of the report (and also results in context-free transcripts when that chunk is copy/pasted from the issue on its own).

unisonweb / unison