rethink, clarify, and/or fix semantics of `transcript` vs `transcript.fork`, with `-c` / `-s`

aryairani commented 5 months ago

The ucm help currently shows

  transcript               Execute transcript markdown files
  transcript.fork          Execute transcript markdown files in a sandboxed codebase

but @stew observed that transcript ignores --codebase while transcript.fork doesn't.

Discussing with him, I wasn't even sure what the semantics were meant to be, but the current behavior is unintuitive and the help vague.

sellout commented 2 months ago

I think I discovered the intended semantics in another issue from @ceedubs: #3314.

The description there implies that transcript.fork is meant to run the transcript on a copy of whatever codebase would be used by UCM otherwise (i.e., either one provided by -c or the default), while transcript creates a fresh codebase regardless (and so maybe -c along with transcript should warn or error).

The summaries in ucm help are definitely not clear enough.

aryairani commented 2 months ago

Just exhaustively, we can:

(a) start with a fresh codebase, or (b) an existing one (transcript vs transcript.fork [-c])
(1) modify it in-place, or (2) modify a copy (we don't currently support (1), only (2))
(x) throw the result away, (y) save the result somewhere (default vs -S)

The product of these makes 8 combinations, some of which overlap or don't make sense:

(a1x) - use new codebase, then discard; this is the default
(a1y) - use new codebase, then save; this is also legit
~a2x~ a1 and a2 are the same?
~a2y~ "
~b1x~ doesn't make sense: if you modify in place, then the result is in-place
~b1y~ similar to b1x
(b2x) - start from existing codebase, don't modify original, discard the resulting codebase
(b2y) - start from existing codebase, don't modify original, keep the resulting codebase.

Currently we're using b2y to emulate b1, which is unintuitive.
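To make the enumeration above easy to check, here is a minimal Python sketch that generates the eight combinations and labels each one the way the list does. The names `START`, `MODE`, `RESULT`, `classify`, and `enumerate_combinations` are made up for illustration; nothing here is UCM code.

```python
from itertools import product

# The three axes from the list above.
START = {"a": "fresh codebase", "b": "existing codebase"}
MODE = {"1": "modify in place", "2": "modify a copy"}
RESULT = {"x": "discard", "y": "save"}


def classify(start, mode, result):
    """Return (status, note) for one combination, following the list above."""
    if start == "a" and mode == "2":
        # Copying a brand-new codebase is indistinguishable from using it directly.
        return ("redundant", "same as a1")
    if start == "b" and mode == "1":
        # An in-place run leaves its result in the original, so x/y is moot.
        return ("collapsed", "result is in place; discard/save does not apply")
    return ("distinct", f"{START[start]}, {MODE[mode]}, {RESULT[result]}")


def enumerate_combinations():
    """Build a dict such as {'a1x': ('distinct', ...), ...} for all 8 cases."""
    return {s + m + r: classify(s, m, r) for s, m, r in product(START, MODE, RESULT)}


if __name__ == "__main__":
    for name, (status, note) in enumerate_combinations().items():
        print(f"{name}: {status:9} {note}")
```

Only a1x, a1y, b2x, and b2y come out as distinct, with b1 left as a single in-place case.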

So I think we really want:

transcript, which is ax
transcript --save <dest>, which is ay
transcript --in-place <path>, which is b1; could have an optional flag to specify what to do if <path> doesn't exist.
transcript.fork --src <src>, which is b2x
transcript.fork --src <src> --save <dest>, which is b2y
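As a rough illustration of the five invocations proposed above, the following Python sketch maps each one to what would happen to the codebase. It is only a model of the proposal: `--save`, `--in-place`, and `--src` are the flags suggested in this comment, not existing UCM options, and `Plan` and `plan` are hypothetical names. The sketch also assumes that `--in-place` and `--save` cannot be combined and that `transcript.fork` requires `--src`; the proposal does not settle either point.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    start: str               # "fresh", or the path of the existing codebase
    in_place: bool           # True if the starting codebase itself is modified
    keep_at: Optional[str]   # where the result ends up; None means discarded


def plan(command: str, src: Optional[str] = None, save: Optional[str] = None,
         in_place: Optional[str] = None) -> Plan:
    """Map a proposed invocation to what happens to the codebase."""
    if command == "transcript":
        if src is not None:
            raise ValueError("--src belongs to transcript.fork")
        if in_place is not None:
            # Assumption: --in-place and --save cannot be combined.
            if save is not None:
                raise ValueError("--in-place and --save are mutually exclusive")
            return Plan(start=in_place, in_place=True, keep_at=in_place)  # b1
        return Plan(start="fresh", in_place=False, keep_at=save)  # ax / ay
    if command == "transcript.fork":
        # Assumption: --src is required and --in-place is rejected here.
        if src is None or in_place is not None:
            raise ValueError("transcript.fork takes --src and optionally --save")
        return Plan(start=src, in_place=False, keep_at=save)  # b2x / b2y
    raise ValueError(f"unknown command: {command}")
```

For example, `plan("transcript.fork", src="cb", save="out")` describes b2y: start from `cb`, leave it untouched, and keep the result at `out`.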

sellout commented 2 months ago

Ok, that makes sense.

There are also tradeoffs with how -c/-C are interpreted when transcript* is used. It can be nice to make options orthogonal, so that -c has the same effect regardless of anything else that happens. On the other hand, making --src distinct from -c avoids the possibility of accidentally modifying your codebase with a typo. I think the latter is mitigated by transcript* requiring --in-place to overwrite, and the ability to roll back a code base using the VCS functionality.

So, I have a half-baked alternative proposal that starts by clarifying/modifying -c and -C to mean roughly the same as --src and --dest, but independently of transcript*. I.e.,

-c <src> – as now, use <src> as the codebase instead of the default
-c <src> -C <dest> – use <dest> as the codebase, forking it from <src>
-C <dest> – this is the half-baked part. Currently this means to use a fresh <dest> as the codebase, but it would presumably now mean “use <dest> as the codebase, forking it from the default codebase”, and there would be no way to say “make a fresh empty codebase” … maybe using -c with a path that doesn’t represent a codebase (but it’s probably better to have some explicit way to say “I want a new codebase” to prevent confusing behavior when there’s just a typo).

But, assuming there’s a clear solution there, transcript* can then build on those interpretations rather than adding new --src and --dest options. Then your list of what we want could be

transcript
-C <dest> transcript
-c <path> transcript --in-place (--in-place is now simply a flag)
-c <src> transcript.fork
-c <src> -C <dest> transcript.fork

The difference between transcript and transcript.fork is then “what happens when no explicit -c is passed?” transcript says “make a fresh one” and transcript.fork says “copy the default one”. And I think all the other combinations would result in identical behavior between transcript and transcript.fork. So maybe remove transcript.fork and then figure out how to disambiguate that one case – maybe a new flag, or maybe handle -c <<default>> or something as a special name to indicate the default codebase, so it can be made explicit.

And maybe --in-place is just a consequence of having -C point to the same codebase as -c (and maybe have -C <<src>> be a special name to indicate that it should be the same, which is also useful when the “src” is the default codebase).

Now the default values could be -c <<default>> -C <<src>>, so omitting the flags isn’t a special case (making flag omission equivalent to a particular explicit flag is also very helpful for overriding flags explicitly set earlier, say by an alias or other program/script that calls ucm).
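A small Python sketch of that idea, in which omitting a flag is exactly the same as passing its special name. Everything here is hypothetical: `<<default>>` and `<<src>>` are only the names floated above, `resolve` is a made-up helper, and `/home/user/default-codebase` is a placeholder, not UCM's real default location. Paths are compared as plain strings, with no normalization.

```python
DEFAULT = "<<default>>"  # special name for the default codebase (from the proposal)
SRC = "<<src>>"          # special name meaning "same codebase as the source"


def resolve(c=DEFAULT, C=SRC, default_path="/home/user/default-codebase"):
    """Resolve -c / -C into concrete source and destination codebases.

    The parameter defaults are the special names themselves, so leaving a
    flag out goes through the same code path as spelling it out.
    """
    src = default_path if c == DEFAULT else c
    dest = src if C == SRC else C
    # "In place" is just the case where source and destination coincide.
    return {"src": src, "dest": dest, "in_place": dest == src}
```

With this shape, `resolve()` and `resolve(c="<<default>>", C="<<src>>")` give the same answer, which is what lets a later explicit flag undo one set earlier by an alias or script.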

And just to get crazy, what if -c and -C were just file descriptors, like ucm transcript 4</path/to/codebase 5>/path/to/output. There are probably a lot of issues with that (codebases are directories, not files; what does 5>> mean, etc.)

Anyway, like I said, half-baked, but maybe there’s something useful to extract from this.

unisonweb / unison

rethink, clarify, and/or fix semantics of `transcript` vs `transcript.fork`, with `-c` / `-s` #4809