unisonweb / unison

A friendly programming language from the future
https://unison-lang.org
Other
5.81k stars 271 forks source link

`@typecheck` fenced code block Doc elements randomly swap example code #3485

Open rlmark opened 2 years ago

rlmark commented 2 years ago

This issue with Doc elements was surfaced earlier this week. Code examples in @typecheck fenced codeblock elements are swapped with other code examples from @typecheck block elements, leading to an odd docs reading experience.

The example code chosen in the doc is wrong for docs.to-html and display. The view command displays the source appropriately and the source code is correct, so this appears to be specific to the process of rendering the doc for a viewer/html.

The example chosen is deterministic per command run, so display myDoc will always select the same example to render.

It's most visible on the website (large docs based codebase) with multiple @typecheck fenced codeblock elements. I was unfortunately not able to repro this in the small case with only a few docs. A live example of this issue is located in the website codebase, in the term website.learn.fundamentals.controlFlow._dataTypesPatternMatching

The following commands will let you see the problem in doc rendering:

.> pull unison.public.website.learn website.learn
.website.learn> docs.to-html fundamentals.controlFlow._dataTypesPatternMatching
.website.learn> display fundamentals.controlFlow._dataTypesPatternMatching

The source code reads:

       @typecheck ```
       placeSetting : Lunch -> [Utensil]
       placeSetting = cases
         Soup soupName   -> [Spoon]
         Salad saladName -> [Fork, Knife]
         _               -> [Spoon, Fork, Knife]

However, the code snippet chosen on `display` is:  
@typecheck ```
getByteSize : (Text, Bytes, Text) -> Nat
getByteSize tuple3 =
  "tuple decomposition 👇"
  let
    (first, bytes, last) = tuple3
    size (toUtf8 first ++ bytes ++ toUtf8 last)
```
dolio commented 2 years ago

I did some investigation. Here's what seems to be happening to me.

When you write something like:

@typecheck ```
myDefn : T -> U
myDefn = cases
  ...
```

What it turns into is something like:

docConstr let
  myDefn : T -> U
  myDefn = cases
    ...
  '()

(Possibly there is any Any around the '().) So, the definition you want to see actually turns into an expression that would evaluate to '(), but in a context where your definition happens to be.

Both the definition you want to see and the '() are lambda expressions, so they're going to get floated. The problem is that once we float myDefn, every one of these special expressions becomes identical. It's just '(). When we float that part, we hash the expression, but they're all now identical, so they get identical hashes. That wouldn't be a problem, because myDefn is completely irrelevant to the value of the value of the expression, except the purpose of this is to see information about myDefn.

The reason you see random other definitions is because there is effort made to decompile to the original source code. So, decompile info is generated for all of these expressions that actually contains myDefn and so on. Then they all hash to identical hashes for '(), and only one can be remembered, so you get one at random depending on the order things were compiled in. I think you would always get just '(), except the decompile info picks the first one that was added, rather than the last one.

So, my suggestion is that this is not a reasonable way to represent this document. It depends on the compiler/runtime not being allowed to perform meaning-preserving transformations on expressions. After evaluation, we are left with something like SomeDoc (_ -> ()), and the decompiler is expected to reproduce distinct source expressions depending on where the _ -> () came from. But to do that, there would need to be a distinct copy of the constant function for each source location, and they could never be coalesced.

Instead, myDefn should be relevant to the actual value. So, something more like:

docConstr let
  myDefn : T -> U
  myDefn = cases ...

  Any ('myDefn)
pchiusano commented 2 years ago

@dolio makes sense, thanks for looking into this.

@rlmark a workaround for now, either use the @source element or make sure that @typecheck blocks have a final expression that uses any definitions introduced, for instance:

@typecheck ```
id : a -> a
id x = x

id 42 -- if we don't have this here, we get this bug


> Instead, myDefn should be relevant to the actual value. So, something more like:

This seems possible to automate as a medium term fix. We just need some expression that can make use of the bindings but that the renderer knows to ignore. Then the parser can produce such an expression at the end of a `@typecheck` instead of the `()`.