unisonweb / unison

A friendly programming language from the future
https://unison-lang.org

in docs, single asterisks (italics) are parsed as double asterisks (bold) #5255

Open mitchellwrosen opened 3 months ago

mitchellwrosen commented 3 months ago

This transcript saves a {{ *hello* }} doc, then views it, and it looks like {{ **hello** }}.

scratch/main> builtins.mergeio

  Done.
foo = {{ *hello* }}

  Loading changes detected in scratch.u.

  I found and typechecked these definitions in scratch.u. If you
  do an `add` or `update`, here's how your codebase would
  change:

    ⍟ These new definitions are ok to `add`:

      foo : Doc2
scratch/main> add

  ⍟ I've added these definitions:

    foo : Doc2

scratch/main> view foo

  foo : Doc2
  foo = {{ **hello** }}

mitchellwrosen commented 3 months ago

Ah ha, I've peeked at the implementation of the lexer.

It seems we treat any number of (matching) asterisks as bold, underscores as italic, and tildes as strikethrough.

Examples:

*bold*
**bold**
***bold***
_italic_
__italic__
___italic___
~strikethrough~
~~strikethrough~~
~~~strikethrough~~~

That seems fine to me? Perhaps the fix is to just update the website: https://www.unison-lang.org/docs/usage-topics/documentation/#:~:text=Documentation%20blocks%20start%20with%20%7B%7B,an%20expression%20can%20be%20written.
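
For readers who want the rule spelled out, here is a minimal Python sketch of the behavior described above: the delimiter character alone picks the style, and the length of the run is ignored as long as the opening and closing runs match. This is not the actual Haskell lexer; `classify` and `STYLE_BY_DELIMITER` are hypothetical names made up for illustration.

```python
import re

# Delimiter character -> style, as described in the comment above.
# (Hypothetical table; the real lexer is written in Haskell.)
STYLE_BY_DELIMITER = {"*": "bold", "_": "italic", "~": "strikethrough"}


def classify(span):
    """Return (style, inner_text) for a delimited span, or None.

    A span is a run of one or more identical delimiter characters,
    some text, and then the same run again. The run length does not
    affect the resulting style.
    """
    match = re.fullmatch(r"(([*_~])\2*)(.+?)\1", span, flags=re.DOTALL)
    if match is None:
        return None
    _run, char, inner = match.groups()
    return STYLE_BY_DELIMITER[char], inner
```

Under this rule `classify("*hello*")` and `classify("**hello**")` both come back as bold, which is the round-trip seen in the transcript at the top of the issue.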

mitchellwrosen commented 3 months ago

The pretty-printer should also probably prefer the single-character *bold*, _italic_, ~strikethrough~.

It currently renders as **bold** which implies single asterisks are reserved for something else.
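
A minimal Python sketch of that suggestion, assuming the current "delimiter character picks the style" rule stays in place: the printer always emits the shortest delimiter. `render` and `DELIMITER_BY_STYLE` are hypothetical names, not part of the real pretty-printer.

```python
# Style -> the single delimiter character the printer would emit.
# (Hypothetical table for illustration only.)
DELIMITER_BY_STYLE = {"bold": "*", "italic": "_", "strikethrough": "~"}


def render(style, text):
    """Wrap text in the single-character delimiter for the given style."""
    delimiter = DELIMITER_BY_STYLE[style]
    return f"{delimiter}{text}{delimiter}"
```

With this, a bold `hello` would print as `*hello*` rather than `**hello**`.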

sellout commented 1 month ago

Yeah, this confused me at first as well (especially since the docs don’t match the behavior). I prefer Unison’s implementation of this, but also understand that users may be expecting Markdown-compatible syntax, since much of the rest is compatible.

So yes, this is one of two bugs: either the docs and pretty-printer need fixing, or the implementation needs to change. I don’t know which would be better.

aryairani commented 1 month ago

I think it'd be better to move towards Markdown-compatible syntax, but what is Markdown?

Here's Discord's version:

https://support.discord.com/hc/en-us/articles/210298617-Markdown-Text-101-Chat-Formatting-Bold-Italic-Underline#h_01GY0DAWJX3CS91G9F1PPJAVBX

Here's Github's version, which seems insanely complicated but maybe it's necessary: https://github.github.com/gfm/#emphasis-and-strong-emphasis

sellout commented 1 month ago

GitHub is CommonMark + extensions, and I think leaning toward CommonMark is the way to go.

  1. it’s the closest thing to a standard; and
  2. it’s what we’re using for transcripts (and GH compatibility is useful there), and settling on one dialect is good, IMO.

We’ve talked about this at least in passing before, but it could be possible to use an existing CommonMark impl for Doc2 the way we do for transcripts, since they generally have some extension mechanism for things like transcludes. However, it’d be a bigger project since the Unison side of things would also need to change to match.

And bringing it back to this issue in particular … there are a few things to keep in sync when making changes to Doc2, since it crosses the Haskell/Unison boundary in some unique and delicate ways. I’ve been meaning to enumerate exactly what needs to change in concert there …

aryairani commented 1 month ago

CommonMark sounds good. So then the website is correct, but it should also document bold+italics as triple asterisks or triple underscores, and the parser should be updated to match.
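
To make the proposed mapping concrete, here is a rough Python sketch in which the run length, not the delimiter character, decides the style: one is italic, two is bold, three is bold plus italic, for both asterisks and underscores. It is a deliberate simplification: it ignores CommonMark's flanking and intraword rules and nested emphasis, and it leaves tildes out because strikethrough is a GFM extension rather than core CommonMark. `commonmark_style` and `STYLES_BY_RUN_LENGTH` are hypothetical names.

```python
import re

# Run length -> styles, following the CommonMark-style mapping discussed above.
STYLES_BY_RUN_LENGTH = {
    1: ("italic",),
    2: ("bold",),
    3: ("bold", "italic"),
}


def commonmark_style(span):
    """Return (styles, inner_text) for a simple emphasis span, or None.

    Simplified: handles only a single span whose opening and closing
    runs of '*' or '_' are identical and one to three characters long.
    """
    match = re.fullmatch(r"(\*{1,3}|_{1,3})(.+?)\1", span, flags=re.DOTALL)
    if match is None:
        return None
    run, inner = match.groups()
    return STYLES_BY_RUN_LENGTH[len(run)], inner
```

Under this mapping the doc from the original transcript, `*hello*`, would stay italic instead of becoming bold.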