unisonweb / unison

A friendly programming language from the future
https://unison-lang.org

RFC: commenting and documenting code #462

Closed pchiusano closed 4 years ago

pchiusano commented 5 years ago

See https://github.com/unisonweb/unison/blob/master/docs/comments-and-docs.markdown which is a bit rough still.

Feel free to leave any comments here. Also be sure to view the raw source, which has some embedded comments.

ChrisPenner commented 5 years ago

Some more questions here:

Would comments on isomorphic definitions be merged? E.g. If I write:

-- Either is a sum type which....
data Either b a = Left b | Right a

-- Validation represents the result of...
data Validation e a = Error e | Success a

These two definitions would have the same hash, but obviously shouldn't share documentation. Seems like this is a tricky problem to solve.
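To make the collision concrete, here is a minimal Python sketch. It is not Unison's actual hashing: `structural_hash` and its tuple encoding are hypothetical stand-ins for any hash that ignores type, parameter, and constructor names.

```python
import hashlib

def structural_hash(type_params, constructors):
    """Hash a data declaration by shape only (hypothetical scheme).

    `constructors` is a list of (constructor_name, [field_type_params]).
    Type parameters are replaced by their position and constructor
    names are dropped, so only the structure contributes to the hash.
    """
    index = {p: i for i, p in enumerate(type_params)}
    shape = [tuple(index[f] for f in fields) for _name, fields in constructors]
    encoded = repr((len(type_params), shape)).encode()
    return hashlib.sha256(encoded).hexdigest()[:8]

either = structural_hash(["b", "a"], [("Left", ["b"]), ("Right", ["a"])])
validation = structural_hash(["e", "a"], [("Error", ["e"]), ("Success", ["a"])])

# Docs keyed by hash alone: the second write replaces the first.
docs_by_hash = {}
docs_by_hash[either] = "Either is a sum type which..."
docs_by_hash[validation] = "Validation represents the result of..."

# Docs keyed by (name, hash): both comments survive.
docs_by_name_and_hash = {
    ("Either", either): "Either is a sum type which...",
    ("Validation", validation): "Validation represents the result of...",
}
```

Keying on the name as well as the hash is only one possible way out; it is shown here to illustrate the problem, not as a proposal from the RFC.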

Should we have more than one syntax for comments? Since (most?) people use editor tooling to 'comment' things nowadays, maybe it makes sense to JUST have block-style comments and simplify the language?

I like the idea of 'tagged' comments using something like {blah| ... |} where blah represents something interesting; perhaps this could be used to add additional metadata other than comments to definitions in the future. It might also be handy for something like {example| ... |} where the contents would be run through some sort of doc-test style thing.

I think linking between docstrings should probably be explicit (using @ or something); otherwise someone adds a definition named `and` to the codebase and suddenly your docs are a total mess of hyperlinked garbage. Explicit linking is also much faster and easier to implement.
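A rough Python sketch of the difference, assuming a docstring is plain text and the codebase is just a set of known names (`implicit_links` and `explicit_links` are hypothetical helpers, not anything in Unison):

```python
import re

def implicit_links(doc, known_names):
    """Link every word that happens to match a known definition."""
    return [w for w in re.findall(r"[A-Za-z_]\w*", doc) if w in known_names]

def explicit_links(doc, known_names):
    """Link only words the author marked with a leading @."""
    return [w for w in re.findall(r"@([A-Za-z_]\w*)", doc) if w in known_names]

# Someone has added a definition called `and` to the codebase.
known = {"foo", "multiply", "and"}
doc = "@foo adds x and y. See also @multiply"

implicit = implicit_links(doc, known)  # also picks up the ordinary word "and"
explicit = explicit_links(doc, known)  # only the marked names
```

With implicit linking the prose word "and" becomes a hyperlink as soon as a matching definition exists; with the explicit marker it does not.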

I think it makes sense to keep comments relatively simple for now; most of the documentation features can be built on top of the 'plain-text' using markdown parsers etc.

{|
---
language: en
---
@foo is a function which adds @x to @y. 

- @x: The first number to add
- @y: The second number to add

See also @multiply

- [ ] TODO: Handle subtraction

```example
foo 1 2
> 3
```
|}
foo x y = x + y

I think it's pretty readable, and there's a lot of prior-art around markdown so it's pretty easy for others to extend.

aryairani commented 5 years ago

@ChrisPenner Independent comments on the same definition could be merged (optionally prefixed with the comment-author's initials/github username/etc) or selected among when they are displayed.
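As a sketch of the merge option only (the data layout, the author tags, and the `#abc`-style hashes below are made up for illustration), in Python:

```python
from collections import defaultdict

def merge_comments(sources):
    """Merge independent comments on the same definition.

    `sources` is an iterable of (author, {definition_hash: comment}).
    Returns {definition_hash: ["author: comment", ...]} in input order,
    so a renderer can show them all or let the reader pick one.
    """
    merged = defaultdict(list)
    for author, comments in sources:
        for definition_hash, text in comments.items():
            merged[definition_hash].append(f"{author}: {text}")
    return dict(merged)

sources = [
    ("cp", {"#abc": "Either is a sum type which..."}),
    ("ar", {"#abc": "Validation represents the result of...",
            "#def": "Adds two numbers."}),
]
merged = merge_comments(sources)
```

Here both comments on `#abc` are kept side by side with their author prefix instead of one overwriting the other.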

atacratic commented 5 years ago

Seems really good to me!

I guess more thought is needed around metadata.

Also around the multi-user experience: what about reconciling different changes to comments, adopting other people's changes (and whether that replaces what you had before), and pulling in the same hash from different sources where each source has different comments?

I guess you can refer to an uncommented definition unambiguously, by hash. But can you refer to a definition, including a set of comments, unambiguously, with some identifier/URL? Maybe only with respect to a given branch? (And if not, is that a problem?)

I think being opinionated about what goes into API documentation would be a good thing. Setting a specific and high standard and enforcing it will improve everyone's experience of working with Unison code. I think you should work out now/soon what standard API documentation will look like, with a couple of worked examples.

Is each fragment of information actually its own tagged piece of API documentation? (So the 'usage' goes in its own API comment, and the 'example', and the 'purpose', or whatever. Concatenated together again when rendered.)

Wherever docs have code (in Markdown between fences or backticks), Unison should parse that code, resolve names, and substitute hashes for names.

How about typechecking it too? Would that make sense?
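A small Python sketch of the name-to-hash substitution, under heavy assumptions: code spans are split on whitespace rather than parsed by a real Unison parser, and the `#a1b2` hash and `map_spans` helper are invented for the example.

```python
import re

SPAN = re.compile(r"`([^`]+)`")

def map_spans(doc, table):
    """Rewrite each token inside backticks using `table`, leaving prose alone."""
    def rewrite(match):
        tokens = match.group(1).split()
        return "`" + " ".join(table.get(t, t) for t in tokens) + "`"
    return SPAN.sub(rewrite, doc)

# When the doc is saved: names are resolved and replaced by hashes.
names_to_hashes = {"foo": "#a1b2"}
stored = map_spans("Usage: `foo x y` adds `x` to `y`.", names_to_hashes)

# Later, after `foo` has been renamed to `plus`: hashes are rendered
# back using whatever name the definition currently has.
hashes_to_names = {"#a1b2": "plus"}
rendered = map_spans(stored, hashes_to_names)
```

Storing hashes means the doc follows a rename automatically, which is the same property Unison code itself gets from being content-addressed.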

Will it be possible to put a picture in one of these docs? The markdown could have a link in it I guess, but what are the options for hosting the image? Would it be crazy to have 'assets' like that in the codebase? Not expecting that to be a day 1 feature, but I'd be interested in your view as to whether it's desirable-in-theory. I think we should be able to say "one day it will be possible to put images in unison tutorials/documentation".

And similarly what about widgets that can be rendered somehow, by running Unison code? Do we see that as possibly desirable and in scope of the Unison documentation facility one day? (I'm thinking iPython and 'explorable explanations' type stuff.)

I'd actually vote against the @ in `@foo x y` (you've already got the backticks to stop indiscriminate parsing of prose as unison names). [edit: I note that the previous sentence has the effect of tagging a person on github into the thread...!]

Do people have to put newlines in their comments every 80 characters? Or are they rendered with 'word wrap'? I'm guessing+hoping yes - word wrap. (How does it look afterwards if people do manually line-break their prose with newlines? They might well do that since they'll probably be using a code editor.)

Will you be able to comment any node in a term's syntax tree, or only statements within a block? Can someone comment an individual parameter of a type signature, say?

aryairani commented 5 years ago

Thanks @atacratic — We were thinking the comments would go alongside the code definitions, rather than be part of it. Selecting a particular comment (or comment set, not sure how we'll do this) would imply a particular definition.

> How about typechecking it too? Would that make sense?

Yes, especially for anything that has to execute at build time, like examples, or doctests.

I can certainly imagine docsets at least referencing images, especially if we move to a markdown format in the future. As to whether the codebase should include the images for the docset, I'd lean no, but would be open to arguments for it. One alternative is the GitHub model, where uploading images into your markdown seamlessly sends them to a CDN and sticks an https link into your markdown.

OTOH, I do expect to see an ipython-like interface at some point, with rendered widgets and output graphs and so forth, so is an image really any different? Do all big datasets go in your unison codebase? What impact does that have on code sharing times? Anyway, maybe it's just a question of "when".

A reason I like the @ in @foo x y is that we're asking for @foo to be parsed as an existing definition, and not trying to parse x or y as existing definitions. Otherwise how do we tell?

Yes to word wrap. How does it look afterwards if they do manually wrap? Bad, unless we unwrap it. We do have code for it. This makes me think we should only have block comments, and use ``` fences or something to denote preformatted text, quotes, poems.
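For illustration, unwrapping could look roughly like the Python sketch below. This is not the existing code mentioned above, just an assumed behavior: consecutive non-blank lines are joined into one paragraph, and anything between fences is left exactly as written.

```python
FENCE = "`" * 3  # the three-backtick fence marker

def unwrap(doc):
    """Join manually wrapped lines into paragraphs; keep fenced blocks verbatim."""
    out, paragraph, in_fence = [], [], False

    def flush():
        # Emit the paragraph collected so far as a single line.
        if paragraph:
            out.append(" ".join(paragraph))
            paragraph.clear()

    for line in doc.split("\n"):
        if line.strip().startswith(FENCE):
            flush()
            in_fence = not in_fence
            out.append(line)
        elif in_fence:
            out.append(line)          # preformatted: do not touch
        elif line.strip() == "":
            flush()
            out.append("")            # blank line ends a paragraph
        else:
            paragraph.append(line.strip())
    flush()
    return "\n".join(out)

wrapped = "Yes to\nword wrap.\n\n" + FENCE + "\nroses are red\nviolets are blue\n" + FENCE
unwrapped = unwrap(wrapped)
```

The prose lines collapse into "Yes to word wrap." while the two-line poem inside the fence keeps its line breaks.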

We were thinking you'd be able to comment any node in the AST, including individual parameters of a type signature.

Still needs work though.

atacratic commented 5 years ago

Sounds good @aryairani.

I see your point about the @foo x y syntax. I guess you could do it the other way round, and say that it's the free variables that get the @ so like foo @x @y. Both seem a bit annoying to me - that visual noise of the @ that makes it not look like Unison code any more. All we're basically trying to do is convey some names to treat as free (I think?)

Maybe you could do it with something like

{| `x y ->`
Usage: `foo x y` adds `x` to `y`.
|}

so, naming the free variables as if the comment is surrounded by a lambda. Maybe also janky but at least it avoids the syntactic noise in the actual comment.

(This is one of those questions that would evaporate if we had a semantic editor! Which is maybe a clue not to worry about it too much.)

A fourth option would be to say that any name that is not found in the ambient namespace is treated as free. That's unsafe if there is a typo, but on the other hand the damage isn't too great - you just miss out a hyperlink. I'd be inclined to think further along those lines and try and come up with something.
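Roughly, in Python (a sketch only: `classify_names` is hypothetical, and tokens are split on whitespace instead of being parsed as Unison):

```python
import re

def classify_names(doc, namespace):
    """Split the tokens inside backticks into linked names and free names.

    A token is linked if it exists in the ambient namespace; anything
    else is silently treated as a free variable.
    """
    linked, free = [], []
    for span in re.findall(r"`([^`]+)`", doc):
        for token in span.split():
            (linked if token in namespace else free).append(token)
    return linked, free

namespace = {"foo"}
ok = classify_names("Usage: `foo x y` adds `x` to `y`.", namespace)
typo = classify_names("Usage: `fooo x y` adds `x` to `y`.", namespace)
```

In the typo case nothing fails: `fooo` just lands in the free list and the doc loses one hyperlink, which is the limited damage described above.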

(Are these embedded Unison expressions themselves going to admit comments?)

aryairani commented 5 years ago

Right — we're trying to indicate either which names to treat as free or which names to treat as bound.

With respect to the noise of the @, it wouldn't have to appear if the comments were rendered in HTML or the like; just in a textual representation.

We could also do:

{foo bar| -- indicating symbols we want replaced with references
Usage `foo x y` adds `x` to `y` and is much more efficient than `bar x y`.
|}

Your fourth option seems reasonable, my only worry is accidentally capturing an x from the ambient namespace, but maybe that isn't a big deal.
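A Python sketch of how the header form sidesteps that capture, assuming the `{names| body |}` shape from the example above (the parser, the `resolve_declared` helper, and the `#...` hashes are all hypothetical):

```python
import re

DOC = re.compile(r"\{([^|]*)\|(.*)\|\}", re.DOTALL)
SPAN = re.compile(r"`([^`]+)`")

def resolve_declared(comment, namespace):
    """Replace only the names listed in the header with their hashes."""
    match = DOC.fullmatch(comment.strip())
    if match is None:
        raise ValueError("expected a comment of the form {names| body |}")
    declared = set(match.group(1).split())
    body = match.group(2).strip()

    def rewrite(span):
        tokens = span.group(1).split()
        # A declared name must resolve; a missing one raises KeyError.
        return "`" + " ".join(namespace[t] if t in declared else t for t in tokens) + "`"

    return SPAN.sub(rewrite, body)

# The ambient namespace happens to contain an `x`.
namespace = {"foo": "#a1b2", "bar": "#c3d4", "x": "#eeee"}
comment = (
    "{foo bar|\n"
    "Usage `foo x y` adds `x` to `y` and is much more efficient than `bar x y`.\n"
    "|}"
)
resolved = resolve_declared(comment, namespace)
```

Because `x` is not listed in the header it stays a plain variable even though the namespace has an `x`, whereas the ambient-namespace rule would have linked it.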

Fortunately, as long as the structure of the comments is one that we like, we can revise how they are parsed and printed without doing too much damage. It seems like if we're going to do any linking between definitions at all, then we can't store it as simple text — we want the text to embed references, etc. Or maybe we can still do that with a plain text representation, by looking for #hashes in backticks.

aryairani commented 4 years ago

@seagreen recommended looking at https://docs.racket-lang.org/scribble/ for inspiration.

aryairani commented 4 years ago

@pchiusano Close?

pchiusano commented 4 years ago

Yea