Declarator Docs should be limited in scope

lizmat commented 2 months ago

See current definition and the original thoughts.

This issue is written out of frustration of trying to implement a more sensible way of dealing with declarator docs in RakuDoc v2, and trying to implement a sensible "safe" renderer.

Regardless of the implementation effort that has gone into this, and what still needs to go into it, I wonder how many developers really like this feature, and what they would think if this feature would be removed by default in 6.e.

lizmat commented 2 months ago

One reason for the existence of declarator docs, is apparently the perceived usefulness of being able to provide external documentation for code with the code itself.

In my opinion, internal and external documentation are two very different beasts. And declarator docs make it way too easy to mix the two. Thereby either creating documentation with limited usefulness for the maintainer, or for the user. Or in the worst case for both.

lizmat commented 2 months ago

Another reason for the existence of declarator docs, is that with a refactoring, it would be easier to update the external documentation as well because it is closer to the code.

In my opinion, a refactor would probably also require explaining the before and after situation in the user documentation. And this would just add clutter to the code for a maintainer.

lizmat commented 2 months ago

The third reason for the existence of declarator docs, is that it would make it easier for the developer to maintain the internal and external documentation.

In my experience, that is NOT true. Personally, I have a mindset for writing user documentation. And one for development. They rarely are active for me at the same time. Sometimes I will write user documentation first, as a sort of design of the features.

And sometimes the development starts first, iteratively, without a clear view of the final stage. Only once it gets to a beta stage would user documentation be written for the first time. With a different mindset from developing, attempting to look at the code from a user point of view.

lizmat commented 2 months ago

thoughts? @finanalyst @thoughtstream @niner @ab5tract

raiph commented 2 months ago

Hi @lizmat,

My heart emoji on your opening comment was for empathy with your frustrations and gratitude for all that you've done on RakuAST related design and implementation.

In this comment I'll react to what you said in follow up comments.

I'll use the word "decks" as a shorthand for declarator blocks.

internal and external documentation are two very different beasts. And declarator docs make it way too easy to mix the two. Thereby either creating documentation with limited usefulness for the maintainer, or for the user. Or in the worst case for both.

Two key use cases come to mind. It could be instructive to consider alternatives to these two:

Killer feature for CLIs.
Useful for IDE tooltips.
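To make the CLI case concrete: Rakudo builds the usage message for sub MAIN from the decks attached to it and to its parameters. A minimal sketch follows; the script, its parameter names and the doc texts are all made up for illustration.

```raku
#| Resize an image to a given width
sub MAIN(
    Str :$file  = 'input.png',  #= path of the image to resize
    Int :$width = 800,          #= target width in pixels
) {
    # Placeholder body: a real script would do the resizing here.
    say "Would resize $file to a width of $width pixels";
}
```

Running the script with --help should print a usage message that includes the three doc strings, without any separate help text having to be maintained.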

Other PLs have alternatives. Does it make sense to relinquish decks for both the above cases because there are alternatives that are adequate, or perhaps even as good as, or perhaps even better, than (Raku) decks? Anyone know?

a refactor would probably also require explaining the before and after situation in the user documentation.

Off the top of my head I can imagine (eventually) creating nice three way diff tooling that extracts decks before, after, and reason for change, from commits and their messages.

I have a mindset for writing user documentation. And one for development. They rarely are active for me at the same time.

I'm half with you on that.

But I've always presumed decks were part of user doc mindset. Typically written in a fairly throwaway mindset, at least initially, when prototyping or throwing together CLIs, and written in a relatively thoughtful mindset, added rather late to the party, for most other code.

There's nothing to stop a dev from writing non deck comments (so using just #, not #| or #=) and there being tooling that supports someone writing user doc extracting such non deck comments that are in the same syntactic positions as decks and making them available for cut/paste to the user doc writer. Downsides would include the usual problem of things getting out of sync if the tool wasn't used diligently, and that there would sometimes be false positives or negatives even if it was. An upside would be eliminating the problems decks appear to be creating.

Talking of deck problems, and indeed docks in general ("docks" is my alias for RakuDoc in general), another one perhaps worth mentioning is the need to be able to undock (compile a module so that it carries just a stub/link instead of its dock load inline inside the bytecode) and, conversely, redock (recompile undocked compiled code to a docked equivalent that inserts the downloaded or locally stored but external dock source).

Just some food for thought, mostly focusing on possible pros of decks rather than their cons.

lizmat commented 2 months ago

undock (compile a module so that it carries just a stub/link instead of its dock load inline inside the bytecode)

Pretty sure a pragma could be devised for that.

conversely, redock (recompile undocked compiled code to a docked equivalent that inserts the downloaded or locally stored but external dock source)

What would be the purpose of that?

I think that generally the problem is really that variables like $=pod and $=rakudoc are only available from the program itself, and cannot be read externally?

raiph commented 2 months ago

Pretty sure a pragma could be devised for [undock]

Perhaps? But I don't think so. Or perhaps I should say I don't understand why there would be one, and it's clearly (to me) not something to start off with. I suspect there's a misunderstanding.

My description of what I meant was brief, and perhaps inadequate. Consider this code:

=SYNOPSIS
blah blah
=end SYNOPSIS

say $=dock # STDERR: foo's documentation was not included in its compilation...

What I mean by "undock" is compiling foo so that Rakudoc is entirely ignored.

My presumption is that the first step, if what I'm describing were to ever become a thing, would be creation of a Rakudo plugin that makes the AST generation stage (or, perhaps more likely, some stage after that but before bytecode generation) strip out Rakudoc.

My presumption in the imagined scenario in the code above was that the code was compiled (with raku) with that Rakudo plugin, and the plugin arranged for the $=dock (or $=rakudoc or whatever) variable to be (re)initialized to some "stub" value that caused a warning to be displayed. (Due to the stub value being treated as if it were an ordinary value.)

If undocking were ever worth implementing, it would be for code whose compiled form would not include its Rakudoc. The key use case would be when uploading code to a repo. I'm imagining that almost all Raku code is uploaded undocked (as part of a package that either does or doesn't include the undocked doc as one or more separate files).

What would be the purpose of [redock]?

It would be the inverse of undock. If, for example, someone runs some Raku code, and they got a "documentation was not included..." warning, then they'd redock so the warning went away and the documentation appeared as it would if the code hadn't been undocked in the first place. This redocking would involve either recompiling code or changing the behavior of the $=dock stub value to lazily load the doc, either from a local source or by downloading and installing the doc from a repo out in the cloud.

I think that generally the problem is really that variables like $=pod and $=rakudoc are only available from the program itself, and cannot be read externally?

That's a different problem.

Iirc someone already attempted to address the problem you describe. So perhaps (though I doubt it) someone has done most of the work for an undock/redock, or, conversely, a half decent basic undock/redock solution would inevitably be almost all of what would be necessary to also solve the problem you just described.

thoughtstream commented 2 months ago

Way back in the Jurassic, when I designed the original Pod6, I was entirely neutral about the idea of declarator blocks. I was asked to add the feature, but I have never used it myself, nor do I particularly like the idea of intermixing documentation and code (see also Perl Best Practices, Chapter 7).

So I would have no problem if we removed the entire concept of declarator blocks.

However, before we contemplate that step, perhaps it would be useful to know how many people (if any) are actually using the feature – and the related .WHY method – in their code. I tried to search for .WHY, #|, and #= on raku.land, but seemed to get mostly just false positives. If anyone knows a better way to determine how widely used the feature is, that would be a very useful contribution to this discussion.

CIAvash commented 2 months ago

I use them for documentation, and also use tools for generating other formats (Markdown for example) from them. Although the tools are not perfect right now and sometimes need manual editing.

An example from one of my modules:
Raku: https://codeberg.org/CIAvash/APISports-Football/src/branch/main/lib/APISports/Football.rakumod
Markdown: https://codeberg.org/CIAvash/APISports-Football/src/branch/main/README.md

ab5tract commented 2 months ago

@lizmat Can you expand on some of the implementation hurdles you have been facing? That might help to scope the discussion.

For example, if #= outside of signatures is causing trouble, I would wholeheartedly endorse removing that option. On the other hand, if it's not a source of problems then it is probably safe to keep.

Without having any context, I also wonder whether it could be that this particular piece of RakuDoc doesn't belong in the RakuDoc processing code at all and should rather be part of the "regular" grammar, where we can also add caveats such that most (or even all, if there are circularity issues) of the RakuDoc syntax is unavailable in the "decks" (nice phrasing @raiph!).

Tangent: I've always felt like .WHY was an under-utilized magic feature for the REPL. In a perfect world, I would love to see everything in CORE.setting have a reasonable response to this method call. It's also .WHY I don't personally see much value in the "undocking" functionality described by @raiph -- the docs are there for the user's benefit, so taking it out of the code on its way to distribution doesn't appeal to me.

lizmat commented 2 months ago

@CIAvash thanks for your example!

But to me they prove exactly why declarator docs are bad. In the ATTRIBUTES section:

has APISports::Football::HTTPClient $.http_client

An object for making requests to api-football.com

What does the APISports::Football::HTTPClient mean to a user of your module? It isn't documented anywhere else. It's an implementation detail leaking out, and as such cluttering the user documentation. Same for TwoChars, AtMost2Digits, MatchStatus.

Also, why would a user be interested in how the signature of a method is implemented?

method matches(
    Bool :h(:$http_body),
    *%params (Int :$id where { ... }, Str :$live where { ... }, Date(Any) :$date, Int :$league where { ... }, Int :$season where { ... }, Int :$team where { ... }, Int :$last where { ... }, Int :$next where { ... }, Date(Any) :$from, Date(Any) :$to, Str :$round, MatchStatus(Str) :$status, Str :$timezone where { ... })
) returns Mu

To me, this really feels like documentation for a maintainer, not for a user.

CIAvash commented 2 months ago

Also, why would a user be interested in how the signature of a method is implemented?

That's why I mentioned the tools, they have a lot of room for improvement. Tools should extract the important parts of the code.

What does the APISports::Football::HTTPClient mean to a user of your module?

It probably should link to the documentation of the class, if it is usable by the user.

So I think the problem lies in the tools.

ab5tract commented 2 months ago

Also, why would a user be interested in how the signature of a method is implemented?
method matches(
Bool :h(:$http_body),
*%params (Int :$id where { ... }, Str :$live where { ... }, Date(Any) :$date, Int :$league where { ... }, Int :$season where { ... }, Int :$team where { ... }, Int :$last where { ... }, Int :$next where { ... }, Date(Any) :$from, Date(Any) :$to, Str :$round, MatchStatus(Str) :$status, Str :$timezone where { ... })
) returns Mu
To me, this really feels like documentation for a maintainer, not for a user.

It can be a fuzzy line, to be sure. But considering that, post-RakuAST, the where blocks will be fully documented, I think there is significant value to sharing the signature. Also, is the signature appearing in the documentation even related to the declaration syntax?

FWIW, most other from-source-generated documentation I've encountered takes a fair amount of space to display to the user -- for a module's API, this would be a developer -- what arguments a given routine will take.

has APISports::Football::HTTPClient $.http_client What does the APISports::Football::HTTPClient mean to a user of your module?

It's a publicly accessible part of the API -- why would it not be relevant to the user?

lizmat commented 2 months ago

For example, if #= outside of signatures is causing trouble, I would wholeheartedly endorse removing that option. On the other hand, if it's not a source of problems then it is probably safe to keep.

The problem with #= is that it needs to be attached to the last declaration. Now, from a grammar point of view, this can be tricky. Because all comments, including declarator docs, are considered to be whitespace internally. For instance:

class  #= foo
  A    #= bar
{      #= baz
    ...
}      #= zippo

Does the declaration start with class? Or after the name? Or after the opening {? Or after the closing }?

In the Raku grammar, only baz will be attached. foo and bar will generate a warning about not being able to find a declarator. The zippo will be silently ignored.

In the legacy grammar these three would all be silently ignored.

Now clearly this is an artificial example. But when you realize that parameters and blocks can have declarator docs:

sub foo(
  Int $a where { $_ > 1 }  #= foo
) { }

In the Raku grammar, the "foo" is attached to the where { } block, not to the parameter. In the legacy grammar, the "foo" isn't attached to anything.

My point: the "last declarator" rule is not very transparent.

FWIW, the "next declarator" isn't either.

#| foo
sub
#| bar
foo(Int $a) { }

The "foo" is attached to the sub, the "bar" is attached to the parameter (both in legacy grammar and Raku grammar).

Again, this is artificially constructed, but I hope it shows the kind of hoops the grammar needs to jump through to get something because all comments are just whitespace.

And you can argue, this is a case of DIHWIDT, but I doubt whether a developer will check whether the documentation they thought they added, is showing up at the right place, or at all.

In other words: it is all too magic.
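For anyone who does want to check, the declared objects can be asked directly via .WHY. A minimal sketch, with made-up names (frobnicate, $count) and the simplest possible placement of both kinds of deck:

```raku
#| Frobnicate a number
sub frobnicate(
    Int $count,  #= how many times to frobnicate
) { }

# .WHY gives the attached declarator doc, or an undefined value if
# nothing was attached; the fallback strings make a miss visible.
say &frobnicate.WHY // '(nothing attached to the sub)';
say &frobnicate.signature.params[0].WHY // '(nothing attached to $count)';
```

Moving the decks around in this snippet (for instance to the positions in the examples above) and re-running it is a quick way to see what a given grammar actually attaches, and to what.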

lizmat commented 2 months ago

@CIAvash

What does the APISports::Football::HTTPClient mean to a user of your module? It probably should link to the documentation of the class, if it is usable by the user. So I think the problem lies in the tools.

How can the tool determine whether something is supposed to be usable by the user? Most developers don't put a my in front of their class definitions, which means it is a publicly visible class. But that doesn't mean it is supposed to be used by itself? So this would require more discipline in the developer to mark these classes as internal. Only then could a tool decide not to mention the type in that argument.

ab5tract commented 2 months ago

Thank you for clarifying. The current rules do indeed seem way too loosey-goosey for even our project's threshold of implementor-torment :)

I doubt we would see any significant regressions in the ecosystem if we scoped #| to only refer to package and routine definitions and for #= to only apply to parameters. Even package "decks" could likely be tossed out without much impact. Any usage outside of definitions would be ignored, ie:

# this does indeed seem like a step too far to me
#| get all the whys
sub 
#| this form of cruelty will be ignored, worry'd, or sorry'd
why {
    { .WHY } #= why??? ... just use a comment!
        for @_;
}

In JavaDoc, they get to add some module-implementor torment by forcing awkward syntax that goes above the routine definition where the signature parameters need to be individually maintained and their types spelled out by hand. It also visually breaks the flow when reading through code.

But when it comes to IDE tooltip integration or just plain using the generated HTML documentation to actually work with what the code provides, it's immensely helpful.

I think the "decks" are a huge step above this and deliver the same functionality with way less maintenance and visual disruption.

CIAvash commented 2 months ago

How can the tool determine whether something is supposed to be usable by the user?

Probably needs to be done using configs? Maybe one config for the whole document and individual configs if some part of the code needs it.

Rustdoc for example uses some configs for hiding documentation and doing other things. More info on Rustdoc

lizmat commented 2 months ago

re:

#| get all the whys
sub 
#| this form of cruelty will be ignored, worry'd, or sorry'd
why {
    { .WHY } #= why??? ... just use a comment!
        for @_;
}

This "this form of cruelty will be ignored, worry'd, or sorry'd" attaches to the block of the why call. Why you may ask? And knowing the Raku grammar a bit, it will be very hard to fix. Because, as I said: it's whitespace. And apart from the fact that in the grammar this whitespace is traversed multiple times (hence a quite elaborate system of handling that declarator doc only once), during the parsing of Raku code, it is quite unclear where things need to be attached to.

lizmat commented 2 months ago

Tangent: I've always felt like .WHY was an under-utilized magic feature for the REPL. In a perfect world, I would love to see everything in CORE.setting have a reasonable response to this method call.

This is a different issue. In a REPL or an IDE, linking from an object to the appropriate documentation, should be a separate project. Putting the docs as declarator docs in the core, would not be a solution.

ab5tract commented 2 months ago

Tangent: I've always felt like .WHY was an under-utilized magic feature for the REPL. In a perfect world, I would love to see everything in CORE.setting have a reasonable response to this method call.

This is a different issue. In a REPL or an IDE, linking from an object to the appropriate documentation, should be a separate project. Putting the docs as declarator docs in the core, would not be a solution.

I think that is a matter for discussion, rather than a settled fact.

EDIT: But it's literally a tangent. Let's not worry about it here or now.

ab5tract commented 2 months ago

re:
#| get all the whys
sub 
#| this form of cruelty will be ignored, worry'd, or sorry'd
why {
    { .WHY } #= why??? ... just use a comment!
        for @_;
}
This "this form of cruelty will be ignored, worry'd, or sorry'd" attaches to the block of the why call. Why you may ask? And knowing the Raku grammar a bit, it will be very hard to fix. Because, as I said: it's whitespace. And apart from the fact that in the grammar this whitespace is traversed multiple times (hence a quite elaborate system of handling that declarator doc only once), during the parsing of Raku code, it is quite unclear where things need to be attached to.

I'm a bit confused, sorry. My proposals were:

A) process "decks" in the grammar differently than other RakuDoc syntax
B) the implementation refuses to attach "decks" to anything other than routines (or possibly packages)

Maybe A is not possible. But B seems to preclude what you are saying in your response? How would it get attached if the implementation is designed to ignore, complain, or outright die when such an attachment is attempted?

lizmat commented 2 months ago

Getting back to @CIAvash's example:

What I miss in the current RakuDoc, is a simple way to render the signature of a subroutine or a method without any additional comments.

Something like:

sub frobnicate(Int:D $frobnicatee, :$hammer) { ... }
...
=begin rakudoc
=head2 :signature<&frobnicate>
The C<frobnicate> subroutine frobnicates its positional argument, possibly hammering it with the C<:hammer> named argument.
=end rakudoc

that would render to something like (in markdown):

## subroutine "frobnicate"
* required positional argument #1: `Int`
* optional named argument: `hammer`
The `frobnicate` subroutine frobnicates its positional argument, possibly hammering it with the `:hammer` named argument.

lizmat commented 2 months ago

@ab5tract sorry, got confused / distracted.

A) process "decks" in the grammar differently than other RakuDoc syntax

It already does? Because declarator docs can appear at any place in the code where there is whitespace. They can not appear in whitespace inside "docks".

B) the implementation refuses to attach "decks" to anything other than routines (or possibly packages)

That would severely limit usefulness, especially when documenting CLI arguments in scripts.

ab5tract commented 2 months ago

@ab5tract sorry, got confused / distracted.

A) process "decks" in the grammar differently than other RakuDoc syntax

It already does? Because declarator docs can appear at any place in the code where there is whitespace. They can not appear in whitespace inside "docks".

For some reason it seems that you have missed that all of my suggestions are around significantly curtailing the appropriate locations for #| and #=.

What I'm proposing for the grammar (and which I appreciate might not be possible) is to not treat these as whitespace. They would specifically be optional captures for routines (#|) and parameters (#=).

B) the implementation refuses to attach "decks" to anything other than routines (or possibly packages)

That would severely limit usefulness, especially when documenting CLI arguments in scripts.

Please re-read my earlier message. In item B, I was referring specifically to #|. The behavior of #= would be restricted to refer to parameters only (so no blocks, no random variables in random scopes, nothing besides to the right of a parameter declaration inside of a signature).

Also, I don't see how there is any usefulness in any context under your proposal to remove them entirely?

lizmat commented 2 months ago

I doubt we would see any significant regressions in the ecosystem if we scoped #| to only refer to package and routine definitions and for #= to only apply to parameters.

@ab5tract I assume I missed the meaning of "scoped" here?

ab5tract commented 2 months ago

I doubt we would see any significant regressions in the ecosystem if we scoped #| to only refer to package and routine definitions and for #= to only apply to parameters.

@ab5tract I assume I missed the meaning of "scoped" here?

I meant in the sense of narrowing down, in this case meaning "only define the concept/allow the parser to accept in this narrowed conception of #| and #=".

lizmat commented 2 months ago

@ab5tract OK, gotcha now.

Maybe a first step could be simpler:

#| is only allowed at the start of a line
#= is only allowed when it is not at the start of a line
extended forms #|( ... ) and #=( ... ) to be disallowed
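A rough way to see how much existing code these three restrictions would affect is a line-based check. The sketch below is only that: the name deck-violations is hypothetical, "start of a line" is assumed to mean "first non-whitespace text on the line", and because it does not parse Raku it will misfire on #| or #= appearing inside string literals.

```raku
# Hypothetical lint for the three proposed restrictions (not a real tool).
sub deck-violations(Str $source) {
    my @problems;
    for $source.lines.kv -> $index, $line {
        # Rule 3: extended forms are disallowed.
        @problems.push("line {$index + 1}: extended form is disallowed")
            if $line.contains('#|(') || $line.contains('#=(');
        # Rule 1: a leading deck must be the first thing on its line.
        @problems.push("line {$index + 1}: leading deck must start the line")
            if $line ~~ / \S .* '#|' /;
        # Rule 2: a trailing deck must follow something on its line.
        @problems.push("line {$index + 1}: trailing deck must not start the line")
            if $line ~~ / ^ \s* '#=' /;
    }
    @problems
}

my $sample = q:to/END/;
#| documents the sub
sub foo(
    Int $a,  #= documents the parameter
) { }
my $x = 1; #| misplaced leading deck
#= misplaced trailing deck
#|( extended form )
END

.say for deck-violations($sample);
```

On this sample the first four lines pass and the last three are each flagged once. Running something like this over ecosystem modules would give a first impression of the breakage, though a real survey would need an actual parser.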

finanalyst commented 2 months ago

Sorry for the delay in responding. Just seen this thread. Working from phone, so I hope comment correctly attached to thread.

Rather than respond to direct questions, here are some considerations:

The intimate connection between Rakudoc and code means Rakudoc can only be handled by Raku and a BEGIN expression may be run. For this reason, hosts like GitHub and raku.land will not generate HTML on the fly from RakuDoc.
RakuDoc V2 explicitly distinguishes between elements for IDEs and elements for documentation. Decks (using abstract 's term - sorry for autocorrect) are part of the IDE division. However, RakuDoc also explicitly states that semantic blocks should be available for other tools, and they can be 'moved' around (eg. Declared at the beginning of a source but rendered later.)
One of the meta aims of RakuDoc IMHO seems to have been to provide a mechanism to handle many of the suggestions of literate programming. This implies a close connection between the code and the documentation it concerns. I have not seen any development of this idea though.
I have used #= and #| ALOT!!! Since I use Comma and Comma pops up the explanation of a variable to which #| is attached, I find this quite useful as I develop a distribution. But that is a use of RakuDoc inside an IDE.
I find the ability to attach #= to variables inside a sub MAIN very!!!!! useful. I'll forget the parameters for CLI and they are automatically available.
But I am frustrated by the limitations of #| . An important structure for me is a hash and I use them in config situations. I cannot attach a #| to a key. I would like to generate some documentation that extracts comments about keys of a hash, because I comment new keys, but can't remember them all.
While =finish has not been mentioned yet, it is also a part of RakuDoc and is code oriented rather than documentation oriented. I have found =finish to be useful particularly in tests, where sample input can be placed.

Some questions.

Can we separate completely everything inside a =rakudoc block so that it is available before any bytecode has been generated?
This will mean that we will have to remove A<> markup from being able to provide the value of a Raku variable. This is a part of the spec of both POD6 and RakuDoc V2 but no one has ever used it.
If a deck is used, then its values are only available after bytecode has been created, and so would not be available for a renderer.

Please ask questions if this is not clear and I'll respond when I get back online later today.

Richard

On Sun, 15 Sept 2024, 15:47 Elizabeth Mattijsen wrote:

@ab5tract OK, gotcha now.

Maybe a first step could be simpler:

#| is only allowed at the start of a line
#= is only allowed when it is not at the start of a line
extended forms #|( ... ) and #=( ... ) to be disallowed


thoughtstream commented 2 months ago

There have been a lot of valid and useful points made on both sides of the argument. It seems clear to me that:

Users do want to be able to annotate elements of their code in a way that is subsequently accessible to both that code and to other tools (such as an IDE).
Using special forms of a comment to do that was probably a mistake, given that comments are treated as just an unusual kind of whitespace by the compiler.
We may be neglecting a better, already available mechanism that might solve both these problems.

Because it occurs to me that Raku already has a mechanism for associating information with a value or a declared object: traits.

So, instead of specifying that two particular kinds of comment have special meaning in a limited range of contexts, why would we not just specify that there is a docs trait (or perhaps why or nb or desc):

    class Magician docs 'Base class for magicians' {
        has Int $.level;
        has Str @.spells;
    }

    sub duel docs ('Fight mechanics', 'Magicians only, no mortals') (
        Magician $a  docs 'The first magician in the duel',
        Magician $b  docs 'The second magician in the duel',
    ) {
        ...
    }

    my $mage  docs 'A magician of level 2 or above';

    say Magician.WHY;       # OUTPUT: «Base class for magicians»
    say &duel.WHY.leading;  # OUTPUT: «Fight mechanics»
    say &duel.WHY.trailing; # OUTPUT: «Magicians only, no mortals»

And perhaps even a docs operator to cater to @finanalyst’s desire for annotated hash keys:

    my %config =
        'size'  docs 'Max size of entry'  =>  42,
        'limit' docs 'Apply limiting'     =>  True,
        'rand'  docs 'Randomizes lookup'  =>  True,
        'etc'   docs 'Et cetera'          =>  'et cetera';

It’s not as convenient to the coder as #| and #= declarator blocks, but it would probably be a heck of a lot easier to implement.

raiph commented 2 months ago

@thoughtstream

I haven't thought through the impact on ergonomics, nor how the discussion will move forward if we start arguing about it (please everyone, think carefully before saying "No!" or worse), but for this comment I'll focus on presuming we run with your great use-a-trait (and maybe an infix) ideas.

First, for various reasons (enhancing cognition through familiarity, making it easy to maintain highlighters that distinguish traits, etc) the pattern for "traits" (in Raku) is to use one of a small fixed list of auxiliary (aka "helping") verbs (is, does, will, ...).

So I think we ought to pick one of those. Or we could add another general helping verb for 6.e rather than one specific to doc. I'll be "provocative" and suggest one: has doc('foo bar'). That, er, has pros and cons of course (will it lead to poor error messages if a newbie writes a has that they meant to declare an attribute in a class? etc). But I'm wanting Rakoons to think about such things before deciding on which trait verb. For example, I've long thought that is built was a slightly misworded trait -- it could have been, arguably would better have been, will build -- and want us to think about that aspect.

niner commented 2 months ago

While traits are well integrated into the language and would be perfect for attaching meta data, we'd lose one big POD advantage: the formatting codes. Traits can't take POD, can they?

lizmat commented 2 months ago

@niner traits can take text with formatting codes. In fact, there's the RakuAST::Doc::Paragraph.from-string method that takes any string and turns it into a RakuAST::Doc::Paragraph object if there was any markup, or returns the string unaltered if no markup was found.

patrickbkr commented 2 months ago

I want to put a word in for being able to add documentation to source code elements.

I personally am a big fan of JavaDoc (e.g. the Apache POI docs), or Doxygen or similar tools to help document APIs. With APIs the users are developers. APIs usually are made up of a set of classes / modules, functions and variables. Those elements form a hierarchy. Organizing the documentation around that same hierarchy makes a lot of sense. Function signatures are a pretty good representation of the contract of that function. Often more explanation is needed, but the signature alone goes a long way. So having the literal function signatures visible in the documentation is nice. I'm not even talking about IDE support, but about some standalone navigable documentation.

I also like good free text module documentation (e.g. DBIx::Class). A well written free text documentation is often a better read than documentation forced into the JavaDoc format (where all documentation is forced into the class / method hierarchy).

But having both well integrated available in a language is pretty epic IMHO.

A different matter:

Toolchain wise we've long had the issue that tools to display documentation (GitHub, GitLab, raku.land, ...) have an issue with running the Rakudo parser on source code to reach for the embedded docs as that code can contain BEGIN blocks, which is a big safety concern.

That's why I had the idea to implement a tool that extracts and converts all docs in a source file and outputs a clean .rakudoc file. Then the unsafe bits of the processing are well encapsulated and we can have a safe parser (that can only process .rakudoc files). We can use the safe parser on GitHub and the like to display README.rakudoc files. On raku.land, where we want to render the full documentation, we can run the extractor in a sandbox. (This idea and my exploration of how to do this kicked off the discussion culminating in this issue.)

Being able to extract documentation from some Raku source file without using a full Raku parser would be preferable to the above. Doing so for Decks seems a lot more realistic than for the trait based approach. (I just realized that with my recent Raku lexer, I might actually be pretty close to a tool that - on a best effort basis - can extract the RakuDoc bits from some Raku source file.)

I withdraw the above paragraph. While it might be possible to extract the declarator blocks themselves from some Raku code, extracting the referenced code parts and putting them into a usable hierarchy (classes / modules > methods / subs > variables) is basically impossible.

FCO commented 2 months ago

Just answering about how used it is: I use it a lot! Red even uses it to add comments in tables, columns and queries

FCO commented 2 months ago

https://github.com/FCO/Red/blob/7ab1fc52b271713b5e2aa49cd9f2ab54c4589ce4/lib/MetamodelX/Red/Model.rakumod#L434

lizmat commented 2 months ago

@FCO: feels to me you're only using #|, is that correct?

patrickbkr commented 2 months ago

I'd like to point out that at the moment there is no widespread tool support for Decks (or RakuDoc in general) available. So if we decide to make big changes (like adding a docs trait and removing #= and #|), we should do so Pretty Soon™️. I'd really like to push the tooling forward so we can finally enable and incentivize our fellow Rakunistarinas (Was that the consensus on what we want to call ourselves?) to write module documentation.

(@CIAvash and probably others: I recognize you use Decks and even built your own tooling to utilize it. I do feel sad and sorry that such big changes would hit you pretty hard.)

CIAvash commented 2 months ago

@thoughtstream already mentioned convenience; I'd add readability as well.

But what if the documentation is multiline (although Rakudo currently ignores newlines)? With traits, how would multiline doc work? Using a heredoc?

lizmat commented 2 months ago

FWIW, in RakuDoc, the newlines are available to any renderer.

tbrowder commented 2 months ago

I want to be able to use them. Early on I noticed in the grammar the declarator blocks before routines were getting transformed into a single string. Some years ago I asked Damian if the intent was to do that OR leave as formatted by the author. He said that was the intent (that discussion is documented in the repo somewhere).

I started working on an option to do that but I wasn't fast enough to keep up with the main line.

My thought though was that we could have both an easily-defined text structure for either the author or the source for more formal system tools for documentation as has been mentioned above.

I do like the suggestion to limit the tags to #| and #= with one caveat: it would be handy to define (using current rules) one pair of delimiters for multi-line text declarators.

tbrowder commented 2 months ago

I found the pertinent docs:

github.com/rakudo/rakudo/docs/S26-configure-notes-from-Damian-Conway.md
github.com/rakudo/rakudo/docs/S26-declarator-notes-from-Damian-Conway.md        
github.com/rakudo/rakudo/docs/declarator_block_expectations.raku

alabamenhu commented 2 months ago

I actually do regularly use them in my modules, although sometimes I'm not as consistent in using them as I'd like to be. My personal style is generally to use #| for classes and subs, and #= for attributes and parameters. I normally go back and add them as I update, particularly for things that are more end-user centric (whereas more internal-to-module things I am more likely to forget). I did this both in anticipation of improved tooling for doc generation and IDE presentation. For example

#| A module to enable time zones in Raku
unit module Timezones;
use Timezones::ZoneInfo;

# Thanks to lizmat++ for this cool way to extend built-ins
class DateTime is DateTime is export {
    has Str  $.olson-id; #= The unique tag that identifies the timezone
    has Str  $.tz-abbr;  #= A mostly unique abbreviation for the time zone that aligns more closely to popular usage
    has Bool $.is-dst;   #= Whether it is daylight savings time

    #| Creates a new timezone-aware DateTime object
    method new(|c (*@, :$olson-id, :$timezone, *%)) { ... }

    #| The timezone as either offset or name (IntStr)
    method timezone { IntStr.new: self.offset, self.olson-id }
}

I think the vast majority of uses for people line up with this structure. Perhaps the grammar could be simplified by saying the following: #| must precede a statement, and #= must follow a statement, and disallow at any other boundary. They will attach to the first (or all) declarable in the event of declaring multiple things in a line (e.g. my $a, $b). That would account for probably 97% of real world and realistic potential uses. To hit 99.9% we'd allow #= as the first statement in a block to attach accordingly (allowing attachment to an embedded routine, e.g. my $foo = sub bar { ... })
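A rough way to see how little machinery that simplified rule needs is a line-based scan. The sketch below is in Python purely so it can be run and tested, and it is not how Rakudo handles decks. The name attach_decks is hypothetical, and it assumes "#=" never appears inside a string literal and that a line-initial "#=" is treated as an ordinary comment.

```python
def attach_decks(source):
    """Best-effort pairing of statements with their '#|' and '#=' text.

    Returns a list of (statement, leading_doc, trailing_doc) tuples,
    one per statement that carries at least one deck.
    """
    documented = []
    pending = []  # '#|' lines waiting for the statement they precede
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#|"):
            pending.append(line[2:].strip())
            continue
        if line.startswith("#"):
            # Ordinary comment (or a disallowed line-initial '#='): skipped.
            continue
        # Naive split: assumes '#=' does not occur inside a string literal.
        code, _, trailing = line.partition("#=")
        leading = " ".join(pending)
        pending = []
        if leading or trailing.strip():
            documented.append((code.strip(), leading, trailing.strip()))
    return documented
```

Fed the class example above, this would pair each has line with its trailing text and each method with the #| line before it. It makes no attempt to rebuild the class / method hierarchy, which is the hard part noted earlier in the thread.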

I really like them as an idea because I've hated how other styles require a gigantic friggin block above a routine (which then separates the documentation a bit from the code and lends itself to desyncing IMO). But I agree that in their current form they are a bit too... underdeveloped, with lots of potential gotchas and complexities and, if not undefined behavior, severely underdocumented behavior that could trip people up. (While lizmat admits her examples are artificial, the fact that they are handled to any degree, and that not even I, who use them regularly, would have accurately predicted the results, says the behavior needs to be cleaned up.)

FCO commented 2 months ago

@lizmat I usually use both #| and #= (maybe not on that file). But just as an example, with Red, if you do:

#| A table comment
model MyTable {
   has Int $.id is column; #= a column comment
}

Those comments go to the database when you ask to create that table...

jubilatious1 commented 2 months ago

Still trying to wrap my head around this Issue.

The best I can do is analogize. Because the docs say that Declarator Blocks "...can extend multiple blocks" they sound like RMarkdown chunk options, which provide over 50 settable parameters for literate programming output:

https://bookdown.org/yihui/rmarkdown-cookbook/chunk-options.html
https://yihui.org/knitr/options/
https://stackoverflow.com/questions/32634274/knit-hooksset-and-opts-chunkset
https://rmarkdown.rstudio.com/docs/reference/knitr_options.html

I understand I keep mentioning R/RMarkdown but this is the easiest way for me to learn Raku/RakuDocs. Here's a visual to help anyone following along so far:

https://rmarkdown.rstudio.com/lesson-3.html

Above you can see on the left panel the first code block (used to setup output) sets the parameter echo = FALSE which means the code within the block doesn't appear in the output (see output panel on right side). Chunk options can be very intricate:

#```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE, tidy=TRUE, tidy.opts=list(width.cutoff=60))
#```

Anyway, it seems to me that Declarator Blocks have great use in RakuDocs (if they parallel chunk options in RMarkdown), and therefore are not evil and should not be eradicated.

@thoughtstream

lizmat commented 2 months ago

and therefore are not evil and should not be eradicated

It has become clear to me that eradication will be impossible, at least in my lifetime :-)

Which in a way is good, because it means that all the time I've invested in RakuAST to get declarator docs working so far has not been wasted. I've therefore decided to change the title of this issue.

jubilatious1 commented 2 months ago

lizmat changed the title from "Declarator Docs are evil and should be eradicated" to "Declarator Docs should be limited in scope"

No chance of changing the title to "Declarator Docs should mimic functionality of RMarkdown "Chunk options" ...?

😀

patrickbkr commented 2 months ago

Given the discussion has progressed to us needing to reduce the scope of Decks, I'd like to bring up thoughtstream's idea of using traits again.

If we reduce the places where Decks can appear and what they can reference, they are, from a user's perspective, pretty similar to the traits proposal. Compare

#| Base class for magicians
class Magician {
    has Int $.level;
    has Str @.spells;
}

#| Fight mechanics
#| Magicians only, no mortals
sub duel (
    Magician $a, #= The first magician in the duel
    Magician $b, #= The second magician in the duel
) { ... }

to

class Magician docs 'Base class for magicians' {
    has Int $.level;
    has Str @.spells;
}

sub duel docs ('Fight mechanics', 'Magicians only, no mortals') (
    Magician $a  docs 'The first magician in the duel',
    Magician $b  docs 'The second magician in the duel',
) { ... }

So if the comment based Decks are really such a pain to implement, I personally would be fine to just move over to the trait approach.

We do need to evaluate the implications this has on existing code and toolchains though:

From the above discussion it's evident there is code out there using #| and #=. That code would need to be adapted (though unmodified code wouldn't break, just degrade Decks to plain comments).
Comma would need to change.
Toolchains that rely on .WHY will mostly keep functioning unmodified.

Did I miss anything? Is it worth it?

lizmat commented 2 months ago

So if the comment based Decks are really such a pain to implement

To be clear: decks have been implemented in RakuAST well enough to pass the tests in roast. It's just that implementing safe rakudoc exposed a number of nits / inconsistencies to me that would need further investigation. And remembering the pain it took to get decks to where they are now in RakuAST, I wanted to be sure that it would be worth the effort.

thoughtstream commented 2 months ago

It occurs to me that one possible solution to preserving both declarator blocks and @lizmat's sanity (;-) would be to rethink what declarator blocks actually are.

What if #| and #= were not comments at all? What if they were actually an optional component (a "Deck"?) of various declarative constructs?

Then we could add them to the syntax for those constructs in the restricted locations that @lizmat is hoping for. Something like:

rule package-decl {
    <leading-deck>?
    [package | class | grammar] <package-name> <package-adverbs> <package-traits>
    <trailing-deck>?
    <block>
}

rule parameter {
    <parameter-type> <parameter-sigil><parameter-name> <parameter-traits> <parameter-default>
    <trailing-deck>?
    <comma>
}

The trick would be to redefine the rule for comments so that the '#' introducer token becomes '#' <!before '|' | '='>, which would mean that leading and trailing decks are no longer skipped as being whitespace, and can only appear in the places where the Raku grammar explicitly allows them.
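The lookahead trick can be illustrated outside the Raku grammar. The following is a minimal sketch using Python's re module, chosen only so it can be run and tested; it is not the Rakudo implementation. The pattern names and classify are hypothetical, and the sketch ignores multi-line deck forms and embedded comments.

```python
import re

# Ordinary comment: '#' NOT followed by '|' or '=' (negative lookahead),
# mirroring the proposed '#' <!before '|' | '='> introducer.
PLAIN_COMMENT = re.compile(r"#(?![|=])(.*)")
LEADING_DECK = re.compile(r"#\|(.*)")
TRAILING_DECK = re.compile(r"#=(.*)")


def classify(token):
    """Classify a single-line '#...' token; return (kind, body) or None."""
    patterns = (
        ("leading-deck", LEADING_DECK),
        ("trailing-deck", TRAILING_DECK),
        ("comment", PLAIN_COMMENT),
    )
    for kind, pattern in patterns:
        match = pattern.fullmatch(token)
        if match:
            return kind, match.group(1).strip()
    return None
```

Because the plain-comment pattern refuses to match "#|" and "#=", a whitespace-skipping rule built on it would leave decks in the token stream, where only the constructs that explicitly allow them could consume them.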

finanalyst commented 2 months ago

Assuming the RakuAST implementation now works and is safe, how should the RakuDoc spec be clarified? What is not possible?

On Wed, 18 Sept 2024, 11:27 Elizabeth Mattijsen, @.***> wrote:

So if the comment based Decks are really such a pain to implement

To be clear: decks have been implemented in RakuAST well enough to pass the tests in roast. It's just that implementing safe rakudoc exposed a number of nits / inconsistencies to me that would need further investigation. And remembering the pain it took to get decks to where they are now in RakuAST, I wanted to be sure that it would be worth the effort.



Declarator Docs should be limited in scope #438

#| is only allowed at the start of a line

#= is only allowed when it is not at the start of a line