Auto-generate manpage, help docs, etc.

joshtriplett commented 8 years ago

Maintainer notes:

Blocked on https://github.com/clap-rs/clap/issues/2914 for decoupling help information gathering from formatting
help2man can be a source of inspiration for how to integrate this into a user's process

I'd love to have support to generate a manpage. This would use a mechanism and infrastructure similar to #376. Additional functions to override or augment portions of the generated manpage could come later, but I think with relatively few additions this could become an incredibly useful mechanism.

The manpage title should default to the bin_name value.
The section should default to 1.
The NAME section should default to bin_name \- about, where about is the string set by .about.
The SYNOPSIS section should contain the usage strings for the command and every subcommand.
The DESCRIPTION section would need some new paragraph-style information provided (also usable as a more structured .before_help).
The "OPTIONS" section should document the flags and args for the top-level command.
If the command has subcommands, a "SUBCOMMANDS" section should document each subcommand in a sub-section.
The AUTHORS section should contain the author information, if provided.
The SEE ALSO section would need some new mechanism to populate it.
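As a rough illustration of the defaults listed above, here is a minimal sketch that renders a man(7)-style page from a simplified command description. `CommandInfo`, `escape`, and `render_man` are hypothetical names invented for this example, not part of clap, and the SUBCOMMANDS, DESCRIPTION, and SEE ALSO sections are left out because they need the new mechanisms discussed above.

```rust
/// Hypothetical, simplified command description; not clap's real types.
struct CommandInfo {
    bin_name: String,
    about: String,
    usage: Vec<String>,             // one usage string per (sub)command
    options: Vec<(String, String)>, // (flags, help text)
    authors: Vec<String>,
}

/// Escape backslashes and hyphens so roff prints them literally.
fn escape(text: &str) -> String {
    // Backslashes first, so the backslash added for hyphens is not re-escaped.
    text.replace('\\', "\\e").replace('-', "\\-")
}

fn render_man(cmd: &CommandInfo) -> String {
    let mut out = String::new();
    // Title defaults to bin_name (upper-cased by convention), section to 1.
    out.push_str(&format!(".TH {} 1\n", escape(&cmd.bin_name.to_uppercase())));
    // NAME defaults to "bin_name \- about".
    out.push_str(".SH NAME\n");
    out.push_str(&format!("{} \\- {}\n", escape(&cmd.bin_name), escape(&cmd.about)));
    // SYNOPSIS lists the usage string of the command and every subcommand.
    out.push_str(".SH SYNOPSIS\n");
    for usage in &cmd.usage {
        out.push_str(&format!("{}\n.br\n", escape(usage)));
    }
    if !cmd.options.is_empty() {
        out.push_str(".SH OPTIONS\n");
        for (flags, help) in &cmd.options {
            out.push_str(&format!(".TP\n\\fB{}\\fR\n{}\n", escape(flags), escape(help)));
        }
    }
    // AUTHORS is only emitted when author information was provided.
    if !cmd.authors.is_empty() {
        out.push_str(".SH AUTHORS\n");
        for author in &cmd.authors {
            out.push_str(&format!("{}\n.br\n", escape(author)));
        }
    }
    out
}
```

The sketch does not guard against help text lines that start with `.` or `'`, which roff would treat as control lines; a real generator would need to handle that.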

I'd be happy to help with manpage markup, once I see the details of the mechanism used in #376.

ssokolow commented 4 years ago

> I believe this should be fixed in the current master version of Clap, since Clap will now output non-colored text when ANSI escape codes aren't supported. #963

You misunderstand. I'm saying that clap is pretty much the only thing that matches man for making good use of slightly rich markup to make the text more scannable and only if it's a program that opts into using coloured output.

Here's one of my own programs using Python's argparse for --help generation (I'm in the middle of modernizing it. A few days ago, it was using optparse):

[Screenshot: argparse-generated --help output]

...and here's the result of using Sphinx's third-party .. autoprogram:: directive, then outputting the resulting page via Sphinx's manpage renderer without having added any supplementary sections below it yet:

[Screenshot: the same help rendered as a manpage by Sphinx]

(It'll take some reworking to get Sphinx generating both a manpage and an HTML manual without either having to write and maintain two slightly different versions of the same content or having at least one of them looking terrible.)

In case you're wondering, the colourization is accomplished by this shell function:

```sh
# Launch man with modified `less` termcap in subshell to colourize it
man() { (
    export LESS_TERMCAP_mb=$(tput bold; tput setaf 2)
    export LESS_TERMCAP_md=$(tput setaf 6)
    export LESS_TERMCAP_me=$(tput sgr0)
    export LESS_TERMCAP_so=$(tput setaf 7; tput setab 4)
    export LESS_TERMCAP_se=$(tput rmso; tput sgr0)
    export LESS_TERMCAP_us=$(tput smul; tput setaf 7)
    export LESS_TERMCAP_ue=$(tput rmul; tput sgr0)
    export LESS_TERMCAP_mr=$(tput rev)
    export LESS_TERMCAP_mh=$(tput dim)
    export LESS_TERMCAP_ZN=$(tput ssubm)
    export LESS_TERMCAP_ZV=$(tput rsubm)
    export LESS_TERMCAP_ZO=$(tput ssupm)
    export LESS_TERMCAP_ZW=$(tput rsupm)
    export GROFF_NO_SGR=1         # For Konsole and Gnome-terminal
    command man "$@"
) }
```

codesections commented 4 years ago

I did misunderstand. Thanks for the clarification :+1:

pickfire commented 4 years ago

The funny thing is that the man pages are exactly "the options from --help in a slightly different format" with some sort of header and a footer.

Most man pages are more elaborate than the --help page. Most of the time I try the man page first because it is easier: I just press alt-h after typing the command.

casey commented 4 years ago

I just wanted to jump in and +1 the idea of producing output in an intermediate format, which can then be fed in to various back ends.

Since this output will be produced and consumed by code, JSON seems like a good choice, since it's simple and universally supported. TOML is also simple, but deep nesting in TOML gets weird.

This would also make the initial implementation very simple, just output some JSON describing the CLI, and then backends could come later. Additionally, it would allow backends to be fully decoupled from clap.

My own use case is that I'd like to generate both roff, to display with man, and markdown, for inclusion in an mdbook book. Neither format is very complicated, so I wouldn't mind writing my own JSON-to-roff and JSON-to-markdown backends.
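To make the intermediate-format idea concrete, here is a minimal sketch that dumps a CLI description as JSON using only the standard library. `CliSpec`, `ArgSpec`, `json_string`, and `to_json` are hypothetical names for this example, and the JSON layout is an assumption rather than an agreed schema; a real implementation would more likely derive the output with a serialization crate.

```rust
/// Hypothetical description of one argument.
struct ArgSpec {
    name: String,
    long: Option<String>,
    help: String,
}

/// Hypothetical description of the whole CLI.
struct CliSpec {
    name: String,
    about: String,
    args: Vec<ArgSpec>,
}

/// Quote and escape a string as a JSON string literal.
fn json_string(s: &str) -> String {
    let mut out = String::from("\"");
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Serialize the CLI description; backends would consume this output.
fn to_json(cli: &CliSpec) -> String {
    let args: Vec<String> = cli
        .args
        .iter()
        .map(|a| {
            let long = match &a.long {
                Some(l) => json_string(l),
                None => "null".to_string(),
            };
            format!(
                "{{\"name\":{},\"long\":{},\"help\":{}}}",
                json_string(&a.name),
                long,
                json_string(&a.help)
            )
        })
        .collect();
    format!(
        "{{\"name\":{},\"about\":{},\"args\":[{}]}}",
        json_string(&cli.name),
        json_string(&cli.about),
        args.join(",")
    )
}
```

A roff or markdown backend would then only need a JSON parser and no dependency on clap itself, which is the decoupling argued for above.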

pickfire commented 4 years ago

@pksunkara I would like to help out with man page generation, maybe I will try generating mdoc format first then troff since I like that format.

pksunkara commented 4 years ago

IIRC @codesections is working on it.

pickfire commented 4 years ago

@pksunkara I believe I am working on a different man page format here, mdoc, which @codesections is most likely not using.

spacekookie commented 4 years ago

@pickfire wrote:

> pksunkara I would like to help out with man page generation, maybe I will try generating mandoc format first then troff since I like that format.

I think what is important to consider is that man page generation only partially has to be integrated with clap. A man page isn't just a set of options with the same help text as --help (well, some of them are and they're utterly useless!)

Instead, a man page needs to be more long-form text that is merged with the structure of the clap arguments, with some sections just being free-form explanation of the rationale of the tool.

@codesections was working on that a few months ago; I don't know if he's made any progress on it though. I've since written a crate to introduce a (work in progress) on-disk format/structure to handle text assets in Rust crates. I did this also with the goal of making all strings easily translatable (something I'd love to integrate with clap!)

Maybe you wanna have a look to see how you can make mandoc (or troff), my crate (traduki) and clap work well together. I'd also be happy to help out with this. There are so many ways to be lazy about this and get it wrong, I very much want to help get it right!

~k

pickfire commented 4 years ago

@spacekookie I understand, since man pages will usually have more items that are not even in the help. Thanks for sharing traduki. But from my personal point of view, I prefer writing man pages in either mdoc or troff by hand, at the expense of duplicated help. Also, I don't quite like using yaml and I also think building all the man pages is better than building just one.

Still, one could use the man page generated as the base and modify it later.

Dylan-DPC-zz commented 4 years ago

I think for something that is a bit complex, it's better that it is done as a separate crate so that we can iterate and find the better solution. I remember, as Katherina said, that codesections is working on something, so I'd wait on that to decide where this goes after that.

spacekookie commented 4 years ago

> Also, I don't quite like using yaml and I also think building all the man pages is better than building just one.

@pickfire sorry, I'm not sure I understand what you mean by "building all the man pages" here. Could you clarify that a bit maybe?

Also regarding traduki: I picked yaml as a format because it's flexible enough to do all the things without much syntactic overhead. I'm not opposed to adding more format support, if there's something you're more comfortable with.

ssokolow commented 4 years ago

@spacekookie I reiterate my earlier opposition to YAML that got buried in the fold.

(I wish GitHub had a way to turn that off.)

pickfire commented 4 years ago

@joshtriplett While trying out some test cases in mdoc, I came to believe we should have multiple manpages instead of sticking everything into one man page. I think it would be better to split each subcommand into its own manpage, since subcommands usually have their own flags, options and description.

The naming of the other files could be myapp-subcommand, and we could have a SEE ALSO section at the bottom to link them. We could still keep the subcommands and their options in the main SYNOPSIS; each subcommand's man page could have its own SYNOPSIS.

While trying this out, I figured out that mdoc has no good default support for GNU-style --long help. I also figured it would be interesting if we could find env!(), option_env!() or related calls and document them under the ENVIRONMENT section. It would also be nice to show some examples in the EXAMPLES section. We could also add the version in the operating system part, bottom left.

This requires changes in clap_generate since currently the generator only expects a single file change. With this, fn generate(app: &App, buf: &mut dyn Write) { is not possible since it may write to multiple files.
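One possible shape for a multi-file generator is sketched below: instead of writing into a single `Write`, it collects one (file name, contents) pair per command and lets the caller decide where the files go. `Cmd`, `generate_pages`, and `write_pages` are hypothetical names for this example, not clap_generate's API, and the emitted mdoc is deliberately partial.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical command tree; stands in for clap's `App`.
struct Cmd {
    name: String,
    about: String,
    subcommands: Vec<Cmd>,
}

/// Collect one page per command, named `myapp`, `myapp-subcommand`, and so on.
fn generate_pages(cmd: &Cmd, prefix: &str, pages: &mut Vec<(String, String)>) {
    let page = if prefix.is_empty() {
        cmd.name.clone()
    } else {
        format!("{}-{}", prefix, cmd.name)
    };
    let mut body = format!(
        ".Dt {} 1\n.Sh NAME\n.Nm {}\n.Nd {}\n",
        page.to_uppercase(),
        page,
        cmd.about
    );
    // Link each subcommand page from SEE ALSO.
    if !cmd.subcommands.is_empty() {
        body.push_str(".Sh SEE ALSO\n");
        let refs: Vec<String> = cmd
            .subcommands
            .iter()
            .map(|s| format!(".Xr {}-{} 1", page, s.name))
            .collect();
        body.push_str(&refs.join(" ,\n"));
        body.push('\n');
    }
    pages.push((format!("{}.1", page), body));
    for sub in &cmd.subcommands {
        generate_pages(sub, &page, pages);
    }
}

/// Write every collected page into `dir`.
fn write_pages(dir: &Path, pages: &[(String, String)]) -> io::Result<()> {
    for (file_name, contents) in pages {
        fs::write(dir.join(file_name), contents)?;
    }
    Ok(())
}
```

Returning the pages as data keeps the generator testable without touching the file system; a callback that opens a writer per file name would be an alternative design.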

I am still thinking of how to put long commands into SYNOPSIS section, or try to use short flag instead of long flag by default. An implementation note, I am substituting myapp (application name) in the description to get highlighting. So far, what I get:

Rust code

```rust
App::new(s)
    .author("John Doe <john@example.org>:Jane Doe <jane@example.org>:anonymous")
    .about("Tests completions")
    .long_about(&format!(
        "The quick brown fox jumps over the lazy dog. {} is an application with super cow powers.",
        s
    ))
    .before_help("Send help!")
    .arg(Arg::new("file").about("Some input file"))
    .subcommand(
        App::new("test").about("tests things").arg(
            Arg::new("case")
                .long("case")
                .takes_value(true)
                .about("the case to test"),
        ),
    )
```

[Screenshots: rendered man page output]

mdoc.1

```mdoc
.\" Generated by clap
.Dd $Mdocdate$
.Dt MYAPP 1
.Sh NAME
.Nm myapp
.Nd Tests completions
.Sh SYNOPSIS
.Nm
.Op Fl h
.Op Fl V
.Op Ar file
.Sh DESCRIPTION
Send help!
The quick brown fox jumps over the lazy dog.
.Nm
is an application with super cow powers.
.Sh OPTIONS
.Bl -tag -width Ds
.It Fl h , Fl -help
Prints help information
.It Fl V , Fl -version
Prints version information
.It Ar file
Some input file
.El
.Sh AUTHORS
.An John Doe Aq Mt john@example.org
.An Jane Doe Aq Mt jane@example.org
.An anonymous
.Sh SEE ALSO
.Xr myapp-help 1 ,
.Xr myapp-test 1
```

[Screenshot: rendered man page output]

mdoc-help.1

```mdoc
.\" Generated by clap
.Dd $Mdocdate$
.Dt MYAPP-HELP 1
.Sh NAME
.Nm myapp-help
.Nd Prints this message or the help of the given subcommand(s)
.Sh SYNOPSIS
.Nm myapp help
.Ar subcommands ...
.Sh AUTHORS
.An John Doe Aq Mt john@example.org
.An Jane Doe Aq Mt jane@example.org
.An anonymous
.Sh SEE ALSO
.Xr myapp 1 ,
.Xr myapp-test 1
```

[Screenshot: rendered man page output]

mdoc-test.1

```mdoc
.\" Generated by clap
.Dd $Mdocdate$
.Dt MYAPP-TEST 1
.Sh NAME
.Nm myapp-test
.Nd tests things
.Sh SYNOPSIS
.Nm myapp test
.Op Fl h
.Op Fl V
.Op Fl -case Ar case
.Sh OPTIONS
.Bl -tag -width Ds
.It Fl h , Fl -help
Prints help information
.It Fl V , Fl -version
Prints version information
.It Fl -case Ar case
the case to test
.Sh AUTHORS
.An John Doe Aq Mt john@example.org
.An Jane Doe Aq Mt jane@example.org
.An anonymous
.Sh SEE ALSO
.Xr myapp 1 ,
.Xr myapp-help 1
```

Side note, I just realized mdoc has BSD General Commands Manual at the top. fish-manpage-completions also does not support generating completions for the mdoc format yet. So groff will still be recommended. But I like mdoc since it has more advanced and specific macros. mdoc and groff would clash since the filename is the same, but BSD might be better off generating the mdoc version.

@spacekookie Regarding traduki, I prefer not to write additional docs but embed it into clap options itself, I personally think yaml would be the last choice of additional docs that I can think of. I think internationalization should be tackled in another issue and at the same time, I believe we should use something mature to do internationalization such as gettext or project fluent.

@ssokolow While reading your old screenshots, I noticed that the sections and stuff are not well indented, I could help out with those markup and stuff if you want. I could also help out with the groff if you want me to, but still I would like to try out mdoc first. Maybe we can share test code.

What do you all think of separating it into multiple files?

spacekookie commented 4 years ago

@ssokolow :

> (I wish GitHub had a way to turn that off.)

Yea me too.

& @pickfire: regarding yaml, it was the first format I picked because I personally find it easy to work with. But I know that many people have different preferences and that's totally fine. I am in no way opposed to writing other parser backends. json, toml, MO, some web tool, whatever...

Regarding the structure of additional documentation: the problem is that tools that don't use external asset tools are usually not translated, or if they are, only to a few select languages that become a big maintenance burden. Furthermore, man pages or GNU info pages require more data than clap. The title description for a command might be the same, but do you really want to embed long-form text into your Rust code, that you might also want to share in different places?

As to yaml, I really don't understand why you're getting hung up on it. I think I've been nothing but clear on the fact that the current yaml backend is a proof of concept and that more formats should be added.

As for translations, I feel strongly that the solution we integrate with remains as slim as possible. There are a lot of people who only want a simple approach to deal with assets and that don't want to learn how to setup and use something like fluent.

When it comes to the issue of separating pages, I think it's the right choice to do this. This is a standard across many tools, and results in shorter and more manageable pages for users. (Also, man tool-sub-command is much easier than man tool and then having to search around for sub-command.)

pickfire commented 4 years ago

@spacekookie I believe using yaml is not slim and may introduce a lot of stale or missing translations. If they do not want fluent, they could always fall back to gettext, which is widely used from what I see.

Yes, when separating man pages into tool-subcommand, fish is able to detect the correct man page to display when the user presses alt-h.

alerque commented 4 years ago

YAML is not a terrible option for describing a bunch of meta data. I'm aware of its shortcomings, but it is also flexible and powerful and pretty easy to use (if you're not the programmer working on the edge cases). On the other hand it's a terrible way to represent translation data.

If the main use of YAML was to describe the interface I'd be all for it, but if the main use case is providing alternate localization of the same interface it should be scrapped in favor of just using one of the other app declaration methods (I like the #[derive] macros, but this should go for the other methods too) and adding hooks into Fluent to load the user facing strings.

spacekookie commented 4 years ago

@pickfire can you please stop trying to derail the conversation with talk about yaml? I get it. We all get it. It's secondary to the actual point and at this point honestly off-topic.

The point of having a central translations crate was exactly not to pick favourites. You have your opinions on how to handle assets, others have theirs. I think it's not out of scope for a language ecosystem to have a system for handling translation assets, that isn't reliant on another system, with the option of hooking into other libraries.

I'd point you to the mail archive where we had a pretty long conversation about this but r0tty's mail archive seems to be offline right now.

I don't really feel like talking about this for ever and ever. I'm gonna start working on a proof of concept integration into clap next week and any changes people want to make to traduki will probably be merged. Add all the backends and formats. But I think it's a sensible approach to have one crate you need to patch to implement additional backends for (or switch backends without breaking your project).

pksunkara commented 4 years ago

The consensus in clap was that we export the doc strings you give us into whatever format the generator needs.

ssokolow commented 4 years ago

@spacekookie Normally, I would, but I just realized that I don't remember an explicit mention of the intersection of these three facts:

We're talking about translations.
One of YAML's known flaws is that its "do what I mean" approach to strings and quoting makes it very easy for someone to intend Norway's ISO 3166 code (no) or Norwegian's ISO 639-1 code (also no) but get a boolean false.
Rust encourages an ecosystem where footguns should be minimized.
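The Norway footgun can be shown with a small sketch of how plain scalars resolve to booleans. The two functions below are illustrative helpers written for this example, not the API of any YAML crate; the first follows the YAML 1.1 bool type, the second the YAML 1.2 core schema.

```rust
/// Plain scalars that the YAML 1.1 bool type resolves to a boolean.
fn yaml_1_1_bool(scalar: &str) -> Option<bool> {
    match scalar {
        "y" | "Y" | "yes" | "Yes" | "YES" | "true" | "True" | "TRUE" | "on" | "On" | "ON" => {
            Some(true)
        }
        "n" | "N" | "no" | "No" | "NO" | "false" | "False" | "FALSE" | "off" | "Off" | "OFF" => {
            Some(false)
        }
        _ => None,
    }
}

/// Plain scalars that the YAML 1.2 core schema resolves to a boolean.
fn yaml_1_2_bool(scalar: &str) -> Option<bool> {
    match scalar {
        "true" | "True" | "TRUE" => Some(true),
        "false" | "False" | "FALSE" => Some(false),
        _ => None,
    }
}
```

Under the 1.1 rules an unquoted `no` becomes `false`, while under the 1.2 core schema it stays a string, so whether the footgun applies depends on which version a given parser implements.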

Now that it's on the record, I can stop talking about YAML.

pickfire commented 4 years ago

@spacekookie I am not trying to derail the conversation by talking about YAML (in fact this is secondary to me). Of course we can switch to any other format, but I would say any serialization format would probably fare badly against battle-tested translation systems such as gettext and fluent. I also mentioned other points: internationalization should be out of scope for this discussion, and using traduki would leave fluent and gettext off the table; that is one main point.

But yes, if we have a way to hook traduki in using the existing docstrings method without the users having to do much work, such as maintaining multiple separate files as traduki requires now, I think that would be helpful. Of course it may be useful to maintain some separate document for translations, but I believe that should be another issue.

I think this issue should target English-only man pages as the default for now (i18n can be done later) to maintain focus, mainly by using docstrings and not having to do additional steps to auto-generate man pages. I believe the best would be that the maintainer can add just one line of code in build.rs to generate the man pages without any additional efforts, that would be the best.

The reason why I think it would be best for internationalization to be done later is because the rust team planned to integrate it, I don't recall how. Maybe @Manishearth would know more about that, hopefully he can give some insights on how that will relate to this project.

@ssokolow By the way, are you still working on this?

spacekookie commented 4 years ago

@pickfire

> I believe the best would be that the maintainer can add just one line of code in build.rs to generate the man pages without any additional efforts, that would be the best.

But that's not how man pages work! Have you ever looked at a man page? What, do you want to embed pages and pages of documentation into your Rust code? Because Rust is so famous for being amazing to write multi-line strings in. Give me a break.

If we go down the route of "just add this one line" Rust applications are going to have man pages that are utterly and completely useless. And I just think we shouldn't aim so low and should do better. The design of the tools people use encourages and discourages behaviour. If we go down this route, if we settle for ease of use to the developer in favour of usefulness for the user, we might as well just not bother at all.

ssokolow commented 4 years ago

Not necessarily.

If we were to follow the approach of GNU help2man, then the "this one line" would take an optional include argument that would be appended below the auto-generated stuff as part of the manpage build process... and that does feel very structopt-ish.

help2man enforces a standard ordering for the conventional manpage sections, but you can specify content for each one, as well as create your own.

Including Additional Text in the Output

Additional static text may be included in the generated manual page by using the ‘--include’ and ‘--opt-include’ options (see Invoking help2man). While these files can be named anything, for consistency we suggest to use the extension .h2m for help2man include files.

The format for files included with these option is simple:
 [section]
 text

 /pattern/
 text
Blocks of verbatim *roff text are inserted into the output either at the start of the given ‘[section]’ (case insensitive), or after a paragraph matching ‘/pattern/’.

Patterns use the Perl regular expression syntax and may be followed by the ‘i’, ‘s’ or ‘m’ modifiers (see perlre(1))

Lines before the first section or pattern which begin with ‘-’ are processed as options. Anything else is silently ignored and may be used for comments, RCS keywords and the like.

The section output order (for those included) is:
 NAME
 SYNOPSIS
 DESCRIPTION
 OPTIONS
 ENVIRONMENT
 FILES
 EXAMPLES
 other
 AUTHOR
 REPORTING BUGS
 COPYRIGHT
 SEE ALSO
Any ‘[name]’ or ‘[synopsis]’ sections appearing in the include file will replace what would have automatically been produced (although you can still override the former with ‘--name’ if required).

Other sections are prepended to the automatically produced output for the standard sections given above, or included at other (above) in the order they were encountered in the include file.

I could see such a thing also allowing for easy generation of either a single manpage or one for each subcommand. One top-level "make a manpage" attribute/call? Generate one manpage. Hang one off a subcommand's definition, that subcommand gets broken out of the main manpage into its own manpage.
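To show how small the include mechanism quoted above can be, here is a minimal sketch of a parser for the `[section]` part of that format, followed by a sort into the section order listed. `parse_include` and `section_rank` are hypothetical names, and the sketch assumes the file contains only `[section]` blocks: `/pattern/` blocks and leading option lines are not handled.

```rust
/// Position of a section in the standard output order; unknown
/// sections land in the "other" slot between EXAMPLES and AUTHOR.
fn section_rank(name: &str) -> usize {
    const BEFORE: [&str; 7] = [
        "NAME", "SYNOPSIS", "DESCRIPTION", "OPTIONS", "ENVIRONMENT", "FILES", "EXAMPLES",
    ];
    const AFTER: [&str; 4] = ["AUTHOR", "REPORTING BUGS", "COPYRIGHT", "SEE ALSO"];
    if let Some(i) = BEFORE.iter().position(|s| *s == name) {
        i
    } else if let Some(i) = AFTER.iter().position(|s| *s == name) {
        BEFORE.len() + 1 + i
    } else {
        BEFORE.len()
    }
}

/// Parse `[section]` blocks and return (SECTION NAME, text) pairs
/// sorted into the standard order.
fn parse_include(input: &str) -> Vec<(String, String)> {
    let mut sections: Vec<(String, String)> = Vec::new();
    for line in input.lines() {
        let trimmed = line.trim();
        if trimmed.len() >= 2 && trimmed.starts_with('[') && trimmed.ends_with(']') {
            // Section names are case insensitive, so normalize to upper case.
            let name = trimmed[1..trimmed.len() - 1].trim().to_uppercase();
            sections.push((name, String::new()));
        } else if let Some((_, text)) = sections.last_mut() {
            text.push_str(line);
            text.push('\n');
        }
        // Lines before the first section header are skipped.
    }
    // The sort is stable, so "other" sections keep their file order.
    sections.sort_by_key(|(name, _)| section_rank(name));
    sections
}
```

A generator could merge these pairs with the sections it derives from the argument definitions, which is roughly the "one line plus an optional include" workflow described above.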

CreepySkeleton commented 4 years ago

Moderator note: Please keep in mind that this discussion is about manpage generation in clap, not file formats. I'm answering certain yaml criticism below because it's quite objective but misleading, but please, try to stay on topic. Please also keep in mind that traduki has its own bug tracker. If you think there's something to do with traduki, like supporting multiple formats, move your discussion there. This issue is unnavigable already.


> **Important:** The YAML spec didn't drop on us from on high; it was developed iteratively, hence [multiple versions](https://yaml.org/spec/) exist. For the purposes of this discussion, we are only interested in `1.1` and `1.2`. It is important to keep in mind that the `yaml-rust` crate supports only `1.2`, to the best of my knowledge.
>
> User @ssokolow points out that YAML has a number of pretty serious flaws that considerably affect user experience and may inflict harm in certain circumstances, but it turns out that the flaws are either `eval`-class bugs in certain libraries that have nothing to do with YAML itself, or exist solely in YAML `1.1`, and thus `yaml-rust` is unaffected. The said flaws:
>
> 1. *YAML treats all of `y`, `n`, `true`, `false`, `yes`, `no`, `off`, `on` (along with their capitalized and uppercase forms) as boolean values rather than strings. This is very confusing and clashes with `no` being also the country code for Norway.* [That is true](https://yaml.org/type/bool.html). For YAML 1.1. YAML 1.2 [restricts](https://yaml.org/spec/1.2/spec.html#tag/repository/bool) boolean to either `true` or `false`. `yaml-rust` is unaffected by this.
> 2. *Many YAML implementations have been known for being vulnerable to [code injection](https://en.wikipedia.org/wiki/Code_injection) attacks*. Yes, this is true, and [sometimes it led to awful consequences](http://tenderlovemaking.com/2013/02/06/yaml-f7u12.html). But then again, this happened because certain libs in many interpreted languages used `eval` as a shortcut for building runtime objects, and that allowed a malicious user to execute arbitrary code at runtime. `yaml-rust` does no such thing because Rust, being a compiled language, doesn't have an `eval("Rust_code")` function. `yaml-rust` is unaffected by this (unless you explicitly write code doing that).
> 3. *The YAML spec is very very huge and implementations seem unable to implement it properly. Meh, they seem unable to agree even [among themselves](https://github.com/cblp/yaml-sucks)!* Yes, and that is very true. It looks like there's only one YAML parser for Rust, [`yaml-rust`](https://github.com/chyh1990/yaml-rust), and it's not fully spec-compatible.
>
> I'm not touching such things as "42 is a number and not string by default" and "some specific implementation had a bug that didn't parse negative numbers properly" here because I'm really not sure whether it's trolling or something else.

If you need to discuss it further, create a separate issue.

CreepySkeleton commented 4 years ago

It looks like we're throwing out the same arguments again and again. For the sake of moving forward, let's try to come to some sort of consensus on the following topics:

Should/can man pages be the same as --help or they should be more descriptive?
Should manpage generator use the same input as --help messages generator?
Should the data be compiled into the executable or be a separate asset?

kbknapp commented 4 years ago

Admin Note

The discussion here got a little heated and derailed, which is totally possible and common with online discussions.

Just a reminder for everyone to remain civil, even when differences of opinion exist. We can, and I expect us to debate technical issues, but those debates must remain calm and professional.

This issue will remain locked for the next 48 hours to allow everyone involved a chance to take a break and perhaps consider other angles. Personally, I think this is a great time to attempt to see something from another's point of view.

One final note: as we're discussing technical details and this thread is already very long, let's do our best to keep the discussion related to this issue, and within the realm of what is actionable by clap code. If there are other debates to be had, let's move them to chat, GH discussions, or dedicated issues here/issue trackers on other projects.

I appreciate the passion everyone is displaying for this feature, I want it to land and I want a design that pushes this library forward. I think we all want that :smile:

dkg commented 3 years ago

@CreepySkeleton wrote:

* Should/can man pages be the same as `--help` or they should be more descriptive?

I think that --help should typically produce a brief reference, and man pages should provide a superset of that information. they can be more descriptive, and include more details/nuance/gotchas/etc that would be inappropriate to view in the console running --help. But the point here is superset. If something shows up in --help it definitely should be visible (and easy to find) in the manpage. And of course it should be in sync.

* Should manpage generator use the same input as `--help` messages generator?

Yes, i think it should (because we want them to stay in sync). Whatever is generating these things should be pulling from the same place if possible.

* Should the data be compiled into the executable or be a separate asset?

I think that the details/nuance/etc for the manpage doesn't need to be compiled into the executable, but if it happens by accident (or because it's more convenient for the text to be stored/manipulated/translated that way) then it's not much of a problem (space is the only concern i see here, and rust binaries are large enough anyway that i can't imagine a few paragraphs of text being significant).

Thanks for thinking about this and working on it! I'd really like to see convenient autogeneration of manpages for clap binaries (e.g. in sequoia)

alerque commented 3 years ago

Pretty much everything that @dkg said, except that man page content should be generated and stored in a separate resource file. Many distros require that this be packaged, and build and package routines that extract it by running the binary to reconstruct the source would be more complicated than need be; packaging a man source file is much easier and a very standard operation.

anthraxx commented 3 years ago

> Pretty much everything that @dkg said, except that man page content should be generated and stored in a separate resource file. Many distros require that this be packaged, and build and package routines that extract it by running the binary to reconstruct the source would be more complicated than need be; packaging a man source file is much easier and a very standard operation.

Being a distro packager myself, I strongly disagree. All that matters as a distro packager is that it's reasonably easy to invoke the transformation chain. All that matters as a developer is that it's reasonably easy to read/write and convenient to keep it in sync with the rest of the definitions -- the CLI --help declaration. All that matters as a user is that you can easily obtain manpages and completions no matter if you pulled a software out of a distro repository or via cargo install.

Clap already does exactly that for completions, and frankly it is no disadvantage for packagers to generate them out of a binary before putting them somewhere else. Downstream projects could make this even easier by providing an appropriate Makefile, interacting with which won't be any different from what an average packager already uses elsewhere.

Dzordzu commented 2 years ago

Is there any way to generate json schema from the command?

epage commented 2 years ago

At this time, we do not. The issue for that is https://github.com/clap-rs/clap/issues/918.

The clap-serde project could be a good place for this. See https://github.com/aobatact/clap-serde/issues/11.

tgross35 commented 1 year ago

For anyone like me who stumbles across this long issue but can't find the correct answer: the official crate is clap_mangen (it's not mentioned anywhere on this thread)

clap-rs / clap

Auto-generate manpage, help docs, etc. #552
