proposal: x/tools/go/analysis/analysisutil: publish ExtractDoc

hyangah commented 1 year ago

ExtractDoc is a convention originally developed to fix a go documentation issue and adopted in all analyzers in x/tools/go/analysis/passes. This allows analyzers to provide consistent documentation in pkg.go.dev, go doc, and *checkers.

I think this is a useful convention that can be adopted by any x/tools/go/analysis analyzers, and propose to move it to public so other analyzer authors can adopt and use this convention.

golang.org/x/tools/go/analysis/analysisutil is an option.

cc @adonovan

cherrymui commented 1 year ago

I think new public API on x/tools still needs a proposal. So maybe you want to file a proposal. Thanks.

cc @timothy-king @matloob @zpavlinovic

adonovan commented 1 year ago

We propose to create a new package, analysisutil, that initially holds the existing internal ExtractDoc and MustExtractDoc functions. Other functions may follow in due course.

So maybe you want to file a proposal. Thanks.

Yep. Fixed that for you, Hana. ;-)

adonovan commented 1 year ago

Perhaps this function belongs in go/analysis itself, alongside the Analyzer interface. We're careful about adding to that package to avoid unnecessary dependencies, but in this case it doesn't add any.

(The widely used analysisutil.Format and Imports helpers also don't add dependencies, and I think they too should be published, but perhaps a separate analysisutil package still makes sense for these sorts of things which are useful only to programs on one side the driver/checker interface.)

rsc commented 1 year ago

Many of the tools in x/tools seem to embed their doc comments and print them as help text. I am not convinced this is the way we should write programs. I've always been frustrated by sessions like:

% go doc eg
The eg command performs example-based refactoring. For documentation, run the
command, or see Help in golang.org/x/tools/refactor/eg.
% eg
eg: an example-based refactoring tool.

Usage: eg -t template.go [-w] [-transitive] <packages>

-help            show detailed help message
-t template.go   specifies the template file (use -help to see explanation)
-w               causes files to be re-written in place.
-transitive      causes all dependencies to be refactored too.
-v               show verbose matcher diagnostics
-beforeedit cmd  a command to exec before each file is modified.
                 "{}" represents the name of the file.
% eg -help

This tool implements example-based refactoring of expressions.

The transformation is specified as a Go file defining two functions,
'before' and 'after', of identical types.  Each function body consists
of a single statement: either a return statement with a single
(possibly multi-valued) expression, or an expression statement.  The
'before' expression specifies a pattern and the 'after' expression its
replacement.

    package P
    import ( "errors"; "fmt" )
    func before(s string) error { return fmt.Errorf("%s", s) }
    func after(s string)  error { return errors.New(s) }

The expression statement form is useful when the expression has no
result, for example:

    func before(msg string) { log.Fatalf("%s", msg) }
    func after(msg string)  { log.Fatal(msg) }

The parameters of both functions are wildcards that may match any
expression assignable to that type.  If the pattern contains multiple
occurrences of the same parameter, each must match the same expression
in the input for the pattern to match.  If the replacement contains
multiple occurrences of the same parameter, the expression will be
duplicated, possibly changing the side-effects.

The tool analyses all Go code in the packages specified by the
arguments, replacing all occurrences of the pattern with the
substitution.

So, the transform above would change this input:
    err := fmt.Errorf("%s", "error: " + msg)
to this output:
    err := errors.New("error: " + msg)

Identifiers, including qualified identifiers (p.X) are considered to
match only if they denote the same object.  This allows correct
matching even in the presence of dot imports, named imports and
locally shadowed package names in the input program.

Matching of type syntax is semantic, not syntactic: type syntax in the
pattern matches type syntax in the input if the types are identical.
Thus, func(x int) matches func(y int).

This tool was inspired by other example-based refactoring tools,
'gofmt -r' for Go and Refaster for Java.

LIMITATIONS
===========

EXPRESSIVENESS

Only refactorings that replace one expression with another, regardless
of the expression's context, may be expressed.  Refactoring arbitrary
statements (or sequences of statements) is a less well-defined problem
and is less amenable to this approach.

A pattern that contains a function literal (and hence statements)
never matches.

There is no way to generalize over related types, e.g. to express that
a wildcard may have any integer type, for example.

It is not possible to replace an expression by one of a different
type, even in contexts where this is legal, such as x in fmt.Print(x).

The struct literals T{x} and T{K: x} cannot both be matched by a single
template.

SAFETY

Verifying that a transformation does not introduce type errors is very
complex in the general case.  An innocuous-looking replacement of one
constant by another (e.g. 1 to 2) may cause type errors relating to
array types and indices, for example.  The tool performs only very
superficial checks of type preservation.

IMPORTS

Although the matching algorithm is fully aware of scoping rules, the
replacement algorithm is not, so the replacement code may contain
incorrect identifier syntax for imported objects if there are dot
imports, named imports or locally shadowed package names in the input
program.

Imports are added as needed, but they are not removed as needed.
Run 'goimports' on the modified file for now.

Dot imports are forbidden in the template.

TIPS
====

Sometimes a little creativity is required to implement the desired
migration.  This section lists a few tips and tricks.

To remove the final parameter from a function, temporarily change the
function signature so that the final parameter is variadic, as this
allows legal calls both with and without the argument.  Then use eg to
remove the final argument from all callers, and remove the variadic
parameter by hand.  The reverse process can be used to add a final
parameter.

To add or remove parameters other than the final one, you must do it in
stages: (1) declare a variant function f' with a different name and the
desired parameters; (2) use eg to transform calls to f into calls to f',
changing the arguments as needed; (3) change the declaration of f to
match f'; (4) use eg to rename f' to f in all calls; (5) delete f'.
%

This seems backward: the doc comment, which supports actual formatting (even moreso after #51082), contains no content at all. Instead, the only way to read documentation is in a terminal window as plain text.

I had not seen ExtractDoc until a couple weeks ago when I was editing an existing analyzer. That's definitely an improvement, in that package docs have become meaningful, but I remain skeptical about encouraging the pattern where tools embed their entire documentation and print it on -help. That is what 'go doc' is meant to do.

My inclination would be to go back to short documentation in the Analyzer.Doc field, with elaborate detail only in the doc comment. At that point ExtractDoc is no longer necessary.

That is, I think ExtractDoc is a symptom of a different problem/anti-pattern, and we should think about fixing that instead.

I will also note that when I was writing a trivial Analyzer the other day, I was disappointed that the analyzer refused to run just because Doc was an empty string. If we want to insist on documentation for our own analyzers, that seems like a reasonable thing to do in a test in go/analysis/passes. It does not seem reasonable to refuse to make the analyzer itself break. The go command does not refuse to build packages or run main programs when they are missing doc comments.

hyangah commented 1 year ago

@rsc I think it's a good idea to have more detailed description in more expressive place - e.g. algorithm, background, intro - and have concise, usage oriented short information distributed with the binaries so they are easily accessible.

Analyzer.Doc allows integrators to access the documentation conveniently. For example gopls generates its own documentation using info in the Analyzer.Doc field. VS Code Go picks up analyzers' documentation from gopls and presents the doc in VS Code UI and its own documentation. These short snippets goes a long way - beyond go's source package doc or the checker binary's help message. I think maintaining the amount of text/info in the Analyzer.Doc within reasonable size will help these various documentation integration points.

As long as the info is consistent.

Most analyzers in x/tools repo don't have very long description. Asking users to visit pkgsite to read a few extra sentences about an analyzer seems silly. On the other hand, keeping two copies of texts for package doc and Analyzer.Doc isn't ideal. ExtractDoc offers utility to maintain the consistency between the package doc and Analyzer.Doc without duplicating some contents manually. Users are still free to write more info in the package doc (outside what's extracted by ExtractDoc), or to drop a url to a fancy website in the Analyzer.Doc field.

I am open to other ideas to help maintaining consistency.

I was disappointed that the analyzer refused to run just because Doc was an empty string.

I agree that's not great. I think that needs to be changed independent of this issue.

adonovan commented 1 year ago

This seems backward: the doc comment, which supports actual formatting (even more so after https://github.com/golang/go/issues/51082), contains no content at all. Instead, the only way to read documentation is in a terminal window as plain text.

I've always found it useful that tools like go list provided comprehensive documentation in the terminal in which I'm working, via go help list. Perhaps your point is that the output of go list -h is actually very terse and merely directs the user to longer documentation, in this case go help, which is almost a separate tool analogous to go doc or a browser displaying pkg.go.dev/foo. We could make tools print only a usage and a URL, but I would still have to leave the terminal to read the HTML. Unfortunately, unless I have the module in my workspace, go doc golang.org/x/tools/cmd/callgraph doesn't show the document in my terminal, which is what I actually want (half the time, anyway).

I will add that in all the years I've been writing UNIX CLI tools I can still never remember what the normal expectations are for how much documentation is printed with no args, -h, -help, and so on, and I always feel like I'm just winging it. Perhaps if you or others have a particular set of expectations it would be good for us to write then down, and link to them from a prominent place such as the doc comment for flag.Parse. I'd be happy to make the x/tools consistent.

I will also note that when I was writing a trivial Analyzer the other day, I was disappointed that the analyzer refused to run just because Doc was an empty string.

It may seem overzealous, but the validate function has always checked for documentation, since otherwise the UI of (e.g.) vet degrades by showing a gap in its list of analyzers. I recently made the test infrastructure call validate where it didn't used to before, because it was lacking some more crucial precondition checks; surely tests are the best place to catch that. In your case you were writing a test of infrastructure with some throwaway analyzers, but this is unusual: most users are writing tests of analyzers that will be used in production, so I think the Doc check is appropriate, even though it was inconvenient for you (and me in similar tests of throwaway analyzers).

rsc commented 1 year ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

adonovan commented 1 year ago

After discussion with @rsc, I move that we retract this proposal in favor of https://github.com/golang/go/issues/63659.

hyangah commented 1 year ago

I don't understand how #63659 replaces this proposal. IIUIC #63659 talks about the command line tool, but this proposal is focusing on documentation of a library (in particular, an analyzer), that can be linked to and part of any command line tools, application, or services. How much of information and what medium the downstream application chooses to present is up to them. This proposal is about formalizing the style of analyzer documentation and enforcing it by providing a tool any analyzer integrator can easily reach out to.

adonovan commented 11 months ago

@hyangah You're right that the other proposal about CLI tools doesn't directly address this question about library packages, but I think Russ is arguing that the same approach should work for both. A CLI tool needs only a short summary plus a link to the package documentation. An Analyzer needs a one-line description for the 'go vet' menu of analyzers, and also a short description suitable for a diagnostic popup in an LSP-enabled editor, but the full details can be relegated to the package documentation to which the Analyzer links. In both cases, the user of the CLI or GUI tool never sees the full documentation while using the program itself; for that, they need to use pkg.go.dev or go doc.

If there's a consensus around that, then we wouldn't need ExtractDoc, we would just need better guidance on what belongs in doc.go, and how to write a good one-paragraph Analyzer.Doc string.

adonovan commented 11 months ago

In discussion this morning @hyangah told me she was ok with that approach (Analyzer.Doc is 1 line + 1 short paragraph; Analyzer.URL links to the comprehensive docs), but lamented that (a) it still entails some amount of redundancy between the paragraph and the package doc; (b) it's not possible to reliably turn Analyzer.URL into a go doc package@version command, when you want to stay in the shell; and (c) as a commuter through the wifi-free rail tunnels of New Jersey, relegating complete docs to a website is an unfortunate inconvenience.

gopherbot commented 11 months ago

Change https://go.dev/cl/547877 mentions this issue: gopls/internal/analysis: improve analyzer docs

adonovan commented 10 months ago

@hyangah, if you agree with my previous note, do you want to retract this proposal by closing the issue?

hyangah commented 9 months ago

Retracting.

rsc commented 9 months ago

This proposal has been declined as retracted. — rsc for the proposal review group

golang / go

proposal: x/tools/go/analysis/analysisutil: publish ExtractDoc #61315