golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.42k stars 17.71k forks source link

x/tools/gopls: rename: modify [pkg.Symbol] doc links in non-doc comments too #57559

Open thomaspeugeot opened 1 year ago

thomaspeugeot commented 1 year ago

Doc Links has been added to the go in 1.19 and it is accessible in the AST. Enabling refactoring of DocLinks will improve the maintainability of documentation.

What version of Go are you using (go version)?

$ go version
go version go1.19.1 windows/amd64
$  gopls version
golang.org/x/tools/gopls v0.11.0
$ vscode-go v0.37

What did you do?

type Foo struct {
}
// struct [Foo] is a nice struct ...

Refactor "Foo" to "Bar"

What did you expect to see?

type Bar struct {
}
// struct [Bar] is a nice struct ...

What did you see instead?

type Bar struct {
}
// struct [Foo] is a nice struct ...
findleyr commented 1 year ago

Thanks, this is a good point. We should extend our identifier detection to include the new doc link format.

thomaspeugeot commented 1 year ago

Because DocLink renaming is not there yet (the present issue), this comment presents another motivation and a partial workaround for a use case that arises when DocLink is used for data persistence, when go code is used to store data.

The repo code for this use case, including the workaround, is https://github.com/fullstack-lang/issuerenaming.

Data persisted as go code

It seems counter intuitive to store data as go code but there is a use case where the go code format is more economical than alternatives (such as a database or a json file). This use case can occur during the development of an application.

In an example for this use case, we have three artifacts:

Screenshot 2023-02-11 at 11 30 23

Now, our client says "Foo is not the perfect name, Bar is a better". We return to the keyboard and refactor the application in three steps. First, we rename Foo to Bar in our go code thanks to gopls. Second, we rename Foo to Bar in our json file. Third, we rename Foo to Bar in our UML diagram. The whole operation may have taken 5 minutes to 15 minutes, of which maybe 90 % are for the second and third steps. Those 90% are a distraction of the coding process and we want to slash them.

Now, let's say that instead of persisting data in json, we store our data as go code and develop a generic AST parser to read the data. The data file in go contains declarations lines like:

    __Foo__000000_A := (&models.Foo{Name: `A`})

This new persistence approach makes the refactoring simpler because, in the first renaming step from Foo to Bar, gopls will also edit this go file into:

    __Foo__000000_A := (&models.Bar{Name: `A`})

We save the json refactoring step. We save (boring) minutes.

UML diagram persisted as go code

Bypassing the json is economical but now, we would like to also bypass the UML refactoring step. We would like to have the UML diagram data refactored by gopls. For enabling renaming, we redo what we did with the test data : namely we encode the UML diagram in go code.

In the go code describing the UML diagram, we need to say that our class shape that represents the Foo struct actually refers to the models.Foo struct in the go code. Problem, if we encode the reference to Foo in a string "models.Foo", in something like

   shape.ref = "models.Foo"

our renaming step is not automatic. "models.Foo" wont be renamed into "models.Bar" when gopls processes the renaming request because shape.ref is a string, it cannot be renamed by gopls.

Enters go 1.19 and DocLink

With the DocLink syntax such as [Name1, [Name1.Name2] or [pkg.Name1], go 1.19 implicitly introduces a new kind of type, a type for identifiers. The good piece of news is that this type can be assigned to a string variable.

Technically, this can be done with a comment directive followed by a DocLink item. The go 1.19 go/ast/comment.Parser package provides the concept and the dedicated parsing functions for getting DocLink in comments. The following assignment of the models.Foo identifier to a string can be done by parsing something like

  //gong:ident [models.Foo]
  shape.ref = ""

We can ask the generic AST parser to assign the DocLink text to the string value whenever it meets this kind of comment followed by a string assignment (note that //gong:ident works like a //go:embed directive, more on that later).

After the fix of the present issue, gopls will perform the renaming request of models.Foo to models.Bar and it will change the code to

  //gong:ident [models.Bar]
  shape.ref = ""

And the AST parser will assign the correct new value to the shape.

Conclusion, the refactoring of the three artifacts takes less that a minute, a nice productivity gain.

A partial workaround to the lack of DocLink renaming

We have seen the present issue impacts a use case. To overcome this limitation, we can develop a workaround. Technically, when marshalling the UML diagram, the workaround automatically generates a map of reference to identifiers to identifiers. The identifiers are instances of type any

var map_DocLink_Identifier map[string]any = map[string]any{
    "models.Foo": &(models.Foo{}).,

When gopls will perform the renaming request, it will change the above code to

var map_DocLink_Identifier map[string]any = map[string]any{
    "models.Foo": &(models.Bar{}).,

The Ast Parser has to parse this map and later understands that each time it meets a //gong:ident [models.Foo] directive, this means //gong:ident [models.Bar]` instead.

The workaround is hair splitting because some identifiers are more tricky to express as an any value. For instance, a string type type Waldo string may need to have an UML shape to represent it as an enum. One way to encode it in a any value is to express via a cast of an empty string models.Waldo("").

Here is the resulting UML diagram after renaming

Screenshot 2023-02-11 at 11 33 47

The workaround does not apply to comment like "note"

The workaround is partial because in some cases, the reference to the identifier (the DocLink) is not in the UML code but in the package itself. For instance:

// GONGNOTE(Note): [models.Foo] is
// related to [models.Waldo] throughs the field
// [models.Foo.Waldos]
const Note = ""

This note, a syntax authorized by go, can be translated to a UML note with UML note links but the workaround cannot apply to this because "[Foo]" is in the source package.

Is there a need for a //go:ident [<docLink>] directive ?

We have seen a very narrow use case: maintenance of a go code that is used for storing data about identifiers. However, this use case describes a need for assigning the text of a read only object to a string, something the embed directives already performs

//go:embed hello.txt
var s string

Using the same reasoning behind the field-proven //go:embed directive, we could have a new //go:ident directive, .

//go:ident [pkg.Foo]
var s string

When utilizing this directive, maintaining this code would significantly be simpler than maintaining the code below (if DocLink renaming is implemented).

var s string = "pkg.Foo"

It would be intriguing to explore if there are other potential applications of DocLink (with DocLink renaming) that are worth considering. DocLink is still a relatively new feature at only 6 months old.

adonovan commented 1 year ago

Summarizing my understanding of the previous long note: you want to use comments to represent tool-readable relationships between declarations, but renaming doesn't (in general) apply to the contents of comments, so the relationships become stale.

The problem is essentially the same as that of writing annotations for static checkers: you can do it with comments, but they quickly become stale. It's better to use Go syntax so that symbol references are first class. For example, when using a hypothetical lock-checking static analysis tool, you could express the annotation "mutex mu guards field f" as a comment:

type T struct {
    mu sync.Mutex // checklock:guards(mu, f)
    f int
}

but it would be more robust to use a Go declaration, for example:

import "checklock"

type T struct {
    mu sync.Mutex
    f int
}

func _(t *T) {
    checklock.Guards(t.mu, t.f) // (function call has no effect)
}

See https://go.dev/cl/489835 for a rough sketch of this approach.

thomaspeugeot commented 1 year ago

Summarizing my understanding of the previous long note: you want to use comments to represent tool-readable relationships between declarations, but renaming doesn't (in general) apply to the contents of comments, so the relationships become stale.

Your understanding is correct.

The problem is essentially the same as that of writing annotations for static checkers: you can do it with comments, but they quickly become stale. It's better to use Go syntax so that symbol references are first class. For example, when using a hypothetical lock-checking static analysis tool, you could express the annotation "mutex mu guards field f" as a comment:

...

See https://go.dev/cl/489835 for a rough sketch of this approach.

I agree the issue is for associations to declarations that become stale after refactoring. However, I have looked at https://go.dev/cl/489835 and I must admit that I am not sufficiently a go expert to understand it. For instance, I do not understand if the solution is for associations within comments or within a go declaration.

If I may formulate the requirements for the solution :

adonovan commented 1 year ago
  • the association to the declaration is within a comment
  • the association to the declaration is not staled when the declaration is renamed

The main idea in the CL I linked to is: it should be a requirement that annotations not be in comments precisely because this makes them stale, and hard to discover as ordinary references.

thomaspeugeot commented 1 year ago
  • the association to the declaration is within a comment
  • the association to the declaration is not staled when the declaration is renamed

The main idea in the CL I linked to is: it should be a requirement that annotations not be in comments precisely because this makes them stale, and hard to discover as ordinary references.

Ok, I understands the CL better now. Unfortunatly, it won't solve the issue of stale DocLink because DocLink are in comments. I did not surmise it was hard to discover them.

I am OK if something else than DocLink is available in comments for referencing declarations.

adonovan commented 6 months ago

This appears to a be a dup of https://github.com/golang/go/issues/64495, which is fixed, and I verified that renaming a symbol renames (at least some) doc links that refer to it.

thomaspeugeot commented 6 months ago

Hi @adonovan , I respectfully disagree, unless mistaken, #64495 (or gops v0.15.3) fixes doc links for functions but does not fixes doc links for structs.

issue_rename_57559 % gopls version
golang.org/x/tools/gopls v0.15.3
package main

type Foo struct {
}

// Rename does not work for Foo
// struct [Foo] is a nice struct ...

func FuncA() {
}

// Rename works for FuncA
// FuncB is same as [FuncA].
func FuncB() {
}

Above code is available at https://github.com/thomaspeugeot/issue_rename_57559

adonovan commented 6 months ago

I respectfully disagree, unless mistaken, https://github.com/golang/go/issues/64495 (or gops v0.15.3) fixes doc links for functions but does not fixes doc links for structs.

I just tested gopls@master on this program:

// The Foo type is used with [Bar].
type Foo struct {}

// func Bar uses [Foo].
func Bar(Foo) {}

and was able to rename both Foo and Bar, and observe both the ordinary and the doc-link (bracketed) references change consistently.

thomaspeugeot commented 6 months ago

Indeed, it works in the above case, but it does not work when [Foo] is in a standalone comment, as in the last line of the code below

// The Foo type is used with [Bar].
type Foo struct{}

// func Bar uses [Foo].
func Bar(Foo) {}

// some comment about [Foo]

https://github.com/golang/go/assets/10234087/b39abddb-032b-4f28-ac9e-11cfe8dd38fc

adonovan commented 6 months ago

A free-standing comment is not a doc comment: it never appears in go doc, or in gopls' Hover output, and it isn't subject to reformatting when you save the file. So I think this is working as intended.

thomaspeugeot commented 6 months ago

Ouh la la...., I had never realized that there was two classes of comments, the doc comment and the free standing comments.

Sorry to bother but why not subject doc-like link in free-standing comments to gopls' renaming ?

Those comments are not exported by go doc but they deserve as much care. When they contain references to identifiers and one is renamed, they loose their consistency. IMHO, adding free-standing comments to the renaming net would improve code maintainability.

adonovan commented 6 months ago

Sorry to bother but why not subject doc-like link in free-standing comments to gopls' renaming ?

Because they are not doc comments, as defined by the first line of https://golang.org/doc/comment, so they are not subjected to the same rules by tools such as gofmt, go/doc/comment, and gopls.

mvdan commented 6 months ago

@adonovan it's worth noting that doc links in freestanding comments do work when using go-to-definition, and I rely on that on a weekly basis for e.g. freestanding notes that document quirks about the code. I hope that go-to-definition working in those cases is not by accident, and I would have assumed that renaming would also handle them for the sake of consistency, given that gopls does seem to understand they are doc links in some way.

thomaspeugeot commented 6 months ago

@adonovan , BTW, renaming of field does not seem to work.

https://github.com/golang/go/assets/10234087/a2bb8030-ed38-4cbd-9998-e12b7650e3bf

adonovan commented 6 months ago

renaming of field does not seem to work.'

Again, working as intended: doc links never refer to fields, only to packages, package-level declarations, and methods.

adonovan commented 6 months ago

After discussion, reopening as a narrower request to rename doc links in non-doc comments too (by analogy with Hover and Documentation, which support doc links in non-doc comments).