rsc / rf

A refactoring tool for Go
BSD 3-Clause "New" or "Revised" License
668 stars 21 forks source link

rf: type checking fails when it should not #9

Open rogpeppe opened 3 years ago

rogpeppe commented 3 years ago

When I try to run rf on our code repo, I get lots of errors like this:

../../objectstore/common/fetch_exported_test.go:22:71: cannot use hasher (variable of type *common.Hasher) as *common.Hasher value in argument to inmem.NewObjectStoreBucket

The package paths are identical and the types should be identical but the types.Named.obj fields differ, which I guess means that the types have come from different type checking sessions.

For the record, the actual command I was running was this:

rf 'ex {import "github.com/influxdata/idpe/models"; []models.Point -> models.Points}'
rogpeppe commented 3 years ago

It looks like it might have something to do with test files. When I remove all the _test.go files, it doesn't give an error.

An easy (but unfortunately large and slow) way to reproduce this issue:

git clone git@github.com:influxdata/influxdb
cd models
rf 'ex {import "github.com/influxdata/influxdb/v2/models"; []models.Point -> models.Points}'
rogpeppe commented 3 years ago

FWIW I just confirmed that the packages don't match. I just added this code under the *Named branch of Checker.identical0:

if x.obj.name == y.obj.name && x.obj.pkg.path == y.obj.pkg.path {
    panic(fmt.Errorf("Named mismatch on %s.%s; pkg match %v", x.obj.name, x.obj.pkg.path, x.obj.pkg == y.obj.pkg))
}

and I got this message:

Named mismatch on Hasher.github.com/influxdata/idpe/objectstore/common; pkg match false

My suspicion is that types are being cached by rf when they should not be.

rogpeppe commented 3 years ago

One other important piece of info: the final line printed is:

rf: errors found after script

implying that type checking did succeed initially, but failed after making the modifications.

rogpeppe commented 3 years ago

Having said that, when I tried the ex on multiple packages, I got a similar error from rf before it executed the script, so it's not just to do with the rewritten code.

rsc commented 3 years ago

I need to get some time to rethink the entire structure of this command. It's a good v0 but not a great v1. That's why I haven't really been promoting it much.

That said, the problem is as follows. We load a lot of different packages, and various equality tests assume that there's only one of each package and therefore only one of each package's type. But that's not always true in the presence of in-package *_test.go files. My original plan was to load the in-package *_test.go files as part of the package itself unconditionally, but that failed because there exist packages A and B where neither imports the other but A's test files import B and B's test files import A. There is no import cycle testing A or B in isolation, but if you are trying to do a "load the entire world" and you include test files with the package, then you have an import cycle.

I don't remember the current state of the rf loader. I think that I ended up handling A's test as a separate package but didn't trigger the necessary loads of "dependency D as built during A's test", because that created even more duplicate packages, and I wasn't handling duplicate packages well at all.

But the result is that this doesn't work: if you have a package A with both internal and external tests and the external tests import D which imports A, then rf does not at the moment create a separate "copy" of D for import by A's external tests, in which D's import of A gets the copy with A's internal tests added. But A's external tests' import of A does get that copy. That leads to the error you see if A's external tests need to pass a value of type A.something to D.

Maybe there's an easy fix I've overlooked. In the few times I've run into this, I do what you did - // +build ignore the test files and then update those by hand.

A good solution to this would hopefully also handle packages with build-specific files like x_386.go and x_amd64.go. Today you get one configuration and the others need manual updating.

mdempsky commented 3 years ago

FWIW, my solution in mdempsky/unconvert for handling multiple build configurations was to analyze packages under as many variations as the user requested, and then only remove conversions when they were determined unnecessary in all of the variations that actually involved the conversion. E.g., if the user requests analysis of variations A, B, and C, and file x.go's name and/or build tags only include it in builds for A and B, then I'd remove from x.go only the intersection of A and B's unnecessary conversions.

At least in principle, I think the same idea would work within rf. For each atomic rewrite operation, keep track of all of its edits across files, and then only keep it if it's consistent in all analysis cases.

Of course, the downside is this requires doing N separate analyses, and rf can already take a while to run. A 2x slowdown to handle tests might be reasonable, but it probably gets expensive to attempt to handle GOOS/GOARCH that way too.

rogpeppe commented 3 years ago

FWIW in my case I only care about one os/arch combination - I'd just like to be able to run rf at all on the code base I'm working on.

The issues that Russ is talking about were a large part of why I raised https://github.com/golang/go/issues/29258, but I don't have huge hopes of that going anywhere. Wouldn't it be nice though if we could have one view of the world per build tag combination?

myitcv commented 3 years ago

FWIW I'm seeing this error when using mv; i.e. the issue is not specific to ex.