sdboyer / gps

your dependencies have arrived
MIT License
270 stars 24 forks source link

Read the destination of named symbolic link #157

Closed ReSTARTR closed 7 years ago

ReSTARTR commented 7 years ago

FileInfo.IsDir() returns false if it is symlink. But package is sometime defined under symlink. So should follow the symlink.

codecov-io commented 7 years ago

Codecov Report

Merging #157 into master will decrease coverage by -0.05%. The diff coverage is 46.66%.

@@            Coverage Diff             @@
##           master     #157      +/-   ##
==========================================
- Coverage   78.88%   78.84%   -0.05%     
==========================================
  Files          24       23       -1     
  Lines        3709     3668      -41     
==========================================
- Hits         2926     2892      -34     
+ Misses        582      573       -9     
- Partials      201      203       +2
Impacted Files Coverage Ξ”
analysis.go 83.48% <46.66%> (-1.82%) :x:
cmd.go
trace.go 73.19% <0%> (+0.97%) :white_check_mark:

Continue to review full report at Codecov.

Legend - Click here to learn more Ξ” = absolute <relative> (impact), ΓΈ = not affected, ? = missing data Powered by Codecov. Last update 8f8d2e5...425722b. Read the comment docs.

sdboyer commented 7 years ago

This is a tricky issue. Go's toolchain in general is touchy about symlinks; I want to avoid introducing new, complex behavior without a clear understanding of the implications. This change would permit symlinks in contexts that the go toolchain might (?) not.

So, it's not a firm "no," but rather a "I need more information about exactly how the go toolchain currently works with symlinks before I can decide."

ReSTARTR commented 7 years ago

Thank you for your comment.

"I need more information about exactly how the go toolchain currently works with symlinks before I can decide."

Sorry, I'm not sure that go's toolchain permit symlinks or not. But, for example, it seems that goimports might follow symlinks. (filepath.Walk that is used in gps calls os.Lstat instead of os.Stat.)

I used https://github.com/douban/libmc. The repository includes symlink that is named golibmc. (I think this case might be rare.)

When I execute dep init, the result is like this.

$ cat main.go
package main

import (
        "fmt"
        "github.com/douban/libmc/golibmc"
)

func main() {
        fmt.Println(golibmc.HashMD5)
}

$ dep init
ouchie, solve error: No versions of github.com/douban/libmc met constraints:
        master: Could not introduce github.com/douban/libmc@master, as its subpackage github.com/douban/libmc/golibmc is missing. (Package is required by (root).)
        v1.1.0: Could not introduce github.com/douban/libmc@v1.1.0, as its subpackage github.com/douban/libmc/golibmc is missing. (Package is required by (root).)
        v1.0.1: Could not introduce github.com/douban/libmc@v1.0.1, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v1.0.0: Could not introduce github.com/douban/libmc@v1.0.0, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v0.5.6: Could not introduce github.com/douban/libmc@v0.5.6, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v0.5.5: Could not introduce github.com/douban/libmc@v0.5.5, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v0.5.4: Could not introduce github.com/douban/libmc@v0.5.4, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v0.5.3: Could not introduce github.com/douban/libmc@v0.5.3, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v0.5.2: Could not introduce github.com/douban/libmc@v0.5.2, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v0.5.1: Could not introduce github.com/douban/libmc@v0.5.1, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v0.5.0: Could not introduce github.com/douban/libmc@v0.5.0, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v0.5.0: Could not introduce github.com/douban/libmc@v0.5.0, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        master: Could not introduce github.com/douban/libmc@master, as its subpackage github.com/douban/libmc/golibmc is missing. (Package is required by (root).)
        stable: Could not introduce github.com/douban/libmc@stable, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.No versions of github.com/douban/libmc met constraints:
        master: Could not introduce github.com/douban/libmc@master, as its subpackage github.com/douban/libmc/golibmc is missing. (Package is required by (root).)
        v1.1.0: Could not introduce github.com/douban/libmc@v1.1.0, as its subpackage github.com/douban/libmc/golibmc is missing. (Package is required by (root).)
        v1.0.1: Could not introduce github.com/douban/libmc@v1.0.1, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v1.0.0: Could not introduce github.com/douban/libmc@v1.0.0, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v0.5.6: Could not introduce github.com/douban/libmc@v0.5.6, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v0.5.5: Could not introduce github.com/douban/libmc@v0.5.5, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v0.5.4: Could not introduce github.com/douban/libmc@v0.5.4, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v0.5.3: Could not introduce github.com/douban/libmc@v0.5.3, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v0.5.2: Could not introduce github.com/douban/libmc@v0.5.2, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v0.5.1: Could not introduce github.com/douban/libmc@v0.5.1, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v0.5.0: Could not introduce github.com/douban/libmc@v0.5.0, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        v0.5.0: Could not introduce github.com/douban/libmc@v0.5.0, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.
        master: Could not introduce github.com/douban/libmc@master, as its subpackage github.com/douban/libmc/golibmc is missing. (Package is required by (root).)
        stable: Could not introduce github.com/douban/libmc@stable, as it is not allowed by constraint master from project github.com/ReSTARTR/go-dep.

So, if gps can follow the symlink, this problem will be fixed.

sdboyer commented 7 years ago

OK, a compromise - if we can verify that the symlink doesn't escape the root of the tree, then we can follow it. If not, it should be skipped.

Let's do this in two phases. Initially, in this PR, let's have it be:

  1. All absolute symlinks are disqualified; if one is encountered, it should be skipped.
  2. Relative symlinks pointing to somewhere outside of the root (via ..) should also be skipped.

In the longer term, though, we can try to accept some absolute symlinks, assuming we can demonstrate them to be children of fileRoot. The reason I don't want to tackle that now is because of the bizarre loops and escape routes I'm pretty sure are possible to construct with symlinks; Note, from the filepath.Abs docs:

The absolute path name for a given file is not guaranteed to be unique.

I'd rather not introduce the possibility of failures, errors, or possibly even security issues until I better understand the implications. (There's a reason Rob Pike hates symlinks.)

Does this seem sane?

sdboyer commented 7 years ago

Oh, also - if we add this, then it definitely needs full test coverage. There's a set of fixtures in _testdata/src - add some new dirs in there that represent each of the permutations (absolute symlink, acceptable relative symlink, unacceptable relative symlink). Then, add some corresponding test cases to TestListPackages in analysis_test.go. There's a ton there already - plenty of examples to pattern after.

Thanks!

ReSTARTR commented 7 years ago

Okay, I got it. I also think that we can permit only symlinks that doesn't escape from fileRoot.

I'll try to fix this changes and add some test cases πŸ‘

ReSTARTR commented 7 years ago

I have fixed some code, and added test cases. Please take another look and give me some advise.

I have an another issue:cry: os.FileInfo.Mode()&os.ModeSymlink != 0 will be false if run on windows. Because of this, test success on *nix system, but fail on windows. So I had no choice but to skip symlink tests only if on windows.

How about this idea?

ReSTARTR commented 7 years ago

I have changed some lines on your comments. How does it look?

sdboyer commented 7 years ago

Great, that's looking a lot better. Let's add two more test cases, then I'll merge:

  1. An absolute symlink. Doesn't matter where it points - os.Readlink() should still just return the target, even if the symlink is broken. Probably worth having a wholly separate fixture for that.
  2. Non-sibling symlinking cases. The fact that the old version of the code worked with the fi.Name() call makes me trust the tests less. So, within the existing _testdata/src/symlinks fixture you added, let's have a symlink nested within a subdirectory pointing to an immediate parent (e.g. foo/sym -> pkg) and then a top-level subdirectory pointing to a nested dir (e.g. sym -> bar/baz). Does that make sense?
sdboyer commented 7 years ago

That failure is erroneous; for some reason it's tied to CircleCI deciding to build under your account, rather than mine. I don't know why it does that, but it's not the first time I've seen it. I need to switch to Travis.

Just as I was about to hit the merge buttton, I realized there's another, much more annoying case: if the symlink points to a valid directory, that directory might also have children. It would be very odd to include the contents of the referenced directory, but not the directory itself.

Unfortunately, this is a trickier fix. I think our only option is just to abandon the use of filepath.Walk() entirely, as it always ignores symlinks, replacing it with our own function with similar logic. We'd then need to take the checks you've already created here and promote them into this new walking function, so that instead of deciding whether or not to dereference and analyze just that directory, the decision is whether to descend into the directory referenced by the symlink.

Does that make sense?

ReSTARTR commented 7 years ago

It would be very odd to include the contents of the referenced directory, but not the directory itself.

$ tree _testdata/src/symlinks
_testdata/src/symlinks
β”œβ”€β”€ broken -> nodest
β”œβ”€β”€ foo
β”‚Β Β  β”œβ”€β”€ bar -> ../../pkg
β”‚Β Β  └── foo.go
β”œβ”€β”€ foobar -> foo/bar
β”œβ”€β”€ gopkg -> pkg
β”œβ”€β”€ pkg
β”‚Β Β  β”œβ”€β”€ bar -> bar # <- Is it about this?
β”‚Β Β  └── gopkg.go
└── symlinks.go

It's my mistake ;-( I'll fix it.

We'd then need to take the checks you've already created here and promote them into this new walking function, so that instead of deciding whether or not to dereference and analyze just that directory, the decision is whether to descend into the directory referenced by the symlink.

I agree with it. So I'm going to change like this. What about do you think of this?

// walkDir wraps filepath.Walk. Analyze file type before calling walkFn.
func walkDir(fileRoot string, walkFn filepath.WalkFunc) error {
    return filepath.Walk(fileRoot, func(wp string, fi os.FileInfo, err error) error {

        // analyze the file type

        return walkFn(wp, fi, err)
    }
}
sdboyer commented 7 years ago

It's my mistake ;-( I'll fix it.

No problem - I missed it too, until I actually thought it through just before hitting the merge button!

β”‚   β”œβ”€β”€ bar -> bar # <- Is it about this?

Yes, though it would be a better example if bar were a plain directory rather than another symlink.

What about do you think of this?

I don't think that's going to be quite enough. We need a full replacement for filepath.Walk() - one that obeys our symlink rules. We can't reuse filepath.Walk() at all, because it would mean we miss nested symlinks.

ReSTARTR commented 7 years ago

We need a full replacement for filepath.Walk()

Mmm...it is a big change, isn't it? Well then, what about following two steps?

First step is like this. (ignore the rare case that filepath.Walk() can't treat as well)

package gps

func ListPackages(fileRoot, importRoot string) (PackageTree, error) {
    // ...
    err = filepath.Walk(fileRoot, func(wp string, fi os.FileInfo, err error) error {
        fi, err = stat(wp)
    }
    // ...
}

func stat(path string) (os.FileInfo, error) {
    // promote the lines I've created until now
}

And, we try to replace filepath.Walk() with mystat and new walking function in second step (as an another issue).

ReSTARTR commented 7 years ago

I tried to promote the lines that reads symlink into readSymlink function. So, please take another look?

I think that we should not replace filepath.Walk() at this time. Because it needs a big changes (As mentioned above.)

sdboyer commented 7 years ago

Ugh, sorry, I totally missed responding to your comment from three weeks ago. I read it, forgot to respond, but then it was out of my notifications so I didn't check back. But, the new changes you pushed look good.

I'm hesitant about merging this without addressing the bigger problem. gps is still in prerelease, but I'd be uncomfortable making a release with this problem only half-solved. And I have a bunch of other fixes I'm trying to fire out right now, then rapidly roll releases - having this only partially done would make that difficult.

I don't actually think the change is that bad, though. We just need to implement our own version of filepath.Walk(), which isn't a terribly complicated function. We can even hoist some of the logic from the current implementation that decides whether or not to visit a path up into the walker.

That said, I realize that is a larger change. If that's not something you're comfortable doing, that's no problem - we can merge as-is, and I'll make the necessary changes in a followup.

ReSTARTR commented 7 years ago

Thank you for checking!

I'm afraid that I don't fix that problem. But I thought that it might cause some conflicts with another changes ;-( (Though, I agree with the opinion that you mentioned)

So I'd be grad you to merge and make another changes. Could you please do it?

sdboyer commented 7 years ago

Yep, no problem - I'll merge, then take care of the rest.