OpenPeeDeeP / depguard

Go linter that checks if package imports are in a list of acceptable packages.
GNU General Public License v3.0
138 stars 15 forks source link

String Prefix -vs- Globbing for matching of import paths #2

Closed frou closed 6 years ago

frou commented 6 years ago

Hi there - Cool linter!

Imagine this config:

{
  "type": "whitelist",
  "packages": [
    "github.com/bob"
  ]
}

At a glance, there's nothing wrong with it. But, someone unrelated to Bob could register a new GitHub user called bob2 and their packages would pass the whitelist too!

There probably needs to be some logic that adds an implicit trailing slash (github.com/bob/) when it's knowable that the last part of a whitelisted string is a username.

dixonwille commented 6 years ago

@frou The above bug is by design in my opinion. The user would have to be vigilant and add the trailing slash in their config. The reason I can't add the trailing slash if it is supposed to be a username is if you think about local paths to packages (hardly used but supported) and packages that do not start with domain names (some mono repo structures). If you have a better solution let me know but I do think it is up to the user to be vigilant. I can add a note in the readme if you think that may clarify things.

Sorry for the late response. I did not get notified... Wonder what is up with that.

frou commented 6 years ago

It just seems that using "unstructured" string prefix matching and pushing the issue onto the user somewhat defeats the purpose of the linter (which is presumably about enforcing safety/auditability).

Maybe explicit gitignore-style glob patterns could be specified in the config rather than string prefixes?

github.com/bob/**
github.com/alice/proj/pkg/*
mycorp/**
dixonwille commented 6 years ago

Let me check what the standard lib supports in globbing as well as possible third-party packages. But as to not break anyone I may have to keep the prefix if globbing isn't used (* characters)? I do agree that it should enforce safety/auditability, but I am still under the impression that if the incorrect configs are provided, you will get incorrect results... But I will look into globbing (speed is a concern to me and the reason I went with prefixes).

frou commented 6 years ago

Unfortunately the very appropriate "double star" glob syntax that matches arbitrarily deep path components is not supported by the minimalist filepath.Match in the standard library :frowning_face:

Relevant blog post: https://www.client9.com/golang-globs-and-the-double-star-glob-operator/

dixonwille commented 6 years ago

I have used github.com/gobwas/glob before in https://github.com/dixonwille/skywalker and it seemed to have worked fine. But I will be checking the glob pattern way more often. You may also notice I do a binary search https://github.com/OpenPeeDeeP/depguard/blob/master/depguard.go#L131. If I add globbing it will be way more difficult and time-consuming to match against the glob pattern. I could have a flag in the config to turn on globbing but is off by default? I understand the more configs make it more difficult to consume but keeping the performance and introducing glob will be difficult without a clever mechanism. I'll keep digging after work.

dixonwille commented 6 years ago

Background

I am looking more into it tonight. I am not finding a solution other than loop through all the packages in the list and do a comparison. So that is O(n*m) (n is user list, m is direct import list) right. Well the current operation for checking to see if I need to flag an import is O(log(n)) meaning for the whole thing it isO(log(n) * m)) (this excludes the initial sort at the begining). For most use cases that is fine as with small numbers of n but as the user specifies more and more packages, you would see better performance gains with the current method.

I developed this linter to be ran along side other linters golangci-lint mainly as the tool to do so. With that in mind, I needed to be as performant as possible as to not slow down linting times (I dislike waiting on linters). I accomplished this in skywalker by having a separate list, if it was empty don't need a comparison, else check all other possablilities before I glob match.

Proposal

Iterate through the user supplied list, figure out which ones need glob and which ones don't (contains [ and ], ?, or *). Then internally search through the non-glob list first then if it doesn't match, go through the glob list. @frou do you think this would work for what you are asking for? I will be using the github.com/gobwas/glob package. This is also backwards compatable with the current structure.

frou commented 6 years ago

I get what you're saying, since one of the main reasons for golangci-lint existing is that gometalinter became a drag to use due to not having a focus on optimisation.

Though, for my tastes, I would still want to have nothing but globbing with no possibility of fallback to prefix matching. But obviously it's not right for me to try and tell you what to do on your own project.

Maybe we should put this on ice until other users get a chance to chime in. I don't know whether or not you've done much publicity for depguard, so maybe there aren't a lot of users yet. IIRC I found it from a link on some other GitHub issue!

dixonwille commented 6 years ago

Latest of master should have the fix now. Opening a PR into GolangCI-Lint for the update. The way I built it does not slow down my usage at all and allows you to have globs! Win-Win

dixonwille commented 6 years ago

https://github.com/golangci/golangci-lint/pull/179

frou commented 6 years ago

Thank you very much