phylum-dev / cli

Command line interface for the Phylum API
https://phylum.io
GNU General Public License v3.0
99 stars 10 forks source link

Double lockfiles for Go projects #1421

Closed kylewillmon closed 2 weeks ago

kylewillmon commented 2 weeks ago

Since the addition of the gomod parser in 7eb96a01, Phylum now detects both go.mod and go.sum as lockfiles. For example:

> phylum status
Project: null
Group: null
Project Root: null
Dependency Files:
 - path: ./go.mod
   type: gomod
 - path: ./go.sum
   type: go

Users can override this with a .phylum_project file, but this default behavior is not good for anyone. This will also impact the GitHub App, which is likely where most users will run into this issue.

kylewillmon commented 2 weeks ago

As far as I can tell, there are three options:

  1. Modify phylum_lockfile::find_lockfiles() and friends so that the gomod parser is ignored. This would mean that users would only get that parser if it is specifically requested (by command line options or .phylum_project file).
  2. Find some way to make go.mod the "primary" lockfile and only use go.sum if parsing go.mod fails.
  3. Remove the gomod parser entirely.
cd-work commented 2 weeks ago

I like 1), but I'd make it a little smarter than suggested. In our depfile logic we already have precedence for conditionally filtering out certain files, we could do the same with the go.mod file. That said, I'm not sure if this extra complication would be useful considering that both files are pretty much always present in any Go project from what I understand?

kylewillmon commented 2 weeks ago

In our depfile logic we already have precedence for conditionally filtering out certain files, we could do the same with the go.mod file.

This is referring to our handling of Python files, right?

I had forgotten about that code... It might be a bit more annoying to do something like that for Go because go.mod and go.sum are both considered lockfiles. So in the current code, it would require special code in both find_lockfiles_at() and find_depfiles_at().

That said, I'm not sure if this extra complication would be useful considering that both files are pretty much always present in any Go project from what I understand?

This is also my understanding. I have not seen any Go project where only one of the two files is present.. Although we may need to double-check Go workspaces since I believe that have multiple go.mod files but only a single go.sum file...

Even so, preferring go.sum in all situations would resolve this issue.

ejortega commented 2 weeks ago

According to the docs. No, go.sum is not a lock file. The go.mod files in a build provide enough information for 100% reproducible builds.

However it's also stated Typically your module’s go.sum file should be committed along with your go.mod file.

If someone clones your repository and downloads your dependencies using the go command, they will receive an error if there is any mismatch between their downloaded copies of your dependencies and the corresponding entries in your go.sum.

So it seems more likely that we'll see a go.mod file in a project than a go.sum. However it does seem like most projects include both.

cd-work commented 2 weeks ago

So it seems more likely that we'll see a go.mod file in a project than a go.sum. However it does seem like most projects include both.

The reason why Kyle is suggesting to use only go.sum is likely because go.mod is not guaranteed to work. We need a specific version for it to work so we can't really recommend it by default.