cue-lang / cue

The home of the CUE language! Validate and define text-based and dynamic configuration
https://cuelang.org
Apache License 2.0
5.02k stars 286 forks source link

cmd/cue: consider a fmt mode to format all CUE files recursively #3065

Closed mvdan closed 4 months ago

mvdan commented 4 months ago

Splitting off from https://github.com/cue-lang/cue/issues/1731, which was quite broad and got solved by @NoamTD implementing --check and --diff as well as a bunch of other improvements.

gofmt from upstream Go is separate from go fmt - the former works on files and directories, and the latter on packages. So, for example, gofmt -l -w . is pretty similar to go fmt ./..., with the exception that the latter will only format the packages matching the ./... pattern. So if I'm in a monorepo with many Go modules, then the file-based first command can be more powerful. Similar situations happen if I want to keep all files formatted, even those behind "hidden" directories like testdata.

CUE only has cue fmt, which is package-based. It has flags like --check and --diff now, so it's pretty flexible, but I think it's still missing a mode where it simply walks directory trees and formats all CUE files it finds, regardless of package patterns and boundaries.

I can think of multiple ways to go about this:

1) Add a flag like cue fmt --files or perhaps --recurse, which would make the tool interpret any arguments as file or directory paths to recurse into rather than package patterns. 2) Overload the argument syntax much like cue help filetypes, e.g. cue fmt files: ., as a different form of 1. This one is cute, but probably too confusing for end users, and might just make the filetypes logic even more hard to grasp. 3) Interpret arguments as directories to recurse into if they have a trailing slash or filepath.Separator, which is something that some other tools already do, e.g. search for "trailing slash" in https://linux.die.net/man/1/rsync. This is likely the shortest solution, but perhaps the most surprising, as today cue export ./internal/ci/github and cue export ./internal/ci/github/ mean exactly the same.

I personally lean towards option 1, as options 2 and 3 are likely to cause confusing among users.

mvdan commented 4 months ago

I should clarify that one option is to change cue fmt in a breaking way where it always interprets arguments as file paths rather than package patterns. However, this is likely to break lots of users, particularly given that arguments like ./... or ./foo/... would stop working, and cue fmt . would now start working recursively when before it wasn't.

NoamTD commented 4 months ago

First off, I'll admit that even as a relatively advanced cue user, before delving into the cue fmt implementation I ran cue fmt ./... assuming that it was running in the "file" mode you described - I think in an ideal world there would only be a file mode, but as you said this is a breaking change.

I agree that best/least surprising option would be 1 - it's explicit and doesn't require the user to have any fancy knowledge.

One thing that I think should be considered is the topic of files like .gitignore - we probably would want to ignore files excluded by vcs.

NoamTD commented 4 months ago

also, perhaps from a UX perspective, it would make sense to warn/fail in case the user mistakenly runs something like cue fmt --files ./...

mvdan commented 4 months ago

I think in an ideal world there would only be a file mode, but as you said this is a breaking change.

I wouldn't go that far to be honest - formatting packages only can still be useful, for example if your testdata directories have files badly formatted on purpose, or have tons of files and you're simply not interested in formatting them. But I agree that by default I almost always want "file mode".

One thing that I think should be considered is the topic of files like .gitignore - we probably would want to ignore files excluded by vcs.

cue mod publish does support that sort of thing to decide what files to publish as part of a module (https://github.com/cue-lang/cue/issues/2992), but otherwise none of our other commands (nor gofmt) care about gitignore-like logic. That said, I kinda agree that we shouldn't blindly recurse into every single directory and file. For example, gofmt does and I found it to be wasteful, so I taught my fork to stop at testdata and vendor dirs.

I would just follow the logic of cue help inputs to be honest - note that we don't have testdata or vendor dirs in CUE at this point:

Directory and file names that begin with "." or "_" are ignored, unless explicitly listed as inputs.

it would make sense to warn/fail in case the user mistakenly runs something like cue fmt --files ./...

Yes, we could and should make cue fmt --files fail if any argument contains ..., because that's most definitely a mistake.

NoamTD commented 4 months ago

I Agree with the above points, makes sense

NoamTD commented 4 months ago

I have an initial implementation based on our above discussion, looks like it's working well (+ it runs much faster than "package" mode)

but it made me think: how would we want to handle cue.mod? running cue fmt --files --check ./... displayed files in cue.mod/gen, I'm assuming that isn't ideal.

mvdan commented 4 months ago

Right, we should probably make it ignore cue.mod directories entirely. cue.mod/module.cue is formatted/canonicalized by cue mod tidy. The pkg and gen directories aren't written by humans, so we shouldn't touch them. usr perhaps we should format, but it would be odd to just do that one.