Open jjmaestro opened 1 week ago
There is no cache, but if the glob rule you specify matches a lot of files, that'll take some time, simply because the way it does glob matching is pretty derpy (basically it just says "get me all the files that match the glob", and then looks if the list is empty or not). There's also no exclude rule, because I didn't consider that someone would have workflows with paths that matched huge local caches.
There's several glob-related issues already open; if you're feeling energetic, you could do everyone a solid and replace the existing glob implementation with something that is (a) not entirely derpy, and (b) matches the GitHub glob rules more precisely.
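To make the "collect everything, then check if it's empty" cost concrete, here is a minimal Python sketch. It is not action-validator's code (which is written in Rust), and the function names `has_match_eager` and `has_match_lazy` are made up for illustration. It only contrasts building the full match list with stopping at the first hit.

```python
import glob
import os
import tempfile


def has_match_eager(pattern, root):
    # Builds the complete list of matches, then only checks whether it is empty.
    # The work grows with the number of matching files.
    matches = glob.glob(os.path.join(glob.escape(root), pattern), recursive=True)
    return len(matches) > 0


def has_match_lazy(pattern, root):
    # iglob returns an iterator, so we can stop as soon as one match is produced.
    matches = glob.iglob(os.path.join(glob.escape(root), pattern), recursive=True)
    return next(matches, None) is not None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "docs", "guide"))
        for i in range(100):
            open(os.path.join(root, "docs", "guide", f"page{i}.md"), "w").close()
        for pattern in ("docs/**/*.md", "docs/**/*.rst"):
            print(pattern, has_match_eager(pattern, root), has_match_lazy(pattern, root))
```

Both functions give the same answer. How much the lazy variant saves depends on the pattern and on how early the first match turns up, and a pattern that matches nothing still has to walk the whole tree.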
> There is no cache
😮 Then, it's truly bizarre!
> but if the glob rule you specify matches a lot of files, that'll take some time, simply because the way it does glob matching is pretty derpy (basically it just says "get me all the files that match the glob", and then looks if the list is empty or not).
The weird thing is that the files that show up in `fs_usage` as being `stat`ed and `open`ed by `action-validator` are not in the repo!! They once were but are not there anymore... that's what I found so weird! And if I manually move the Bazel cache folder that's outside the repo, everything is fast again. It's somehow managing to find it from something in the repo, I guess?
Or maybe I'm missing something and they are somehow still in the repo, but I doubt it: I've run `find | grep` and couldn't find the paths that `fs_usage` was showing. Also, clean clones of the same repo don't show this issue 😅 It's all quite weird, to be honest.
I'm not that proficient at debugging on macOS... I'll try to find out more about what's happening!
> There's also no exclude rule, because I didn't consider that someone would have workflows with paths that matched huge local caches.
I see, that's expected I guess :) Maybe that's something that's not too hard to add... I've never programmed in Rust and I'm pretty busy at the moment so I'll file it on the "maybe" list 😅
Related, how does this work / interact with `pre-commit` re. the files it'll try to match against? Also asking because `pre-commit` does have `exclude` and `files` directives that maybe could mitigate this :-?
How would it be an issue running the check from a subdirectory? You mean if e.g. you run `action-validator ../../.github/workflows/foo.yaml`?
> There's several glob-related issues already open; if you're feeling energetic, you could do everyone a solid and replace the existing glob implementation with something that is (a) not entirely derpy, and (b) matches the GitHub glob rules more precisely.
Yeah, I saw #27 when I was checking if a similar issue was previously reported. I've never programmed in Rust, so I don't know how hard it would be... but I promise I'll give it a thought! :)
> It's somehow managing to find it from something in the repo, I guess?
Thinking about the problem, I'm not coming up with a situation in which `action-validator` would look outside the repo, unless the workflow tells it to -- say if the glob patterns had extra `../` in them. Without a minimal example demonstrating the problem, though, I'm just wildly speculating.
> Related, how does this work / interact with `pre-commit` re. the files it'll try to match against? Also asking because `pre-commit` does have `exclude` and `files` directives that maybe could mitigate this :-?
I don't use pre-commit, and have no idea how it works. Probably those directives manipulate which files `pre-commit` itself will trigger the action for, so it wouldn't help in this case, but :shrug: I've been known to be wrong before. Give it a go and find out, I guess.
> How would it be an issue running the check from a subdirectory? You mean if e.g. you run `action-validator ../../.github/workflows/foo.yaml`?
Yes, if you run that command, `action-validator` will try to resolve all the globs relative to the current working directory, not the root of the repo, so if your checkout is in `repo.git`, and your working directory is `repo.git/foo/bar`, with a glob of `wombat/*` it'll go looking for files matching `repo.git/foo/bar/wombat/*`, rather than `repo.git/wombat/*`. It'd be possible to fix that, but it'd be a bit fiddly, and as I mentioned in the README, the primary use case for `action-validator` is "run it in CI", where it's much easier to reason about the working directory, so I didn't bother to account for that corner case.
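The working-directory effect described above can be sketched in a few lines of Python. This is an illustration only, not how action-validator behaves or would be fixed. The helpers `find_repo_root` and `matches_under` are hypothetical names, and using a `.git` entry to detect the repo root is an assumption.

```python
import glob
import os
import tempfile


def find_repo_root(start):
    # Walk upwards from `start` until a directory containing `.git` is found.
    # Returns None if the filesystem root is reached without finding one.
    current = os.path.abspath(start)
    if not os.path.isdir(current):
        current = os.path.dirname(current)
    while True:
        if os.path.exists(os.path.join(current, ".git")):
            return current
        parent = os.path.dirname(current)
        if parent == current:
            return None
        current = parent


def matches_under(base, pattern):
    # Resolve `pattern` against an explicit base directory instead of the cwd.
    found = glob.glob(os.path.join(glob.escape(base), pattern))
    return sorted(os.path.relpath(p, base) for p in found)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        repo = os.path.join(tmp, "repo.git")
        workdir = os.path.join(repo, "foo", "bar")
        os.makedirs(os.path.join(repo, ".git"))
        os.makedirs(os.path.join(repo, "wombat"))
        os.makedirs(workdir)
        open(os.path.join(repo, "wombat", "a.txt"), "w").close()

        # Resolved against the working directory: nothing matches.
        print(matches_under(workdir, "wombat/*"))
        # Resolved against the detected repo root: the file is found.
        print(matches_under(find_repo_root(workdir), "wombat/*"))
```

Passing the base directory explicitly keeps the result independent of wherever the command happens to be run from.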
Hi!
I have an extremely weird bug!
When I run `action-validator` on a workflow that has `paths` with `glob`, depending on where the repo is, `action-validator` takes forever to run. That is, depending on WHERE the repo is cloned, `action-validator` takes a huge amount of time or it validates the workflow immediately.
I've peeked into what's actually happening and... it's matching stuff against a ton of files that are OUTSIDE the repo!? So, when the cloned repo is in a path that has quite a bunch of stuff (e.g. 40K files taking ~2GB) it spins the CPU to 100% and takes forever to finish (many times I just Ctrl-C it).
BUT!!! if I clone this repo, checkout the branch and run `action-validator` again... it's immediate!?? 🤯
So... what's going on?? I'm using Bazel and that usually generates A TON of files and stuff in `bazel-*` directories within the repo.
These are the paths that are showing up when I do `sudo fs_usage -w PID` and check this out:
So, somehow, `action-validator` is going over all the files in the Bazel cache, even when I've deleted the `bazel-*` symlinks that point at it 😮
Is there some cache somewhere that's forcing it to validate old paths that were previously in the repo? That is, the `bazel-*` symlinks were in the repo before, and I deleted them later when I was testing stuff.
P.S. Also, when I remove the `glob`s from the `paths` in `page.yaml`, regardless of where the repo is and/or the Bazel cache, it validates the file instantly.
P.S.2 Is there a way to exclude files / paths? So that I could add `bazel-*` to a config file with an exclude list or something.
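For reference, here is a rough Python sketch of what an exclude list could look like. It is purely illustrative: action-validator has no such option according to this thread, and `walk_with_excludes` and its `exclude_patterns` argument are hypothetical. It prunes directories whose names match a pattern such as `bazel-*` before descending into them, and it does not follow symlinked directories.

```python
import fnmatch
import os
import tempfile


def walk_with_excludes(root, exclude_patterns):
    # Yield file paths relative to `root`, never descending into directories
    # whose name matches one of `exclude_patterns`.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        # Editing dirnames in place tells os.walk which subdirectories to skip.
        dirnames[:] = [
            d for d in dirnames
            if not any(fnmatch.fnmatch(d, p) for p in exclude_patterns)
        ]
        for name in filenames:
            yield os.path.relpath(os.path.join(dirpath, name), root)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "src"))
        os.makedirs(os.path.join(root, "bazel-out"))
        open(os.path.join(root, "src", "a.txt"), "w").close()
        open(os.path.join(root, "bazel-out", "b.txt"), "w").close()
        print(sorted(walk_with_excludes(root, ["bazel-*"])))
```

Pruning before descending is what keeps a huge cache directory from being listed at all, rather than being listed and then filtered out.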