Closed zmarouf closed 2 years ago
The sweet spot seems to be 50 since when I comment out everything after local/zmarouf/30 I don't get that issue anymore.
@zmarouf thanks for creating this issue. This should definitely not happen. The only thing I can think of is that skaffold computes the tag in a different subroutine, and maybe we are hitting some concurrency limit.
Here is the code that generates tags: https://github.com/GoogleContainerTools/skaffold/blob/master/pkg/skaffold/runner/build_deploy.go#L157
@zmarouf
macOS version: 10.15.7
$ /usr/local/bin/skaffold version
v1.15.0
λ DOCKER_BUILDKIT=1 /usr/local/bin/skaffold build -p local
Generating tags...
- local/zmarouf/61 -> local/zmarouf/61:68e50da
- local/zmarouf/59 -> local/zmarouf/59:68e50da
- local/zmarouf/66 -> local/zmarouf/66:68e50da
- local/zmarouf/50 -> local/zmarouf/50:68e50da
- local/zmarouf/68 -> local/zmarouf/68:68e50da
- local/zmarouf/57 -> local/zmarouf/57:68e50da
- local/zmarouf/32 -> local/zmarouf/32:68e50da
- local/zmarouf/35 -> local/zmarouf/35:68e50da
- local/zmarouf/69 -> local/zmarouf/69:68e50da
- local/zmarouf/56 -> local/zmarouf/56:68e50da
- local/zmarouf/51 -> local/zmarouf/51:68e50da
- local/zmarouf/58 -> local/zmarouf/58:68e50da
- local/zmarouf/67 -> local/zmarouf/67:68e50da
- local/zmarouf/60 -> local/zmarouf/60:68e50da
- local/zmarouf/34 -> local/zmarouf/34:68e50da
- local/zmarouf/33 -> local/zmarouf/33:68e50da
- local/zmarouf/20 -> local/zmarouf/20:68e50da
- local/zmarouf/18 -> local/zmarouf/18:68e50da
- local/zmarouf/27 -> local/zmarouf/27:68e50da
- local/zmarouf/9 -> local/zmarouf/9:68e50da
- local/zmarouf/11 -> local/zmarouf/11:68e50da
- local/zmarouf/7 -> local/zmarouf/7:68e50da
- local/zmarouf/29 -> local/zmarouf/29:68e50da
- local/zmarouf/16 -> local/zmarouf/16:68e50da
- local/zmarouf/42 -> local/zmarouf/42:68e50da
- local/zmarouf/45 -> local/zmarouf/45:68e50da
- local/zmarouf/73 -> local/zmarouf/73:68e50da
- local/zmarouf/80 -> local/zmarouf/80:68e50da
- local/zmarouf/74 -> local/zmarouf/74:68e50da
- local/zmarouf/6 -> local/zmarouf/6:68e50da
- local/zmarouf/28 -> local/zmarouf/28:68e50da
- local/zmarouf/17 -> local/zmarouf/17:68e50da
- local/zmarouf/1 -> local/zmarouf/1:68e50da
- local/zmarouf/10 -> local/zmarouf/10:68e50da
- local/zmarouf/19 -> local/zmarouf/19:68e50da
- local/zmarouf/26 -> local/zmarouf/26:68e50da
- local/zmarouf/8 -> local/zmarouf/8:68e50da
- local/zmarouf/21 -> local/zmarouf/21:68e50da
- local/zmarouf/75 -> local/zmarouf/75:68e50da
- local/zmarouf/72 -> local/zmarouf/72:68e50da
- local/zmarouf/44 -> local/zmarouf/44:68e50da
- local/zmarouf/43 -> local/zmarouf/43:68e50da
- local/zmarouf/38 -> local/zmarouf/38:68e50da
- local/zmarouf/36 -> local/zmarouf/36:68e50da
- local/zmarouf/31 -> local/zmarouf/31:68e50da
- local/zmarouf/65 -> local/zmarouf/65:68e50da
- local/zmarouf/62 -> local/zmarouf/62:68e50da
- local/zmarouf/54 -> local/zmarouf/54:68e50da
- local/zmarouf/53 -> local/zmarouf/53:68e50da
- local/zmarouf/30 -> local/zmarouf/30:68e50da
- local/zmarouf/37 -> local/zmarouf/37:68e50da
- local/zmarouf/39 -> local/zmarouf/39:68e50da
- local/zmarouf/52 -> local/zmarouf/52:68e50da
- local/zmarouf/55 -> local/zmarouf/55:68e50da
- local/zmarouf/63 -> local/zmarouf/63:68e50da
- local/zmarouf/64 -> local/zmarouf/64:68e50da
- local/zmarouf/46 -> local/zmarouf/46:68e50da
- local/zmarouf/79 -> local/zmarouf/79:68e50da
- local/zmarouf/41 -> local/zmarouf/41:68e50da
- local/zmarouf/77 -> local/zmarouf/77:68e50da
- local/zmarouf/48 -> local/zmarouf/48:68e50da
- local/zmarouf/70 -> local/zmarouf/70:68e50da
- local/zmarouf/24 -> local/zmarouf/24:68e50da
- local/zmarouf/23 -> local/zmarouf/23:68e50da
- local/zmarouf/4 -> local/zmarouf/4:68e50da
- local/zmarouf/15 -> local/zmarouf/15:68e50da
- local/zmarouf/3 -> local/zmarouf/3:68e50da
- local/zmarouf/12 -> local/zmarouf/12:68e50da
- local/zmarouf/71 -> local/zmarouf/71:68e50da
- local/zmarouf/76 -> local/zmarouf/76:68e50da
- local/zmarouf/49 -> local/zmarouf/49:68e50da
- local/zmarouf/40 -> local/zmarouf/40:68e50da
- local/zmarouf/47 -> local/zmarouf/47:68e50da
- local/zmarouf/78 -> local/zmarouf/78:68e50da
- local/zmarouf/2 -> local/zmarouf/2:68e50da
- local/zmarouf/13 -> local/zmarouf/13:68e50da
- local/zmarouf/5 -> local/zmarouf/5:68e50da
- local/zmarouf/14 -> local/zmarouf/14:68e50da
- local/zmarouf/22 -> local/zmarouf/22:68e50da
- local/zmarouf/25 -> local/zmarouf/25:68e50da
Checking cache...
- local/zmarouf/61: Not found. Building
- local/zmarouf/59: Not found. Building
- local/zmarouf/66: Not found. Building
- local/zmarouf/50: Not found. Building
......
λ docker images | grep local/zmarouf
local/zmarouf/50 68e50da 7f81365233db 5 seconds ago 165MB
local/zmarouf/50 7f81365233dbfb8963e1487f6cba74fcd7221d614692c8675ae860dc16ca5823 7f81365233db 5 seconds ago 165MB
local/zmarouf/59 68e50da 7f81365233db 5 seconds ago 165MB
local/zmarouf/59 7f81365233dbfb8963e1487f6cba74fcd7221d614692c8675ae860dc16ca5823 7f81365233db 5 seconds ago 165MB
local/zmarouf/61 68e50da 7f81365233db 5 seconds ago 165MB
local/zmarouf/61 7f81365233dbfb8963e1487f6cba74fcd7221d614692c8675ae860dc16ca5823 7f81365233db 5 seconds ago 165MB
local/zmarouf/66 68e50da
......
What's your Docker for Mac version? It might be related to that.
here is mine:
@dfang Interesting! I'll rollback and check.
Doesn't seem to fix it on my machine. Surprisingly, it hangs too.
I just checked and it happens even if the docker daemon is unavailable.
I get this if I quit Docker for Mac:
λ /usr/local/bin/skaffold build -p local
Generating tags...
- local/zmarouf/61 -> local/zmarouf/61:68e50da
- local/zmarouf/59 -> local/zmarouf/59:68e50da
- local/zmarouf/66 -> local/zmarouf/66:68e50da
- local/zmarouf/50 -> local/zmarouf/50:68e50da
- local/zmarouf/68 -> local/zmarouf/68:68e50da
- local/zmarouf/57 -> local/zmarouf/57:68e50da
- local/zmarouf/32 -> local/zmarouf/32:68e50da
- local/zmarouf/35 -> local/zmarouf/35:68e50da
- local/zmarouf/69 -> local/zmarouf/69:68e50da
- local/zmarouf/56 -> local/zmarouf/56:68e50da
- local/zmarouf/51 -> local/zmarouf/51:68e50da
- local/zmarouf/58 -> local/zmarouf/58:68e50da
- local/zmarouf/67 -> local/zmarouf/67:68e50da
- local/zmarouf/60 -> local/zmarouf/60:68e50da
- local/zmarouf/34 -> local/zmarouf/34:68e50da
- local/zmarouf/33 -> local/zmarouf/33:68e50da
- local/zmarouf/20 -> local/zmarouf/20:68e50da
- local/zmarouf/18 -> local/zmarouf/18:68e50da
- local/zmarouf/27 -> local/zmarouf/27:68e50da
- local/zmarouf/9 -> local/zmarouf/9:68e50da
- local/zmarouf/11 -> local/zmarouf/11:68e50da
- local/zmarouf/7 -> local/zmarouf/7:68e50da
- local/zmarouf/29 -> local/zmarouf/29:68e50da
- local/zmarouf/16 -> local/zmarouf/16:68e50da
- local/zmarouf/42 -> local/zmarouf/42:68e50da
- local/zmarouf/45 -> local/zmarouf/45:68e50da
- local/zmarouf/73 -> local/zmarouf/73:68e50da
- local/zmarouf/80 -> local/zmarouf/80:68e50da
- local/zmarouf/74 -> local/zmarouf/74:68e50da
- local/zmarouf/6 -> local/zmarouf/6:68e50da
- local/zmarouf/28 -> local/zmarouf/28:68e50da
- local/zmarouf/17 -> local/zmarouf/17:68e50da
- local/zmarouf/1 -> local/zmarouf/1:68e50da
- local/zmarouf/10 -> local/zmarouf/10:68e50da
- local/zmarouf/19 -> local/zmarouf/19:68e50da
- local/zmarouf/26 -> local/zmarouf/26:68e50da
- local/zmarouf/8 -> local/zmarouf/8:68e50da
- local/zmarouf/21 -> local/zmarouf/21:68e50da
- local/zmarouf/75 -> local/zmarouf/75:68e50da
- local/zmarouf/72 -> local/zmarouf/72:68e50da
- local/zmarouf/44 -> local/zmarouf/44:68e50da
- local/zmarouf/43 -> local/zmarouf/43:68e50da
- local/zmarouf/38 -> local/zmarouf/38:68e50da
- local/zmarouf/36 -> local/zmarouf/36:68e50da
- local/zmarouf/31 -> local/zmarouf/31:68e50da
- local/zmarouf/65 -> local/zmarouf/65:68e50da
- local/zmarouf/62 -> local/zmarouf/62:68e50da
- local/zmarouf/54 -> local/zmarouf/54:68e50da
- local/zmarouf/53 -> local/zmarouf/53:68e50da
- local/zmarouf/30 -> local/zmarouf/30:68e50da
- local/zmarouf/37 -> local/zmarouf/37:68e50da
- local/zmarouf/39 -> local/zmarouf/39:68e50da
- local/zmarouf/52 -> local/zmarouf/52:68e50da
- local/zmarouf/55 -> local/zmarouf/55:68e50da
- local/zmarouf/63 -> local/zmarouf/63:68e50da
- local/zmarouf/64 -> local/zmarouf/64:68e50da
- local/zmarouf/46 -> local/zmarouf/46:68e50da
- local/zmarouf/79 -> local/zmarouf/79:68e50da
- local/zmarouf/41 -> local/zmarouf/41:68e50da
- local/zmarouf/77 -> local/zmarouf/77:68e50da
- local/zmarouf/48 -> local/zmarouf/48:68e50da
- local/zmarouf/70 -> local/zmarouf/70:68e50da
- local/zmarouf/24 -> local/zmarouf/24:68e50da
- local/zmarouf/23 -> local/zmarouf/23:68e50da
- local/zmarouf/4 -> local/zmarouf/4:68e50da
- local/zmarouf/15 -> local/zmarouf/15:68e50da
- local/zmarouf/3 -> local/zmarouf/3:68e50da
- local/zmarouf/12 -> local/zmarouf/12:68e50da
- local/zmarouf/71 -> local/zmarouf/71:68e50da
- local/zmarouf/76 -> local/zmarouf/76:68e50da
- local/zmarouf/49 -> local/zmarouf/49:68e50da
- local/zmarouf/40 -> local/zmarouf/40:68e50da
- local/zmarouf/47 -> local/zmarouf/47:68e50da
- local/zmarouf/78 -> local/zmarouf/78:68e50da
- local/zmarouf/2 -> local/zmarouf/2:68e50da
- local/zmarouf/13 -> local/zmarouf/13:68e50da
- local/zmarouf/5 -> local/zmarouf/5:68e50da
- local/zmarouf/14 -> local/zmarouf/14:68e50da
- local/zmarouf/22 -> local/zmarouf/22:68e50da
- local/zmarouf/25 -> local/zmarouf/25:68e50da
Checking cache...
I tried compiling. Without a daemon, I still get latest and it hangs at random points.
❯ ../skaffold/out/skaffold version
v1.15.0-34-g84f8f3d31
❯ ../skaffold/out/skaffold build -p local
Generating tags...
- local/zmarouf/61 -> local/zmarouf/61:latest
- local/zmarouf/59 -> ^[
❯ ../skaffold/out/skaffold build -p local
Generating tags...
- local/zmarouf/61 -> local/zmarouf/61:68e50da
- local/zmarouf/59 -> local/zmarouf/59:latest
- local/zmarouf/66 -> local/zmarouf/66:68e50da
- local/zmarouf/50 -> local/zmarouf/50:68e50da
- local/zmarouf/68 -> local/zmarouf/68:68e50da
- local/zmarouf/57 -> local/zmarouf/57:latest
- local/zmarouf/32 -> local/zmarouf/32:68e50da
- local/zmarouf/35 -> local/zmarouf/35:68e50da
- local/zmarouf/69 -> local/zmarouf/69:68e50da
- local/zmarouf/56 -> local/zmarouf/56:latest
- local/zmarouf/51 -> local/zmarouf/51:68e50da
- local/zmarouf/58 -> local/zmarouf/58:68e50da
- local/zmarouf/67 -> local/zmarouf/67:latest
- local/zmarouf/60 -> local/zmarouf/60:latest
- local/zmarouf/34 -> local/zmarouf/34:latest
- local/zmarouf/33 -> local/zmarouf/33:latest
- local/zmarouf/20 -> local/zmarouf/20:68e50da
- local/zmarouf/18 -> local/zmarouf/18:68e50da
- local/zmarouf/27 -> local/zmarouf/27:68e50da
- local/zmarouf/9 -> local/zmarouf/9:latest
- local/zmarouf/11 -> ^[
Ok. I believe it's env var related. It works fine if I run:
env -i <list of Linux env vars pulled from a server> skaffold build -p local
It seems to be related to the PATH variable.
This works -> PATH=/home:/usr/local/bin
This doesn't ->PATH=/usr/local/bin
@dfang Could you please check to see if you have /home defined in your PATH variable somewhere? I feel like I'm going crazy
I have two versions of skaffold:
λ which skaffold
/Users/mj/go/bin/skaffold
λ skaffold version
v1.15.0-41-gf6e57d06c-dirty
λ /usr/local/bin/skaffold version # installed by homebrew
v1.15.0
If you believe it's related to the PATH variable, you can unset PATH and try with the full path, or just remove everything in your .bash_profile.
This is why I used env -i to clear the environment of the command to isolate the variable that was causing this.
My point was more that it expects to explicitly see /home in the PATH variable somewhere like PATH=/home:/usr/bin even though on macOS HOME is under /Users
Running without a PATH variable doesn't work as skaffold looks for the "sh" binary. Giving it only PATH=/bin for sh to be found also doesn't work on my machine.
Tl;dr
This works (explicit /home folder in PATH even though it's empty on macOS)
❯ env -i HOME=/tmp/ PATH=/home:/usr/local/bin /usr/local/bin/skaffold build -p local
This doesn't work (no reference to /home)
❯ env -i HOME=/tmp/ PATH=/usr/local/bin /usr/local/bin/skaffold build -p local
I don't have /home in my PATH:
λ echo $PATH | tr ":" "\n"
/usr/local/kubebuilder/bin
/Users/mj/.serverless/bin
/Users/mj/.gvm/bin
/Users/mj/google-cloud-sdk/bin
/Users/mj/.deno/bin
/Users/mj/.cargo/bin
/usr/local/opt/curl/bin
/usr/local/opt/openssl/bin
/Users/mj/.pyenv/shims
/Users/mj/.gem/ruby/2.5.1/bin
/Users/mj/.rubies/ruby-2.5.1/lib/ruby/gems/2.5.0/bin
/Users/mj/.rubies/ruby-2.5.1/bin
/Users/mj/go/bin
/Users/mj/.gvm/bin
/Users/mj/.cargo/bin
/Users/mj/.krew/bin
/usr/local/opt/coreutils/libexec/gnubin
/usr/local/opt/curl/bin
/Users/mj/bin
/usr/local/bin
/usr/local/sbin
/usr/bin
/bin
/usr/sbin
/sbin
/usr/local/MacGPG2/bin
/usr/local/share/dotnet
/Applications/Wireshark.app/Contents/MacOS
~/bin/istio-1.5.4/bin
/Users/mj/.local/bin
/Users/mj/.fzf/bin
~/Library/flutter/bin
/Users/mj/.fastlane/bin
It fails for me if I use your PATH while changing your user id to mine.
There must be some other variable that comes into play.
Does it work if you run it with env -i and only the PATH and HOME variables?
Like so?
env -i HOME=/Users/mj PATH=/usr/local/kubebuilder/bin:/Users/mj/.serverless/bin:/Users/mj/.gvm/bin:/Users/mj/google-cloud-sdk/bin:/Users/mj/.deno/bin:/Users/mj/.cargo/bin:/usr/local/opt/curl/bin:/usr/local/opt/openssl/bin:/Users/mj/.pyenv/shims:/Users/mj/.gem/ruby/2.5.1/bin:/Users/mj/.rubies/ruby-2.5.1/lib/ruby/gems/2.5.0/bin:/Users/mj/.rubies/ruby-2.5.1/bin:/Users/mj/go/bin:/Users/mj/.gvm/bin:/Users/mj/.cargo/bin:/Users/mj/.krew/bin:/usr/local/opt/coreutils/libexec/gnubin:/usr/local/opt/curl/bin:/Users/mj/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/MacGPG2/bin:/usr/local/share/dotnet:/Applications/Wireshark.app/Contents/MacOS:~/bin/istio-1.5.4/bin:/Users/mj/.local/bin:/Users/mj/.fzf/bin:~/Library/flutter/bin:/Users/mj/.fastlane/bin /usr/local/bin/skaffold build -p local
@zmarouf
λ env -i HOME=/Users/mj PATH=/usr/local/kubebuilder/bin:/Users/mj/.serverless/bin:/Users/mj/.gvm/bin:/Users/mj/google-cloud-sdk/bin:/Users/mj/.deno/bin:/Users/mj/.cargo/bin:/usr/local/opt/curl/bin:/usr/local/opt/openssl/bin:/Users/mj/.pyenv/shims:/Users/mj/.gem/ruby/2.5.1/bin:/Users/mj/.rubies/ruby-2.5.1/lib/ruby/gems/2.5.0/bin:/Users/mj/.rubies/ruby-2.5.1/bin:/Users/mj/go/bin:/Users/mj/.gvm/bin:/Users/mj/.cargo/bin:/Users/mj/.krew/bin:/usr/local/opt/coreutils/libexec/gnubin:/usr/local/opt/curl/bin:/Users/mj/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/MacGPG2/bin:/usr/local/share/dotnet:/Applications/Wireshark.app/Contents/MacOS:~/bin/istio-1.5.4/bin:/Users/mj/.local/bin:/Users/mj/.fzf/bin:~/Library/flutter/bin:/Users/mj/.fastlane/bin /usr/local/bin/skaffold build -p local
Generating tags...
- local/zmarouf/61 -> local/zmarouf/61:68e50da
- local/zmarouf/59 -> local/zmarouf/59:68e50da
- local/zmarouf/66 -> local/zmarouf/66:68e50da
- local/zmarouf/50 -> local/zmarouf/50:68e50da
- local/zmarouf/68 -> local/zmarouf/68:68e50da
- local/zmarouf/57 -> local/zmarouf/57:68e50da
- local/zmarouf/32 -> local/zmarouf/32:68e50da
- local/zmarouf/35 -> local/zmarouf/35:68e50da
- local/zmarouf/69 -> local/zmarouf/69:68e50da
- local/zmarouf/56 -> local/zmarouf/56:68e50da
- local/zmarouf/51 -> local/zmarouf/51:68e50da
- local/zmarouf/58 -> local/zmarouf/58:68e50da
...
Ok. Things are getting weirder.
The real issue is that the tagging routine seems to fork once per image.
I see 80+ occurrences of git rev-list -1 HEAD --abbrev-commit and 50+ of git status . --porcelain.
In my current session, and with the number of containers I'm running, this leads to file descriptor exhaustion:
DEBU[0001] generating tag: unable to find git commit: pipe: too many open files
fork/exec /usr/local/bin/git: too many open files
fork/exec /usr/local/bin/git: bad file descriptor
DEBU[0001] Using a fall-back tagger
So I guess one solution would be to compute the tag once per build run since the tag policy is shared by all artifacts?
No clue why this happens without /home and not with /home in my PATH though. That's still a mystery to me.
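The descriptor exhaustion above comes from many git processes being forked at the same time. As a rough illustration only, here is a minimal Go sketch of capping how many tag computations run concurrently. The names generateTags, maxInFlight and tagFn are made up for this example and are not skaffold's actual API; tagFn stands in for whatever shells out to git.

```go
package main

import (
	"fmt"
	"sync"
)

// generateTags calls tagFn once per image but lets at most maxInFlight
// calls run at the same time. Hypothetical helper, not skaffold code.
func generateTags(images []string, maxInFlight int, tagFn func(image string) string) map[string]string {
	tags := make(map[string]string, len(images))
	sem := make(chan struct{}, maxInFlight) // buffered channel used as a counting semaphore
	var mu sync.Mutex
	var wg sync.WaitGroup

	for _, image := range images {
		wg.Add(1)
		go func(image string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot (blocks while maxInFlight are busy)
			defer func() { <-sem }() // release the slot when this goroutine finishes
			tag := tagFn(image)
			mu.Lock()
			tags[image] = tag
			mu.Unlock()
		}(image)
	}
	wg.Wait()
	return tags
}

func main() {
	images := make([]string, 0, 80)
	for i := 1; i <= 80; i++ {
		images = append(images, fmt.Sprintf("local/zmarouf/%d", i))
	}
	tags := generateTags(images, 8, func(image string) string {
		return image + ":68e50da" // a real tagger would run git here
	})
	fmt.Println(len(tags), tags["local/zmarouf/50"])
}
```

With a cap like this, the number of simultaneously open pipes is bounded by maxInFlight rather than by the number of artifacts. The value 8 is arbitrary.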
perhaps you're running into this?
Perhaps the cause for confusion is that we fall back to the latest tagger if the original tagger fails. Maybe we should error out if the tagger is explicitly set. I think we currently do this because the git tagger is the default and we don't want to fail the operation if someone doesn't define any tagger explicitly.
Definitely sounds like it. I'll try tweaking those settings and I'll figure out a way to monitor this on my machine. It was a bit of a head-scratcher :)
What do you think about computing the tag only once instead of per image? Correct me if I'm wrong, but we currently can't override the setting for individual images so it would make sense to just do that work once?
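A minimal sketch of what computing the tag once could look like, assuming every artifact really does share the same tag. onceTagger is a hypothetical type written for this comment, not something that exists in skaffold, and the compute function stands in for the git rev-list call.

```go
package main

import (
	"fmt"
	"sync"
)

// onceTagger wraps an expensive tag computation so that it runs a single
// time, no matter how many images ask for a tag. Hypothetical type.
type onceTagger struct {
	once    sync.Once
	compute func() (string, error)
	tag     string
	err     error
}

// Tag returns image:<tag>, computing <tag> only on the first call.
func (t *onceTagger) Tag(image string) (string, error) {
	t.once.Do(func() { t.tag, t.err = t.compute() })
	if t.err != nil {
		return "", t.err
	}
	return image + ":" + t.tag, nil
}

func main() {
	calls := 0
	t := &onceTagger{compute: func() (string, error) {
		calls++ // stands in for `git rev-list -1 HEAD --abbrev-commit`
		return "68e50da", nil
	}}
	for i := 1; i <= 80; i++ {
		if _, err := t.Tag(fmt.Sprintf("local/zmarouf/%d", i)); err != nil {
			panic(err)
		}
	}
	fmt.Println("git invocations:", calls)
}
```

Note that a failed computation is also remembered here, so every image would see the same error rather than retrying.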
I didn't change these limits:
~ λ ulimit -n
256
~ λ ulimit -u
2784
~ λ launchctl limit maxfiles
maxfiles 256 unlimited
I'd have to take a look at how GitTagger exactly works but I'm guessing the reason it computes the tag for every artifact is that those artifacts could be in different git repos (perhaps quite unlikely)?
The git tagger's *TreeSha variants are computed on a per-artifact location.
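Since the result can depend on the artifact's location, a middle ground might be to cache per location instead of computing once globally. A minimal sketch, assuming the artifact's workspace directory is a good enough cache key. cachedTagger and newCachedTagger are hypothetical names, and the compute function stands in for the git call.

```go
package main

import (
	"fmt"
	"sync"
)

// cachedTagger memoises the computed tag per workspace directory.
// Hypothetical type for illustration, not skaffold's GitTagger.
type cachedTagger struct {
	mu      sync.Mutex
	tags    map[string]string
	compute func(workspace string) (string, error)
}

func newCachedTagger(compute func(string) (string, error)) *cachedTagger {
	return &cachedTagger{tags: map[string]string{}, compute: compute}
}

// Tag returns image:<tag>, running compute at most once per workspace.
// The lock is held during compute, so the git calls are also serialised.
func (c *cachedTagger) Tag(image, workspace string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	tag, ok := c.tags[workspace]
	if !ok {
		var err error
		tag, err = c.compute(workspace)
		if err != nil {
			return "", err
		}
		c.tags[workspace] = tag
	}
	return image + ":" + tag, nil
}

func main() {
	calls := 0
	t := newCachedTagger(func(workspace string) (string, error) {
		calls++ // stands in for running git inside workspace
		return "68e50da", nil
	})
	for i := 1; i <= 80; i++ {
		workspace := "services/a" // made-up directories for the example
		if i > 40 {
			workspace = "services/b"
		}
		if _, err := t.Tag(fmt.Sprintf("local/zmarouf/%d", i), workspace); err != nil {
			panic(err)
		}
	}
	fmt.Println("tag computations:", calls)
}
```

Artifacts that share a workspace reuse one computation, while artifacts in different locations (or different repos) still get their own.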
Expected behavior
skaffold build with an AbbrevCommitSha tagPolicy tags all the images with the abbreviated commits.
Actual behavior
Some of the images get tagged with "latest" when there are more than 50 images to process.
Information
Steps to reproduce the behavior
skaffold build -p local