Open dlmorais-pbh opened 4 years ago
@dlmorais-pbh binaries are not even slim (https://github.com/hashicorp/vault/pull/7752)... I've made changes to get a client-only binary down to 30 MB, if you're interested (I haven't bothered to clean them up and open a pull request, as the "no symbols" one never got feedback).
I'd like to reiterate the need for this. I came here to submit a very similar issue, for the exact reasons @dlmorais-pbh gave.
I also love the idea of a separate vault agent binary (and possibly published docker image).
Another use case is that for Enterprise binaries where the license is embedded in the binary itself, a CLI only binary will help with controlling who sets up a Vault server.
@fopinappb do you mind sharing your changes? I would greatly appreciate that
@scabala I'll try to create a PR with only those changes, as the modified CLI I'm using has quite a few other changes (that mean nothing upstream).
Our use case is a Terraform Docker image in a CI pipeline that uses Vault to set up the initial environment. One could argue that a CI pipeline should use the API, but please note that unless you work with Vault full time, it is really hard to memorize even one way of working with it.
Another user/team here missing a CLI-only binary that is not 150mb in size for our toolset Docker images :)
Hey folks, what about vault-cli? It seems to be just a Python wrapper, assuming you have Python. Out of interest, @dlmorais-pbh, how did you proceed, and did you try vault-cli?
If a `vaultcli`-specific binary was provided then it could also include a `man` page (as noted already on #6472) as well as other additional links to the website, if fitting all this into the current project repository is a challenge.
There could be a fork or module of the main repo with all unneeded dependencies stripped, as well as the UI, server, audit & agent specific portions of the code excluded, to build a minimal `vaultcli`, with additions such as an extra man page to bundle as well as other reference material.
Hi @aphorise, first of all, I did not try vault-cli because it is not official, and in my specific case the managers do not want a non-official tool to be used to read secrets.
As for how did I proceed, I created a shell script that performs the steps I need using Vault's HTTP API. It is very lightweight, of course, but limited only to the few scenarios used in my continuous integration.
Ps.: Here is the link to the discussion that led to opening this issue: https://discuss.hashicorp.com/t/vault-cli-without-server/16222
Hi @aphorise , thanks for keeping up with this issue.
Like @dlmorais-pbh, we also didn't try vault-cli. In our case, we need the Vault CLI tool inside small Docker images that do not already use Python. Adding Python as a dependency bloats the image, so in that case we are better off just using the big official `vault` binary, as vault-cli + Python defeats the purpose of small images.
Also, now that @dlmorais-pbh mentions it, we also would highly prefer an official solution for security/trust reasons.
For now, we are also considering simply implementing a minimal/tailored HTTP API tool based on `curl` + shell scripting.
But yes, this solution does not scale well to new use cases and is subject to break in the future if the API changes.
Hi, Another use case I would have is to be able to talk to a distant vault server during the init phase of VM boot to be able to fetch secrets needed later on.
I'm also currently resorting to shell scripts.
132.1 MB is a huge executable for a CLI! Please release two binaries: one server and one client (CLI). It doesn't get simpler than that!
I opted for a 2.8 KB CLI built with Node.js (instead of 132 MB) because I need just two API calls (login with the Kubernetes method + pull a secret). Thanks, `node-fetch`!
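For readers who would rather avoid a runtime dependency, the first of those two calls (the Kubernetes login) can be sketched in Go with only the standard library. This is a minimal sketch under stated assumptions, not an official client: it assumes the auth method is mounted at the default `kubernetes` path, `VAULT_ROLE` is a hypothetical environment variable invented for this example, and the helper names `loginRequest` and `clientToken` are illustrative.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
)

// loginRequest builds the POST for Vault's Kubernetes auth method.
// Assumption: the method is mounted at the default "kubernetes" path.
func loginRequest(addr, role, jwt string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"role": role, "jwt": jwt})
	if err != nil {
		return nil, err
	}
	url := strings.TrimRight(addr, "/") + "/v1/auth/kubernetes/login"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// clientToken extracts auth.client_token from a login response body.
func clientToken(respBody []byte) (string, error) {
	var out struct {
		Auth struct {
			ClientToken string `json:"client_token"`
		} `json:"auth"`
	}
	if err := json.Unmarshal(respBody, &out); err != nil {
		return "", err
	}
	if out.Auth.ClientToken == "" {
		return "", fmt.Errorf("no client_token in response")
	}
	return out.Auth.ClientToken, nil
}

func main() {
	// VAULT_ADDR is Vault's usual variable; VAULT_ROLE is hypothetical.
	addr, role := os.Getenv("VAULT_ADDR"), os.Getenv("VAULT_ROLE")
	if addr == "" || role == "" {
		fmt.Println("set VAULT_ADDR and VAULT_ROLE to run against a real server")
		return
	}
	// Standard location of the pod's service account token.
	jwt, err := os.ReadFile("/var/run/secrets/kubernetes.io/serviceaccount/token")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	req, err := loginRequest(addr, role, strings.TrimSpace(string(jwt)))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer resp.Body.Close()
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	token, err := clientToken(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Never print the token itself; its length is enough to confirm success.
	fmt.Println("logged in, token length:", len(token))
}
```

The returned token would then be sent in the `X-Vault-Token` header of the second call. A statically linked Go binary is larger than a 2.8 KB script, but it needs no interpreter in the image.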
> I opted for a 2.8 KB CLI built with Node.js (instead of 132 MB)
Most use cases can probably make it with simple shell scripts and curl. Nodejs and python runtime dependencies are definitely not an answer for a small CLI (unless your container already requires them) 😃
Also, a small rust or Go binary would be enough, but IMO the point of this issue is to have that maintained by hashicorp with the official CLI...
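To give a rough idea of how small such a Go tool could be, here is a minimal sketch that reads one KV v2 secret over the HTTP API using only the standard library. It is not the official CLI and makes several assumptions: the mount name `secret` and the path `ci/deploy` are hypothetical, the token comes from `VAULT_TOKEN`, and the helper names `kvV2URL`, `parseKVv2`, and `readSecret` are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
)

// kvV2URL builds the read URL; KV v2 inserts "data/" between the
// mount name and the secret path.
func kvV2URL(addr, mount, path string) string {
	return strings.TrimRight(addr, "/") + "/v1/" +
		strings.Trim(mount, "/") + "/data/" + strings.TrimLeft(path, "/")
}

// parseKVv2 returns the key/value pairs nested at data.data in the response.
func parseKVv2(body []byte) (map[string]interface{}, error) {
	var out struct {
		Data struct {
			Data map[string]interface{} `json:"data"`
		} `json:"data"`
	}
	if err := json.Unmarshal(body, &out); err != nil {
		return nil, err
	}
	if out.Data.Data == nil {
		return nil, fmt.Errorf("response has no data.data object")
	}
	return out.Data.Data, nil
}

// readSecret performs the GET with the token in the X-Vault-Token header.
func readSecret(addr, token, mount, path string) (map[string]interface{}, error) {
	req, err := http.NewRequest(http.MethodGet, kvV2URL(addr, mount, path), nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("X-Vault-Token", token)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("vault returned %s", resp.Status)
	}
	return parseKVv2(body)
}

func main() {
	addr, token := os.Getenv("VAULT_ADDR"), os.Getenv("VAULT_TOKEN")
	if addr == "" || token == "" {
		fmt.Println("set VAULT_ADDR and VAULT_TOKEN to run against a real server")
		return
	}
	// "secret" and "ci/deploy" are hypothetical mount and path names.
	secret, err := readSecret(addr, token, "secret", "ci/deploy")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for key := range secret {
		fmt.Println(key) // print key names only, never the values
	}
}
```

A tool like this shares the drawback already raised in this thread: it only covers the scenarios someone wrote it for, which is why an officially maintained CLI would still be preferable.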
I believe producing a CLI only based on the current code base may be a challenge unless there's some related overhaul.
For my own curiosity, what I tried was commenting out as well as renaming the related `.go` files and code blocks affiliated with `agent`, `server`, `operator migrate`, and `operator diagnose` (within `command/commands.go`).
However, that build resulted in only about an 18 MB reduction in size: 176 MB, versus 193 MB for the regular build that includes everything.
I was curious if my earlier suggestion with fork / git-moulding could work and to build a stripped down / minimal CLI only using the same code sources.
As suggested earlier - I believe a new or separate CLI only wrapper would need to be pursued as it currently stands.
@ncabatoff - do you have any tips or suggestions on whether a slim CLI-only build is attainable from the current code base, one that could possibly produce a binary < 20/30 MB, for example?
@aphorise at the time of my first comment I did manage to get down to 30mb https://github.com/hashicorp/vault/issues/10180#issuecomment-714314384
binary size comes from all the dependencies so you can’t just remove some commands. I removed most (all?) dependencies from authentication libraries (AWS, azure, etc), those are the big fat ones.
But it did require some refactoring to be able to compile both ways using build tags. Honestly, I haven’t opened a PR with it as i have other PR with almost no review required open for years, so it’s pointless.
and maintaining a fork is hard due to that refactor for tags (tree conflicts on every rebase)
maybe a detached fork would work, though
> at the time of my first comment I did manage to get down to 30mb #10180 (comment)
Hey @fopina, thanks for highlighting your earlier efforts. I believe that, compared with your last attempts on the older Go versions of that time, setting `LD_FLAGS="-s -w"` currently does not reduce the binary as much as before, as I believe a lot of this optimisation now happens in Go 1.18+.
Previously I was doing `make dev`, and I repeated the same process as in my last comment, with the commented blocks and files, and applied `LD_FLAGS="-s -w"`, which resulted in a total of 144M, so a good 32 MB reduction. I believe that for a CLI build, if the current code base could be used, then doing builds specifically with these flags could be valid (but not in the case of the regular `vault` CLI with server, agent, etc.).
@aphorise getting the binary down to 30 MB was not from stripping symbols. It was from removing many dependencies and isolating those in many _CLI.go and _ALL.go files so both versions would be supported, as I mentioned.
I did have an open PR related to stripping symbols as it had no impact and “some” gain, but not related to this issue.
> binary size comes from all the dependencies so you can’t just remove some commands. I removed most (all?) dependencies from authentication libraries (AWS, azure, etc), those are the big fat ones.
Am I right in thinking a lot of those authentication libraries are needed for several CLI use cases (as well as vault agent)? It sounds like we may be able to achieve a smaller general vault binary by reworking those auth flows to not require such big dependencies. In that case, we wouldn't need to split out a CLI build.
@fopina are you able to revisit this again and provide a draft PR to demonstrate?
It would be good to see exactly what you did, and the same approach may be translatable to a fork / separate repository if those steps can then be scripted / automated.
@seandilda yes, I believe they would still be used by the CLI for oauth-alike authentication flows which I wasn't using at the time. And 💯 on that: moving away from the huge SDKs would very likely allow the full build to be more portable...
@aphorise I did those changes in an internal fork in a previous job, I don't have the code anymore. But it took me a while (few days) to get it working and both initial refactor and rebasing were not easily scriptable.
Huge blocks of code were nested under conditions and refactored to avoid duplication of code (between `_ALL` and `_CLI` builds) which led to many conflicts on every rebase.
I can't re-do the same work as I think it's pointless (if it's not merged, it's hard to maintain), but I can start a CLI-only fork and try to apply same logic (without caring about supporting server build) and see where it leads. At least rebases would be simpler there (no tree conflicts).
I'll drop it here if I manage to get to it!
Chiming in here: we now set `LDFLAGS="-s -w"` by default, which should reduce the binary size a little.
We don't have any specific plans on releasing a CLI binary right now. It's something I've thought about a lot, and I might bring up something with our team to make it easier to build a CLI-only binary.
FYI I have done some experiments similar to those mentioned in the comments here. If you're willing to be very aggressive with dependency pruning (removing things like the Kubernetes client) and to use `upx`, it's possible to get a Vault CLI binary down to < 10 MB in size :)
@swenson if someone would open a PR isolating every "fat" dependency with build tags but keeping the default build output as is, would it be considered for review?
Even if the possible tag combinations would not be distributed in this repo (not really feasible) it would allow people to easily build variants with the dependencies they do require
@fopina I would absolutely look at such a PR.
Some gotchas to look out for that are slightly harder to refactor into build tags:
But it may not be so bad to, say, factor out `vault server` and `vault agent` with a `+noagent` or `+noserver` tag (or something similar).
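To make the build-tag idea concrete, here is a minimal sketch of one way it could be arranged; it is not how Vault's `command` package is actually organized. It assumes a self-registering command table, and the file names and the `noserver` tag are hypothetical. The sketch collapses two would-be files into one so that it runs as-is (Go permits several `init` functions in a single file); in a real layout the second `init` would sit in its own file starting with `//go:build !noserver`.

```go
package main

import (
	"fmt"
	"sort"
)

// registry maps a command name to its implementation. Each file that
// defines commands adds itself here from an init function.
var registry = map[string]func() string{}

func register(name string, run func() string) { registry[name] = run }

// Would live in commands_core.go (hypothetical name), always compiled.
func init() {
	register("kv get", func() string { return "read a secret" })
}

// Would live in commands_server.go (hypothetical name), whose first line
// would be the constraint `//go:build !noserver`. Building with
// `go build -tags noserver` would then skip that file, together with any
// package that only it imports.
func init() {
	register("server", func() string { return "start a server" })
}

// commandNames lists the registered commands in sorted order.
func commandNames() []string {
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	for _, name := range commandNames() {
		fmt.Printf("%-8s %s\n", name, registry[name]())
	}
}
```

As noted earlier in the thread, dropping commands alone saves little. The size win comes when the heavy dependencies are imported only from the tagged files, so that excluding a file also excludes its imports from the link.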
@swenson that was my first move back then but I remember the size reduction was too small to care for the (little) extra complexity.
But it could be a first baby step indeed, I'll tag you then if I get to it 😀
@swenson thanks for mentioning `upx`, which I had not thought of at all. Using my last build (`dev`, a good candidate as it excluded the UI), I was able to get the 144 MB binary down to 24 MB!
upx --best --lzma vault
Ultimate Packer for eXecutables
Copyright (C) 1996 - 2020
UPX 3.96 Markus Oberhumer, Laszlo Molnar & John Reiser Jan 23rd 2020
File size Ratio Format Name
-------------------- ------ ----------- -----------
150307488 -> 25923600 17.25% macho/amd64 vault
Packed 1 file.
I'd bet there are smarter ways to halve the original 144 MB to 70 or lower, and a subsequent `upx` pass may be able to take it further, to under 10 MB. Using UPX on the current public binaries (with UI) also drops them to under 69 MB. Anyway, it's good to know that with the same code base + `upx` it is possible to achieve a binary that's under 30 MB.
@aphorise just as a note, in case you’ve not used upx often (or packers in general): the unpacking does come at a cost.
For CLIs (which are expected to have short run times and always terminate), that might be noticeable, especially on low-end devices (such as a Pi 2 or similar). You do get the binary loaded from disk faster (as it’s smaller), so that might compensate a bit.
But my perception is that in the cloud world disk/network transfer is much, much cheaper than memory and CPU.
Is your feature request related to a problem? Please describe. I work as a DevOps engineer and I’m currently creating some utility OCI images to use in our DevOps workflows. One thing that a lot of different images need is the Vault CLI. I realized that the only release option is the full solution (CLI+Server), and it’s “big” (~150 MB), given that my utility images are Alpine based and usually around 10-15 MB.
Describe the solution you'd like It would be nice to have a release of a Vault CLI without any extra features, to make it as small as possible.
Describe alternatives you've considered To solve my problem I currently use a DIY shell script capable of reading the secrets I need, currently only from KV2 engines. But I might need to extend it eventually. I was also introduced to an external tool created by the community that tries to solve this problem, but I think that a tool performing such a security-driven task should be official.
Explain any additional use-cases I think that the Vault Agent could also benefit from this feature. Maybe in the end having 3 different types of release: