Open akx opened 6 years ago
CC @mayurid @troydai @johanste @lmazuel, this is a known issue. We made some improvements to the Windows installer, but the same kind of improvement can be made on Linux as well.
Thank you for the feedback. You made a great point. We will definitely look into this. I will keep this issue open and assign it to myself.
@marstr, could you please prioritize this work and get it done sometime in April?
I'm excited to take care of this :)
@marstr Thanks for picking this up, I've just been through another painfully slow upgrade (to 2.0.60 under WSL/Ubuntu) so I can assure you this fix will be welcomed!
Great that this is finally getting looked at!
Every time I update the CLI, I kiss goodbye to my machine for ~20 minutes. I dread updating it.
Bundling Python with the CLI seems like madness
Looking forward to these changes
20 minutes? Yikes. Which platform are you using @benc-uk? Maybe Ubuntu via WSL?
Yes. A very common configuration I find at customers and partners. First it's the actual update and then Windows Defender going absolutely crazy
I've had some snide comments from customers about the CLI in this regard
All the more reason to get this taken care of. Thanks for the honest feedback.
The WSL file system being slow when transferring a large number of files was the problem when I looked at it months ago. Windows Defender is more about the cold start time of the command.
It's a pipe dream at this point as I can see the amount of work that's gone into this Python version of the CLI, but...
A single executable (i.e. based on golang) would be ideal. I've been working with Kubernetes, Helm and Terraform. Their tooling is written in Golang and it's really nice just having this single static binary you can put anywhere
EDIT: the new AzCopy V10 has gone down this route
If you look into my GitHub profile, you'll see that I'm a huge fan of Go and have spent a non-trivial part of my career on/with it. I love Terraform and k8s, and am envious of their ease of distribution. TBH, there are a lot of benefits to Go, but with a project like ours where we need to support many many contributors across the company, asking people to learn Go is a much heavier lift than asking them to use/learn Python. Not to mention, I don't think this product is at burn-it down and rewrite quality just yet ;)
I totally understand, that's why I prefixed my comment with "this is a pipe dream" 😄 Having done some Go myself it's quite the context switch from other languages
Hopefully some tidy up of the packaging is all that is required to get things speedy again
@marstr Will slimming the .deb package finally remove the bundled Python interpreter and rely on the system Python? This will affect install instructions.
Yes! That's what I'm working on right now.
Any update? Just updated my CLI again under WSL and got the same maxed out machine for ~20 mins, strangely enough I wanted to use my machine for other things at the same time but it was churning too bad
This got side tracked a bit (sorry, should have updated this thread.) There was some debate internally about whether or not taking a dependency on Python instead of bundling our own would be sacrificing any stability. We've pushed past that now, by agreeing that this approach has been working fine for Homebrew. While that debate was happening, my focus shifted to other projects.
However, the good news is that I'm scheduled to work on this again soon!
The team's getting shuffled, and I'm moving on to a different team! However, I've met with @zikalino and showed him how important this issue is, and how much interest we should be taking in fixing this. I've reassigned this issue to him :)
I ended up at this bug after I was surprised by how long it was taking to install azure-cli, and also annoyed by the fact that it caused conflicts with other tools (including the Azure Python SDK itself!). Apparently you cannot install azure and azure-cli without generating conflicts... something that can easily be prevented with a simple pip check at the end of the build process.
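For illustration, here is a minimal sketch of what a "pip check at the end of the build" step could look like when driven from Python. The function name `run_pip_check` and the injectable `runner` parameter are hypothetical and only there to make the sketch testable; `pip check` itself is a real pip subcommand that exits non-zero when installed packages have missing or incompatible requirements.

```python
import subprocess
import sys


def run_pip_check(runner=subprocess.run):
    """Run `pip check` against the current interpreter's environment.

    Returns (ok, output). `ok` is False when pip reports broken or
    conflicting requirements (non-zero exit status).
    `runner` is a hypothetical hook so the call can be faked in tests.
    """
    result = runner(
        [sys.executable, "-m", "pip", "check"],
        capture_output=True,
        text=True,
    )
    output = (result.stdout or "") + (result.stderr or "")
    return result.returncode == 0, output.strip()
```

A build script could call `run_pip_check()` right after installing the packages and abort with a non-zero exit status when `ok` is False, so that a conflicting dependency set never gets published.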
All the kudos for @akx for the way he highlighted the problem. That issue alone has the potential of becoming a meme about how not to package something.
Since issue #9665 was closed and this issue was marked as successor issue - is there any outlook when failing installation bug might be resolved?
Is this something that you'd be happy for someone to pick up and PR? Any guide on how complex the resolution is for someone who wanted to pick this up?
Any outlook when azure-cli installation issue (original item #9665 ) might get implemented?
@krojzl I just found the following comment in the old thread. I hope it's useful for you. https://github.com/Azure/azure-cli/issues/9665#issuecomment-559673532
Following up on this. I'd love to see azure-cli packages that work with the system Python. Is this still being worked on?
Hello,
FYI, azure-cli (and the devops extension) are now available natively in Debian unstable (will be added to buster-backports as soon as policy allows) and the soon-to-be-released Ubuntu 20.04 LTS:
https://packages.ubuntu.com/focal/azure-cli https://packages.debian.org/sid/azure-cli
This has been done following policy and best practices. Each dependency is packaged separately, and the system's Python installation is used rather than vendoring everything including the interpreter and its dependencies.
this is planned in Mn as lightweight installer.
Hi guys, is there any progress on this? I'm preparing an Alpine builder image for CI and ran into this issue and https://github.com/Azure/azure-cli/issues/13028.
Size
The most important pain point is the size. The image grows to 1.7 GB. But even the official image is incredibly HUGE:
mcr.microsoft.com/azure-cli 2.25.0 249845e13ca3 11 days ago 1.04GB
I mean, c'mon, 1GB for a CLI? It's two times bigger than a Windows XP image. I expect it to be around 5 MB or less.
Time
Installing from pip takes around 10 minutes on localhost and includes compiling, wheels and so on. On CI it takes around 5 minutes because of the better network. But it still relates to the size.
My wish
Prepare appropriate executables. Don't force us to compile and install random things like python, pip, make etc. and fight them. It's fragile and painful. It's horrible for users of cli. Here is an example of how easy it can be: https://github.com/docker/compose/releases :
RUN curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
RUN chmod u+x /usr/local/bin/docker-compose
And it's still a Python-based CLI.
I'm not sure how it can grow to such a size. Just imagine 1GB. How many lines of code is that?
// offtopic Why go? Why not rust? Why not typescript? ...why not still python, but better? // offtopic2 I can't imagine how people can love go... or dart...
The size of Azure CLI is due to the hugeness of the underlying Azure Python SDKs (https://github.com/Azure/azure-sdk-for-python/issues/11149). If you install from pip or official packages, you will install all underlying Python SDKs that include all API versions, even though most of them are historical versions and unnecessary.
There was indeed an attempt to remove unnecessary API versions (#17816), but it only removes unused API versions from the azure-mgmt-network SDK for the Windows installer (MSI).
We are working with Azure Python SDK team to see how we can reduce the size of Azure Python SDKs, so that Azure CLI can lose some weight.
Hey, is there any update about the issue?
@jiasli is this still on the high priority list? I didn't see this on that board anymore.
Issue is now three years old so it would be great to have an update on the current timeline.
The azure-cli package itself is pretty small, if you check it on PyPI: https://pypi.org/project/azure-cli/#files
As provided in https://github.com/Azure/azure-cli/issues/7387#issuecomment-879529560, the size of Azure CLI is caused by the dependency on Azure Python SDKs. We will consider trimming the SDKs and removing unused API versions, but this is not one of the top-priority tasks.
Great, thanks for the response!
I checked with the ncdu tool on Ubuntu: the azure-mgmt-xxx SDKs now take 613.7MB, while azure-cli only takes 23.9MB.
Amongst them, azure-mgmt-network is still the largest.
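For anyone who wants to reproduce this kind of breakdown without ncdu, here is a rough sketch that sums file sizes per immediate subdirectory. The function names are hypothetical. Note that it adds up apparent file sizes, whereas ncdu reports disk usage by default, so the figures will not match ncdu exactly.

```python
import os


def dir_size(path):
    """Total size in bytes of the regular files under `path` (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def largest_subdirs(parent, top=10):
    """Return [(name, bytes)] for the immediate subdirectories of `parent`, largest first."""
    sizes = []
    with os.scandir(parent) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((entry.name, dir_size(entry.path)))
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:top]
```

Pointing `largest_subdirs` at the `azure/mgmt` directory inside the environment's site-packages (the exact path depends on how the CLI was installed) should show which management SDKs dominate.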
Could you vote on https://github.com/Azure/azure-sdk-for-python/issues/11149 so that Azure SDK team can get more attention? Thanks.
@jiasli
The azure-cli package itself is pretty small, if you check it on PyPI: https://pypi.org/project/azure-cli/#files
The point that that package itself is small is still moot (and somewhat disingenuous) if you can't properly install a "modular" azure-cli.
You can botch together something like
$ pip install --no-dependencies azure-cli
$ pip install azure-cli-core msrestazure packaging
to get a skeleton of Azure CLI (at the small, small size of 74 megabytes of installed data) that throws warnings and errors but technically starts, and then hope to start installing some azure-mgmt-* modules to get to a working version, but that's surely not how things are supposed to be?
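To see how many azure-mgmt-* packages such a skeleton is missing, one option is to read the requirements that the installed azure-cli distribution declares. The sketch below uses the standard library's `importlib.metadata.requires`; the helper names are hypothetical, and the name-extraction regex is a simplification that only looks at the leading distribution name of each requirement string.

```python
import re
from importlib import metadata


def requirement_name(requirement):
    """Extract the distribution name from a requirement string such as 'foo~=1.0'."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return match.group(1).lower() if match else ""


def mgmt_requirements(requirements):
    """Reduce requirement strings to the sorted, de-duplicated azure-mgmt-* names."""
    names = {requirement_name(r) for r in requirements}
    return sorted(n for n in names if n.startswith("azure-mgmt-"))


def declared_requirements(dist_name="azure-cli"):
    """Requirement strings declared by an installed distribution, or [] if it is absent."""
    try:
        return metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []
```

With azure-cli installed (even via `--no-dependencies`), `mgmt_requirements(declared_requirements())` lists the management SDKs it expects, which gives a sense of how far the skeleton is from a working install.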
We have written a script that deletes the unreferenced older APIs. It does delete a good chunk of the API folders but not all of it. With the script the Azure directory is now just under 300MB instead of over 700MB. It is compatible with most, but not all, third party packages, as long as they do not point to a version that is trimmed.
Over 3 years since this ticket was opened. And yet every time I need the azure command line tools in my CI/CD pipeline, I'm spending a gigabyte and more than a minute of build time on this monstrosity. Some way of selectively installing the bits you need would be nice...
The following NEW packages will be installed: azure-cli ... After this operation, 1031 MB of additional disk space will be used.
I have been working on setting up ci/cd which needs only to push docker image to azure container registry and restart an app... and really surprised to see 1.5GB+ required for the cli for such minor tasks. Perhaps one can use some REST api directly and build some really small utility for specific tasks but not sure how to do that?
Perhaps one can use some REST api directly and build some really small utility for specific tasks but not sure how to do that?
This (use some REST api directly) is exactly what we are investigating in the Microsoft Graph migration (https://github.com/Azure/azure-cli/issues/12946). If it turns out to be working well, we may incorporate this method in all future command modules or code-gen'ed modules.
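As a rough illustration of the "small utility that talks to the REST API directly" idea from the question above (and not of how the Microsoft Graph migration is implemented), here is a sketch that builds an Azure Resource Manager request to restart a web app using only the standard library. Assumptions: the caller already has a bearer token from somewhere else, the function name is hypothetical, and the `api-version` value is a guess that should be checked against the current Web Apps REST reference. The request is only constructed here, not sent.

```python
import urllib.parse
import urllib.request

ARM_ENDPOINT = "https://management.azure.com"


def build_restart_request(subscription_id, resource_group, site_name, token,
                          api_version="2022-03-01"):
    """Build (but do not send) a POST request that restarts an App Service web app.

    `token` is an Azure AD access token obtained elsewhere (assumption).
    `api_version` is an assumed value; verify it against the REST reference.
    """
    parts = [urllib.parse.quote(p, safe="")
             for p in (subscription_id, resource_group, site_name)]
    path = ("/subscriptions/{}/resourceGroups/{}"
            "/providers/Microsoft.Web/sites/{}/restart").format(*parts)
    query = urllib.parse.urlencode({"api-version": api_version})
    return urllib.request.Request(
        "{}{}?{}".format(ARM_ENDPOINT, path, query),
        data=b"",  # empty body; the restart call takes its options from the URL
        method="POST",
        headers={"Authorization": "Bearer " + token},
    )
```

Sending it would be a matter of passing the request to `urllib.request.urlopen`. Obtaining the token is the harder part, and it is exactly what `az login` normally takes care of.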
I was stunned to find that my diagnostic Docker containers grew by over 1GB when I installed the Azure CLI. This is ... astonishing.
Appreciate the work going on to slim down the install! In case it's of interest, our use case is to run it inside AWS CloudShell, which has a home dir (the only writable bit) size limit of 1GB, and we only need enough of the CLI to support az login.
I don't think there's any way out of this now, beyond a full rewrite in Go, Rust or Dotnet etc., something that can compile to a single binary
Each week that passes the CLI gets larger and larger, and the chances of a rewrite smaller and smaller
Where does it stop? When the CLI takes up 2GB? 5GB? 50GB?
@benc-uk Have you tried https://github.com/clumio-code/azure-sdk-trim ?
I wrote this a while back to delete older API directories that are obviously superfluous in 99% of cases, and the site-packages/azure directory goes from 846.6 MB to 348.1 MB for us. That's not perfect, it is still ridiculously large, but until the Azure developers actually do something about this problem it does help quite a bit.
If you call it from a Dockerfile make sure to have it in the same RUN call as when the Azure cli/sdk is installed or you will still get the overhead in the layers.
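To give a feel for the idea (this is not how azure-sdk-trim is implemented, and it is much more naive), here is a dry-run sketch that lists older API-version directories under an SDK tree. It assumes the version folders are named like `v2020_06_01` or `v2020_06_01_preview`, and it simply keeps the newest `keep` folders per package without checking which versions are actually referenced, so it only reports candidates and deletes nothing.

```python
import os
import re

# Assumed naming scheme for API-version folders, e.g. v2020_06_01 or v2020_06_01_preview.
API_VERSION_DIR = re.compile(r"^v\d{4}_\d{2}_\d{2}(_preview)?$")


def removal_candidates(sdk_root, keep=1):
    """List API-version directories under `sdk_root`, except the `keep` newest per package.

    Dry run only: nothing is deleted. The newest folder is not necessarily
    the one a package uses by default, so treat the result as a starting point.
    """
    candidates = []
    for root, dirs, _files in os.walk(sdk_root):
        # Zero-padded dates sort chronologically as plain strings.
        versions = sorted(d for d in dirs if API_VERSION_DIR.match(d))
        old_versions = versions[:-keep] if keep > 0 else versions
        candidates.extend(os.path.join(root, d) for d in old_versions)
        # Do not descend into the version directories themselves.
        dirs[:] = [d for d in dirs if not API_VERSION_DIR.match(d)]
    return sorted(candidates)
```

Reviewing the returned paths by hand before removing anything is the safer route, since some packages and third-party code point at specific older API versions.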
@benc-uk, if you do an ncdu, you will find Azure CLI itself (/cli, 28.6 MB) is pretty small.
It is actually the Azure Python SDK (/mgmt, 676.7 MB) that makes Azure CLI huge. See https://github.com/Azure/azure-cli/issues/7387#issuecomment-879529560 for the explanation.
Rewriting in Go, Rust, .NET won't give too much benefit as those Azure SDKs are equally huge.
Agreed, but with a compiled language you ship a binary, you don't ship the SDK with it!
If I wrote a Go app that used the Go SDK for Azure, used a few functions from the SDK, and compiled it - it would be completely standalone executable and only a few megabytes
I still want to see a version of Azure CLI that can be composed of different components more easily, but I've written a blog post that can hopefully help some people out: https://usrme.xyz/posts/how-to-trim-a-container-image-that-includes-azure-cli/. Using the methods described there I was able to slim an image from 1.17GB to 307MB!
I was trying to find a cli mainly for azure devops. Installing the cli literally (via pip) shocked me. My usage is just to manage repositories & pull requests.
I understand the installer itself is pretty small, but having 1G of dependencies is not ideal, at least for my use case
Isn't there a way to decouple modules and use them independently? The current az cli is kind of overkill for my workflow.
Some people are confusing or don't understand the problem. The CLI alone can only weigh a few megabytes, but the bad design of the entire CLI, not only the core part, causes it to have dependencies for more than 1.5GB and continues to grow as well commented above. For automated processes that use disposable computing where you can't cache images, downloading the 1.5GB is not optimal.
Even more I see with great surprise that some say that changing to Go we will practically not gain anything regarding the size of the CLI, surely they are the ones who love and support the current development that is a disaster.
Quite true on the actual issue of dependency hell; just looking at the pip install execution logs makes one's eyes water in disbelief. Looking at other major cloud providers, I can see their equivalent CLI tools weigh in differently:
AWS ~ 210 MB
GCP ~ 822 MB
Does it add more confusion to request a 'core' CLI that is smaller but handles common activities (VMs, Blobs, AD, etc.), or is that just masking the fundamental issue? I think I would nearly prefer to dynamically add extensions for whatever I need rather than pulling down everything.
I've created this PR to fix the problem: https://github.com/Azure/azure-cli/pull/25801
Result:
Ubuntu 22.04 installed size: 1,196 MB -> 322 MB
Docker image size: 1,300 MB -> 706 MB
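To put the two before/after figures from this comment in relative terms, a quick calculation (the helper name is hypothetical):

```python
def percent_reduction(before_mb, after_mb):
    """Size reduction as a percentage of the original size, rounded to one decimal."""
    return round(100 * (before_mb - after_mb) / before_mb, 1)


ubuntu_saving = percent_reduction(1196, 322)  # Ubuntu 22.04 installed size
docker_saving = percent_reduction(1300, 706)  # Docker image size
```

That works out to roughly a 73% smaller Ubuntu install and a roughly 46% smaller Docker image.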
The azure-cli package (Ubuntu Xenial) could stand to lose some weight.
😲 and 😳 don't begin to describe my reaction at the size difference between the AWS CLI and the Azure CLI packages.
[...] python3 (on platforms where it is known there is a recent enough Python 3)?
[...] _py3 variants even included if az is always run with Python 3?
It's also not just about the size; install speed is also a thing. It takes 50 seconds to install the single azure-cli package on an Azure machine, with its 39 000 files (dpkg-query -L azure-cli | wc -l: 39094), just a little less than installing the twentyish packages that comprise all of awscli's dependencies (that can be used by other software on the system).
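The file count quoted above comes from `dpkg-query -L azure-cli | wc -l`. For scripting the same measurement (for example to track it across releases), a small sketch follows. The function name and the injectable `runner` are hypothetical, and note that `dpkg-query -L` lists directories as well as files, so this counts paths just like the shell pipeline does.

```python
import subprocess


def count_package_paths(package, runner=subprocess.run):
    """Count the paths that dpkg reports as installed by `package`.

    Mirrors `dpkg-query -L <package> | wc -l`. Raises CalledProcessError
    if the package is not installed. `runner` is a hook for testing.
    """
    result = runner(
        ["dpkg-query", "-L", package],
        capture_output=True,
        text=True,
        check=True,
    )
    return sum(1 for line in result.stdout.splitlines() if line.strip())
```

On a Debian or Ubuntu machine, `count_package_paths("azure-cli")` and `count_package_paths("awscli")` would allow the same comparison as in the issue description.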