Azure / azure-cli

Azure Command-Line Interface
MIT License

azure-cli package could lose some weight #7387

Open akx opened 6 years ago

akx commented 6 years ago

The azure-cli package (Ubuntu Xenial) could stand to lose some weight.

😲 and 😳 don't begin to describe my reaction at the size difference between the AWS CLI and the Azure CLI packages.

Package: awscli
Version: 1.11.13-1ubuntu1~16.04.0
Depends: python3, python3-botocore (>= 1.4.70), python3-colorama, python3-docutils, python3-rsa, python3-s3transfer, python3:any (>= 3.3.2-2~)
Installed-Size: 2.9 MB

Package: azure-cli
Version: 2.0.45-1~xenial
Installed-Size: 347 MB
Depends: libc6 (>= 2.17), libssl1.0.0 (>= 1.0.2~beta3)

It's also not just about the size; install speed matters too. It takes 50 seconds to install the single azure-cli package on an Azure machine, with its 39 000 files (dpkg-query -L azure-cli | wc -l: 39094), just a little less than installing the twentyish packages that comprise all of awscli's dependencies (which can be used by other software on the system).

yugangw-msft commented 6 years ago

CC @mayurid @troydai @johanste @lmazuel, this is a known issue. We made some improvements to the Windows installer, and the same kind of improvement can be done on Linux as well.

troydai commented 6 years ago

Thank you for the feedback. You made a great point. We will definitely look into this. I will keep this issue open and assign it to myself.

yugangw-msft commented 5 years ago

@marstr, could you please prioritize this work and get it done sometime in April?

marstr commented 5 years ago

I'm excited to take care of this :)

simonbrady commented 5 years ago

@marstr Thanks for picking this up, I've just been through another painfully slow upgrade (to 2.0.60 under WSL/Ubuntu) so I can assure you this fix will be welcomed!

benc-uk commented 5 years ago

Great that this is finally getting looked at!

Every time I update the CLI, I kiss goodbye to my machine for ~20 minutes. I dread updating it.
Bundling Python with the CLI seems like madness

Looking forward to these changes

marstr commented 5 years ago

20 minutes? Yikes. Which platform are you using @benc-uk? Maybe Ubuntu via WSL?

benc-uk commented 5 years ago

Yes. A very common configuration I find at customers and partners. First it's the actual update and then Windows Defender going absolutely crazy

I've had some snide comments from customers about the CLI in this regard

marstr commented 5 years ago

All the more reason to get this taken care of. Thanks for the honest feedback.

troydai commented 5 years ago

The WSL file system being slow when transferring a large number of files was the problem when I looked at it months ago. Windows Defender is more about the cold start time of the command.

benc-uk commented 5 years ago

It's a pipe dream at this point as I can see the amount of work that's gone into this Python version of the CLI, but...

A single executable (i.e. based on golang) would be ideal. I've been working with Kubernetes, Helm and Terraform. Their tooling is written in Golang and it's really nice just having this single static binary you can put anywhere

EDIT: the new AzCopy V10 has gone down this route

marstr commented 5 years ago

If you look into my GitHub profile, you'll see that I'm a huge fan of Go and have spent a non-trivial part of my career on/with it. I love Terraform and k8s, and am envious of their ease of distribution. TBH, there are a lot of benefits to Go, but with a project like ours where we need to support many many contributors across the company, asking people to learn Go is a much heavier lift than asking them to use/learn Python. Not to mention, I don't think this product is at burn-it-down-and-rewrite quality just yet ;)

benc-uk commented 5 years ago

I totally understand, that's why I prefixed my comment with "this is a pipe dream" 😄 Having done some Go myself it's quite the context switch from other languages

Hopefully some tidy up of the packaging is all that is required to get things speedy again

sptramer commented 5 years ago

@marstr Will slimming the .deb package finally remove the bundled Python interpreter and finally rely on the system Python? This will affect install instructions.

marstr commented 5 years ago

Yes! That's what I'm working on right now.

benc-uk commented 5 years ago

Any update? Just updated my CLI again under WSL and got the same maxed out machine for ~20 mins, strangely enough I wanted to use my machine for other things at the same time but it was churning too bad

marstr commented 5 years ago

This got sidetracked a bit (sorry, should have updated this thread). There was some debate internally about whether or not taking a dependency on Python instead of bundling our own would be sacrificing any stability. We've pushed past that now, by agreeing that this approach has been working fine for Homebrew. While that debate was happening, my focus shifted to other projects.

However, the good news is that I'm scheduled to work on this again soon!

marstr commented 5 years ago

The team's getting shuffled, and I'm moving on to a different team! However, I've met with @zikalino and showed him how important this issue is, and how much interest we should be taking in fixing this. I've reassigned this issue to him :)

ssbarnea commented 5 years ago

I ended up at this bug after I was surprised by how long it was taking to install azure-cli and also annoyed by the fact that it caused some conflicts with other tools (including the azure python sdk itself!). Apparently you cannot install azure and azure-cli without generating conflicts... something that can easily be prevented with a simple pip check at the end of the build process.

All the kudos for @akx for the way he highlighted the problem. That issue alone has the potential of becoming a meme about how not to package something.

krojzl commented 5 years ago

Since issue #9665 was closed and this issue was marked as its successor, is there any outlook on when the failing installation bug might be resolved?

lawrencegripper commented 5 years ago

Is this something that you'd be happy for someone to pickup and PR? Any guide on how complex the resolution is for someone who did one to pick this up?

krojzl commented 4 years ago

Any outlook on when the azure-cli installation issue (original item #9665) might get implemented?

tp199314 commented 4 years ago

@krojzl I just found the following comment in the old thread. I hope it's useful for you. https://github.com/Azure/azure-cli/issues/9665#issuecomment-559673532

joshtriplett commented 4 years ago

Following up on this. I'd love to see azure-cli packages that work with the system Python. Is this still being worked on?

bluca commented 4 years ago

Hello,

FYI, azure-cli (and the devops extension) are now available natively in Debian unstable (will be added to buster-backports as soon as policy allows) and the soon-to-be-released Ubuntu 20.04 LTS:

https://packages.ubuntu.com/focal/azure-cli https://packages.debian.org/sid/azure-cli

This has been done following policy and best practices. Each dependency is packaged separately, and the system's python installation is used rather than vendorizing everything including the interpreter and its dependencies.

yungezz commented 4 years ago

this is planned in Mn as lightweight installer.

Bessonov commented 3 years ago

Hi guys, is there any progress on it? I'm preparing an alpine builder image for ci and ran into this and the https://github.com/Azure/azure-cli/issues/13028 issue.

Size

The most important pain point is the size. The image grows to 1.7 GB. But even the official image is incredibly HUGE:

mcr.microsoft.com/azure-cli      2.25.0       249845e13ca3   11 days ago     1.04GB

I mean, c'mon, 1GB for a cli? It's two times bigger than windows xp image. I expect it to be around 5mb or less.

Time

Installing from pip takes around 10 minutes on localhost and includes compiling, wheels and so on. On CI it takes around 5 minutes because of better network. But still relates to the size.

My wish

Prepare appropriate executables. Don't force us to compile and install random things like python, pip, make etc. and fight them. It's fragile and painful. It's horrible for users of cli. Here is an example of how easy it can be: https://github.com/docker/compose/releases :

RUN curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
RUN chmod u+x /usr/local/bin/docker-compose

And it's still a python-based cli.

I'm not sure how it can grow to such size. Just imagine 1GB. How many lines of code it is?

// offtopic Why go? Why not rust? Why not typescript? ...why not still python, but better?
// offtopic2 I can't imagine how people can love go... or dart...

jiasli commented 3 years ago

The size of Azure CLI is due to the hugeness of underlying Azure Python SDKs (https://github.com/Azure/azure-sdk-for-python/issues/11149). If you install from pip or official packages, you will install all underlying Python SDKs that include all API versions, even though most of them are historical versions and unnecessary.

There was indeed an attempt to remove unnecessary API versions (#17816), but it only removes unused API versions from azure-mgmt-network SDK for Windows installer (MSI).

We are working with Azure Python SDK team to see how we can reduce the size of Azure Python SDKs, so that Azure CLI can lose some weight.
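
To make the point about historical API versions concrete, here is a minimal sketch that lists the version-named subfolders of one installed SDK package and flags all but the newest. It is not the approach taken in #17816. It assumes versioned folders are named like `v2020_06_01` or `v2019_05_01_preview`, and the helper names `api_version_dirs` and `older_versions` are hypothetical. Treating everything except the newest folder as unused is a simplification, since the CLI may still reference older versions.

```python
import re
from pathlib import Path

# Assumed naming pattern for versioned API folders,
# e.g. v2020_06_01 or v2019_05_01_preview.
API_VERSION_DIR = re.compile(r"^v\d{4}_\d{2}_\d{2}(_preview)?$")


def api_version_dirs(package_dir):
    """Return the versioned API subfolders of one SDK package, oldest first."""
    root = Path(package_dir)
    return sorted(
        (p for p in root.iterdir() if p.is_dir() and API_VERSION_DIR.match(p.name)),
        key=lambda p: p.name,  # date-based names sort chronologically
    )


def older_versions(package_dir):
    """Return every versioned subfolder except the newest one (hypothetical helper)."""
    return api_version_dirs(package_dir)[:-1]
```

The sketch only reports candidates and deletes nothing; whether a given folder is safe to remove depends on what the installed CLI modules actually import.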

nolaexe commented 3 years ago

Hey, is there any update about the issue?

illgitthat commented 3 years ago

@jiasli is this still on the high priority list? I didn't see this on that board anymore.

Issue is now three years old so it would be great to have an update on the current timeline.

jiasli commented 3 years ago

The azure-cli package itself is pretty small, if you check it on PyPI: https://pypi.org/project/azure-cli/#files

image

As provided in https://github.com/Azure/azure-cli/issues/7387#issuecomment-879529560, the size of Azure CLI is caused by the dependency on Azure Python SDKs. We will consider trimming the SDKs and removing unused API versions, but this is not one of the top-priority tasks.

illgitthat commented 3 years ago

The azure-cli package itself is pretty small, if you check it on PyPI: https://pypi.org/project/azure-cli/#files As provided in #7387 (comment), the size of Azure CLI is caused by the dependency on Azure Python SDKs. We will consider trimming the SDKs and removing unused API versions, but this is not one of the top-priority tasks.

Great, thanks for the response!

jiasli commented 3 years ago

I checked with the ncdu tool on Ubuntu: the azure-mgmt-xxx SDKs now take 613.7MB, while azure-cli only takes 23.9MB:

image

Amongst them, azure-mgmt-network is still the largest:

image

Could you vote on https://github.com/Azure/azure-sdk-for-python/issues/11149 so that Azure SDK team can get more attention? Thanks.
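
For readers without ncdu, a rough equivalent of the breakdown above can be sketched in a few lines of Python. This sums apparent file sizes per immediate subdirectory, which may differ somewhat from the disk usage ncdu reports; the function names are hypothetical.

```python
import os


def dir_size(path):
    """Sum the sizes, in bytes, of all regular files under path."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):  # skip symlinks to avoid double counting
                total += os.path.getsize(full)
    return total


def largest_subdirs(root, top=5):
    """Return (name, bytes) pairs for the largest immediate subdirectories of root."""
    sizes = [
        (entry.name, dir_size(entry.path))
        for entry in os.scandir(root)
        if entry.is_dir(follow_symlinks=False)
    ]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]
```

Pointing `largest_subdirs` at the `azure` folder inside your own site-packages directory should show which SDK packages dominate the install.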

akx commented 3 years ago

@jiasli

The azure-cli package itself is pretty small, if you check it on PyPI: https://pypi.org/project/azure-cli/#files

The point that that package itself is small is still moot (and somewhat disingenuous) if you can't properly install a "modular" azure-cli.

You can botch together something like

$ pip install --no-dependencies azure-cli
$ pip install azure-cli-core msrestazure packaging

to get a skeleton of Azure CLI (at the small, small size of 74 megabytes of installed data) that throws warnings and errors but technically starts, and then hope to start installing some azure-mgmt-* modules to get to a working version, but that's surely not how things are supposed to be?
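
To see what `--no-dependencies` skipped, the requirements a distribution declares can be read from its installed metadata. The sketch below uses the standard-library `importlib.metadata` module (Python 3.8+); the wrapper name `declared_requirements` is hypothetical.

```python
from importlib import metadata


def declared_requirements(dist_name):
    """Return the requirement strings a distribution declares, or [] if it is not installed."""
    try:
        # metadata.requires() returns None when no requirements are declared.
        return list(metadata.requires(dist_name) or [])
    except metadata.PackageNotFoundError:
        return []
```

Calling `declared_requirements("azure-cli")` in an environment where the package is installed lists the azure-mgmt-* and other packages that pip would normally pull in.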

sodul commented 3 years ago

We have written a script that deletes the unreferenced older APIs. It does delete a good chunk of the API folders but not all of it. With the script the Azure directory is now just under 300MB instead of over 700MB. It is compatible with most, but not all, third party packages, as long as they do not point to a version that is trimmed.

https://github.com/clumio-code/azure-sdk-trim

jarombouts commented 3 years ago

Over 3 years since this ticket was opened. And yet every time I need the azure command line tools in my CI/CD pipeline, I'm spending a gigabyte and more than a minute of build time on this monstrosity. Some way of selectively installing the bits you need would be nice...

The following NEW packages will be installed: azure-cli ... After this operation, 1031 MB of additional disk space will be used.

mikeball commented 2 years ago

I have been working on setting up ci/cd which needs only to push docker image to azure container registry and restart an app... and really surprised to see 1.5GB+ required for the cli for such minor tasks. Perhaps one can use some REST api directly and build some really small utility for specific tasks but not sure how to do that?

jiasli commented 2 years ago

Perhaps one can use some REST api directly and build some really small utility for specific tasks but not sure how to do that?

This (use some REST api directly) is exactly what we are investigating in the Microsoft Graph migration (https://github.com/Azure/azure-cli/issues/12946). If it turns out to be working well, we may incorporate this method in all future command modules or code-gen'ed modules.
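
As a rough illustration of the "use some REST api directly" idea, the sketch below builds (but does not send) a request that lists resource groups through the Azure Resource Manager endpoint. It is not how the CLI or the Graph migration is implemented. The function name is hypothetical, the default `api_version` is an assumption, and obtaining the bearer token is out of scope here.

```python
import urllib.request

ARM_ENDPOINT = "https://management.azure.com"


def build_list_resource_groups_request(subscription_id, access_token, api_version="2021-04-01"):
    """Build a GET request listing the resource groups in a subscription.

    The token must be obtained separately; api_version is an assumed default.
    """
    url = (
        f"{ARM_ENDPOINT}/subscriptions/{subscription_id}"
        f"/resourcegroups?api-version={api_version}"
    )
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call using only the standard library, which is the appeal for small single-purpose utilities.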

ringerc commented 2 years ago

I was stunned to find that my diagnostic Docker containers grew by over 1GB when I installed the Azure CLI. This is ... astonishing.

deyvsh commented 2 years ago

Appreciate the work going on to slim down the install! In case it's of interest, our use case is to run it inside AWS CloudShell, which has a home dir (the only writable bit) size limit of 1GB, and we only need enough of the CLI to support az login.

benc-uk commented 2 years ago

I don't think there's any way out of this now, beyond a full rewrite in Go, Rust or Dotnet etc, something that can compile to a single binary

Each week that passes the CLI gets larger and larger, and the chances of a rewrite smaller and smaller

Where does it stop? When the CLI takes up 2GB? 5GB? 50GB?

sodul commented 2 years ago

@benc-uk Have you tried https://github.com/clumio-code/azure-sdk-trim ?

I wrote this a while back to delete older API directories that are obviously superfluous in 99% of cases and the site-packages/azure directory goes from 846.6 MB to 348.1 MB for us. That's not perfect, it is still ridiculously large, but until the Azure developers actually do something about this problem it does help quite a bit.

If you call it from a Dockerfile make sure to have it in the same RUN call as when the Azure cli/sdk is installed or you will still get the overhead in the layers.

jiasli commented 2 years ago

@benc-uk, If you do an ncdu, you will find Azure CLI itself (/cli, 28.6 MB) is pretty small:

image

It is actually the Azure Python SDK (/mgmt, 676.7 MB) that makes Azure CLI huge. See https://github.com/Azure/azure-cli/issues/7387#issuecomment-879529560 for the explanation.

Rewriting in Go, Rust, .NET won't give too much benefit as those Azure SDKs are equally huge.

benc-uk commented 2 years ago

Agreed, but with a compiled language you ship a binary, you don't ship the SDK with it!

If I wrote a Go app that used the Go SDK for Azure, used a few functions from the SDK, and compiled it - it would be a completely standalone executable and only a few megabytes

usrme commented 1 year ago

I still want to see a version of Azure CLI that can be composed of different components more easily, but I've written a blog post that can hopefully help some people out: https://usrme.xyz/posts/how-to-trim-a-container-image-that-includes-azure-cli/. Using the methods described there I was able to slim an image from 1.17GB to 307MB!

psadi commented 1 year ago

I was trying to find a cli mainly for azure devops. Installing the cli literally (via pip) shocked me. My usage is just to manage repositories & pull requests.

I understand the installer itself is pretty small, but having 1G of dependencies is not ideal, at least for my use case

image

Isn't there a way to decouple modules and use them independently? The current az cli is kind of overkill for my workflow.

ivanechegaray commented 1 year ago

Some people are confused about, or don't understand, the problem. The CLI alone may weigh only a few megabytes, but the bad design of the entire CLI, not only the core part, causes it to have more than 1.5GB of dependencies, and it continues to grow, as commented above. For automated processes that use disposable computing where you can't cache images, downloading the 1.5GB is not optimal.

Moreover, I see with great surprise that some say that by changing to Go we would gain practically nothing regarding the size of the CLI; surely they are the ones who love and support the current development, which is a disaster.

mpender commented 1 year ago

Quite true on the actual issue of dependency hell; just looking at the pip install execution logs makes one's eyes water in disbelief. Looking at other major cloud providers, I can see their equivalent cli tools weigh in differently:
AWS ~ 210 MB
GCP ~ 822 MB

Does it add more confusion to request a 'core' cli that is smaller but handles common activities (VMs, Blobs, AD, etc.), or is that just masking the fundamental issue? I think I would nearly prefer to dynamically add extensions for whatever I need rather than pulling down everything.

bebound commented 1 year ago

I've created this pr to fix the problem. https://github.com/Azure/azure-cli/pull/25801

result:
Ubuntu 22.04 installed size 1,196 MB -> 322 MB
Docker image size 1,300 MB -> 706 MB
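
Expressed as percentages, the figures above work out to roughly a 73% smaller installed size on Ubuntu 22.04 and a 46% smaller Docker image. A small sketch of the arithmetic (the helper name is hypothetical):

```python
def percent_reduction(before_mb, after_mb):
    """Return the size reduction as a percentage of the original size."""
    return (before_mb - after_mb) / before_mb * 100


# Before/after sizes in MB, taken from the comment above.
for label, before, after in [
    ("Ubuntu 22.04 installed size", 1196, 322),
    ("Docker image size", 1300, 706),
]:
    print(f"{label}: {percent_reduction(before, after):.1f}% smaller")
```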