Azure / azure-storage-python

Microsoft Azure Storage Library for Python
https://azure-storage.readthedocs.io
MIT License

Azure Storage Python Introduces Viral LGPL Dependencies #514

Closed by illfang 5 years ago

illfang commented 5 years ago

Which service(blob, file, queue) does this issue concern?

blob

Which version of the SDK was used? Please provide the output of pip freeze.

$ pipenv run pip install azure-storage-blob
...
$ pipenv run pip freeze
asn1crypto==0.24.0
azure-common==1.1.16
azure-nspkg==3.0.2
azure-storage-blob==1.3.1
azure-storage-common==1.3.0
azure-storage-nspkg==3.0.0
certifi==2018.10.15
cffi==1.11.5
chardet==3.0.4
cryptography==2.3.1
enum34==1.1.6
futures==3.2.0
idna==2.7
ipaddress==1.0.22
pycparser==2.19
python-dateutil==2.7.3
requests==2.19.1
six==1.11.0
urllib3==1.23

What problem was encountered?

Installing the azure-storage-blob library introduces a dependency on an LGPL library, which infects any solution using the Azure Storage Library for Python. Furthermore, nothing in the Azure Storage Library documentation states that it relies on LGPL-licensed software, which is an important aspect that should not be hidden.

Details:

https://github.com/Azure/azure-storage-python/blob/master/requirements.txt#L3

requests>=2.9.2

https://github.com/requests/requests/blob/master/setup.py#L51

'chardet>=3.0.2,<3.1.0',

To summarize this dependency chain:

Azure Storage Python -> Requests >= 2.9.2 -> chardet>=3.0.2,<3.1.0

Chardet License: https://github.com/chardet/chardet/blob/master/LICENSE

          GNU LESSER GENERAL PUBLIC LICENSE
               Version 2.1, February 1999
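The dependency chain above can be checked against a local environment. The following is a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`) and that the packages are already installed; the helper names `requirement_name`, `license_strings`, `walk`, and `copyleft_packages` are made up for illustration and are not part of any library mentioned here. It walks the installed requirements of a root package and reports every package whose license metadata mentions "GPL".

```python
import re
from importlib import metadata


def requirement_name(req):
    """Return the bare distribution name from a requirement string."""
    return re.split(r"[\s;<>=!~\[(]", req.strip(), maxsplit=1)[0]


def license_strings(dist_name):
    """Collect the License field and license classifiers of an installed package."""
    try:
        meta = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return []
    found = [meta.get("License") or ""]
    found += [c for c in (meta.get_all("Classifier") or [])
              if c.startswith("License ::")]
    return [s for s in found if s]


def walk(root, seen=None):
    """Return the lowercase names of root and its installed transitive requirements."""
    seen = set() if seen is None else seen
    key = root.lower()
    if key in seen:
        return seen
    try:
        reqs = metadata.requires(root) or []
    except metadata.PackageNotFoundError:
        return seen  # not installed, nothing to inspect
    seen.add(key)
    for req in reqs:
        if "extra ==" in req:
            continue  # skip optional extras
        walk(requirement_name(req), seen)
    return seen


def copyleft_packages(root):
    """Map each package under root to the license strings that mention GPL."""
    flagged = {}
    for name in sorted(walk(root)):
        hits = [s for s in license_strings(name) if "GPL" in s.upper()]
        if hits:
            flagged[name] = hits
    return flagged


if __name__ == "__main__":
    for name, hits in copyleft_packages("azure-storage-blob").items():
        print(name, hits)
```

The substring match on "GPL" is deliberately crude: it catches LGPL, GPL, and AGPL alike, and it depends on package authors filling in their metadata accurately, so the result is a starting point for a manual review rather than a definitive audit.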

Have you found a mitigation/solution?

Mocking away chardet, installing azure-storage-blob without its dependencies, and adding the required dependencies by hand. Not a good solution.
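For readers unfamiliar with the "mocking away" approach, here is a minimal sketch of what it could look like. It is an illustration under assumptions, not the reporter's actual code: it assumes that requests only needs `chardet.__version__` and `chardet.detect`, which has not been verified against every requests release, and the function name `install_chardet_stub` is hypothetical. The default version string is taken from the pip freeze output above.

```python
import sys
import types


def install_chardet_stub(default_encoding="utf-8", version="3.0.4"):
    """Register a stand-in 'chardet' module so that 'import chardet' succeeds.

    Assumption: callers only read __version__ and call detect().
    """
    stub = types.ModuleType("chardet")
    stub.__version__ = version

    def detect(byte_str):
        # No real detection: always report the configured default encoding.
        return {"encoding": default_encoding, "confidence": 0.0, "language": ""}

    stub.detect = detect
    sys.modules["chardet"] = stub  # must run before anything imports chardet
    return stub
```

The stub has to be installed before requests is imported for the first time, and responses without a declared charset are then always decoded with the default encoding instead of a detected one. That fragility is part of why this workaround is not a good solution.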

zezha-msft commented 5 years ago

Hi @illfang, thanks for reaching out!

I see that chardet is indeed an indirect dependency of our SDK, through the requests package, which is probably one of the most popular Python libraries.

Please forgive my lack of knowledge about LGPL, but one description of the license says:

This license mainly applies to libraries. You may copy, distribute and modify the software provided that you state modifications and license them under LGPL-2.1. Anything statically linked to the library can only be redistributed under LGPL, but applications that use the library don't have to be. You must allow reverse engineering of your application as necessary to debug and relink the library.

I do not see a problem, since technically we are neither distributing nor modifying the chardet package; the users of our SDK are installing that dependency directly from PyPI. In addition, if you are writing an application, this license also doesn't seem to affect you.

This is really outside my area of expertise, so if you are sure that there is a problem, I can definitely escalate this issue to our legal department to double-check. Would you like me to do that?

illfang commented 5 years ago

@zezha-msft thank you for your fast response.

I believe you are correct that there is no problem for the SDK itself, as the SDK neither distributes nor modifies chardet. The requests library maintainers use the same argument to explain why their usage of chardet in the Apache-licensed requests library is allowed. https://github.com/requests/requests/issues/3389#issuecomment-396642172

I'm not a lawyer, but in my understanding the problem starts for users of the Azure SDK (or of the requests library in general).

Imagine a company includes the SDK in their software, which is then redistributed to their clients as a bundled application. In that case they are also redistributing chardet, which would then fall under the LGPL license. (e.g. https://github.com/requests/requests/issues/3389#issuecomment-279252910 )

As said previously, I'm not a lawyer. This is just my understanding of the LGPL.

zezha-msft commented 5 years ago

Hi @illfang, I understand your concern.

I will raise your issue with our legal team and get back to you as soon as possible.

seguler commented 5 years ago

We're still investigating. However, looking at the terms at https://github.com/chardet/chardet/blob/master/LICENSE, I see that redistribution is possible as long as you keep your source open, do not make any modifications, and disclose chardet's license with your application. If you make modifications, you MUST also license your application under LGPL. I think this is good for most scenarios.

Could you tell us a little about your scenario? Do you need to make your application closed source and redistribute it with the chardet dependency files included?

illfang commented 5 years ago

Thank you for your investigation. In our scenario, we are thinking of a closed-source client application that interacts with Azure Blob Storage and is distributed to customers as a bundle.

zezha-msft commented 5 years ago

Hi @illfang, here is an update:

Our legal team has asserted that the Python Storage SDK is compliant with its own MIT license and the LGPL license of chardet, since we neither modify nor distribute the chardet package, which is an indirect dependency.

Unfortunately, we cannot provide legal advice on such usage by you. You need to consult your own attorneys. You'll need to decide whether to use chardet based on the licensing model and distribution method that you are using.

illfang commented 5 years ago

@zezha-msft I can understand your point of view on this issue. However, to be fair to the users of the storage library, you should consider adding a NOTICE file pointing out that this MIT-licensed library will introduce LGPL-licensed libraries as well.

zezha-msft commented 5 years ago

@illfang thanks for your understanding! And thanks for your advice. I'll sync up with our legal team to see if it's appropriate to put a notice file in the repo. 👍