koxudaxi / py-data-api

A user-friendly client for AWS Aurora Serverless's Data API
https://koxudaxi.github.io/py-data-api
MIT License

Reduce the size of dependency packages #33

Open koxudaxi opened 5 years ago

koxudaxi commented 5 years ago

This issue was created to research and reduce the size of the dependency packages. @Rubyj said these packages take up a lot of space. This may push a Lambda deployment package over the 50 MB limit.

Package Size

$ pip install pydataapi 
$ du -d 1 -m venv/lib/python3.7/site-packages/  |sort -n |tail -n7
2   venv/lib/python3.7/site-packages/pip
3   venv/lib/python3.7/site-packages/setuptools
4   venv/lib/python3.7/site-packages/docutils
10  venv/lib/python3.7/site-packages/sqlalchemy
19  venv/lib/python3.7/site-packages/pydantic
43  venv/lib/python3.7/site-packages/botocore
84  venv/lib/python3.7/site-packages/
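
For anyone who wants to repeat this measurement without du, here is a minimal Python sketch. The function names are made up for illustration, and it sums file sizes rather than disk blocks, so the numbers can differ slightly from the du output above.

```python
from pathlib import Path
import sysconfig


def dir_size_bytes(path: Path) -> int:
    """Sum the sizes of all regular files below `path`."""
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())


def largest_packages(site_packages: Path, top: int = 7):
    """Return (size_in_bytes, name) for the `top` largest sub-directories."""
    sizes = [
        (dir_size_bytes(p), p.name)
        for p in site_packages.iterdir()
        if p.is_dir()
    ]
    return sorted(sizes, reverse=True)[:top]


if __name__ == "__main__":
    # "purelib" is the site-packages directory of the running interpreter.
    site_packages = Path(sysconfig.get_paths()["purelib"])
    for size, name in largest_packages(site_packages):
        print(f"{size / 1e6:8.1f} MB  {name}")
```

Run it with the interpreter of the virtualenv you want to inspect.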
Rubyj commented 5 years ago

I think we could maybe remove this line: https://github.com/koxudaxi/py-data-api/blob/a52bd216a2eb69583d9b05860009875d120ba7c0/setup.cfg#L28

In the Lambda environment boto3 already exists, so reinstalling it wastes space.

koxudaxi commented 4 years ago

@Rubyj Hmm, I'm worried about removing the dependency.

  1. The user needs to install boto3 when running outside Lambda.
  2. We usually want to control the boto3 version to keep behavior consistent.

But I think it's a good starting point. I'd like to build on your idea. We can add boto3 as an optional extra.

$ cat requirements.txt
 pydataapi[boto3]
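
If boto3 became an optional extra, the library would probably need to fail with a clear message when it is missing. A rough sketch of such a guard follows; require_module is a hypothetical helper, not something that exists in pydataapi, and the pydataapi[boto3] extra is only the proposal above.

```python
import importlib.util


def require_module(name: str, extra: str) -> None:
    """Raise a helpful ImportError if the top-level module `name` is missing.

    `extra` is the (hypothetical) extras name to suggest in the message.
    """
    if importlib.util.find_spec(name) is None:
        raise ImportError(
            f"{name} is not installed. "
            f"Install it with: pip install 'pydataapi[{extra}]'"
        )


if __name__ == "__main__":
    # Would be called once before the client first touches boto3.
    require_module("boto3", extra="boto3")
```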

Otherwise, the better practice on Lambda is to use a Lambda layer to save space in the Lambda package. https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html

Also, we could distribute a Lambda layer for everyone, like this: https://github.com/DataDog/datadog-lambda-layer-python We must check the cost before distributing it :sweat:

What do you think?

Rubyj commented 4 years ago

I think your thoughts make sense. The best solution is probably to use a layer for big packages that can be shared across functions.

I can probably add pandas in a zip file to a layer and use that.

Rubyj commented 4 years ago

Maybe we could provide a way for users to install pydataapi without boto3? That way we install with boto3 for local development, but when deploying to Lambda, chalice can install without boto3?

I was thinking some sort of flag in the requirements file. Maybe:

requirements.txt:
 pydataapi --no-deps

Or something like that.

koxudaxi commented 4 years ago

@Rubyj Sorry, I couldn't try it yesterday. I'll look for an option to disable dependencies today :eyes:

koxudaxi commented 4 years ago

@Rubyj

requirements.txt doesn't accept --no-deps as an option.

Let's support extras for choosing dependencies in requirements.txt, Pipfile, etc. https://setuptools.readthedocs.io/en/latest/setuptools.html#declaring-extras-optional-features-with-their-own-dependencies We can choose dependencies via extras in setup.py.

I have some ideas.

  1. The default is to include all dependencies. If we want to remove boto3 from the dependencies, we can pass without-boto3 or no-deps as an extra. But I don't think these are good option names :thinking: Do you know better names? Or how do you feel about these options? Also, we must install sqlalchemy and pydantic manually when we set no-deps.

  2. The default is only pydantic, which is a standard dependency. We can set boto3, sqlalchemy, or all as an extra. This may confuse users who don't know why boto3 is not installed by default. We must write documentation to make this clear to everyone.
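
As a rough sketch of idea 2, the extras could be declared like this and passed to setuptools as extras_require. The extras names come from the idea above; the unpinned requirement strings are an assumption, and the real project would probably pin versions.

```python
# Sketch of the extras table for idea 2 (pydantic stays in install_requires).
# The requirement strings are unpinned placeholders, not the project's pins.
EXTRAS_REQUIRE = {
    "boto3": ["boto3"],
    "sqlalchemy": ["SQLAlchemy"],
}

# "all" is the union of every other extra, de-duplicated and sorted.
EXTRAS_REQUIRE["all"] = sorted(
    {req for reqs in EXTRAS_REQUIRE.values() for req in reqs}
)

# In setup.py this would be passed along the lines of:
#   setup(..., install_requires=["pydantic"], extras_require=EXTRAS_REQUIRE)

if __name__ == "__main__":
    for extra, reqs in EXTRAS_REQUIRE.items():
        print(f"pydataapi[{extra}] -> {', '.join(reqs)}")
```

With that table, `pip install pydataapi[all]` would behave like today's default install, and a plain `pip install pydataapi` would skip boto3 and SQLAlchemy.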

What do you think?

@Baruch4413 If you have any thoughts, would you please write them here?

Rubyj commented 4 years ago

Maybe we can pass something like no-boto3? I am ok with installing all dependencies myself, but yes I think you are right and that might confuse people.

koxudaxi commented 4 years ago

Maybe we can pass something like no-boto3?

I tried to pass no-boto3, but it's too difficult to exclude boto3 from the requirements. extras_require doesn't support default values :man_facepalming: That means boto3 is not installed unless an extra is given.

koxudaxi commented 4 years ago

@Rubyj I thought about removing boto3 from the dependencies today. It doesn't seem like a good idea.

pydataapi needs the correct version of boto3. But the boto3 version in the Lambda environment is not fixed. It would be a serious problem if the boto3 version is not the one we expect.
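
To illustrate the concern, this is roughly the check a user would have to run if they relied on whatever boto3 the runtime provides. It is only a sketch: the minimum version passed in is a placeholder, not the version pydataapi actually requires, the parser only handles plain dotted numeric versions, and importlib.metadata needs Python 3.8 or later.

```python
from importlib.metadata import PackageNotFoundError, version


def version_tuple(v: str) -> tuple:
    """Parse a plain dotted version such as '1.9.42' into (1, 9, 42)."""
    return tuple(int(part) for part in v.split("."))


def is_supported(installed: str, minimum: str) -> bool:
    """True if `installed` is at least `minimum` (numeric comparison)."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_boto3(minimum: str) -> str:
    """Return the installed boto3 version, or raise if missing or too old."""
    try:
        installed = version("boto3")
    except PackageNotFoundError:
        raise RuntimeError("boto3 is not installed in this environment")
    if not is_supported(installed, minimum):
        raise RuntimeError(f"boto3 {installed} is older than {minimum}")
    return installed
```

Comparing tuples instead of strings matters here, because "1.10.0" sorts before "1.9.5" as text.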

We may have to give up on removing boto3 from the dependencies. Instead, we should publish and use layers.

Rubyj commented 4 years ago

I understand. I will use layers!