hashicorp / terraform-cdk

Define infrastructure resources using programming constructs and provision them using HashiCorp Terraform
https://www.terraform.io/cdktf
Mozilla Public License 2.0
4.79k stars 442 forks

Allow partial PyPi installs for cdktf-cdktf-provider-aws #2835

Open vladmiller opened 1 year ago

vladmiller commented 1 year ago

Description

It takes a while for Python to load the entire cdktf-cdktf-provider-aws bundle, and installing the provider also takes quite a bit of time because of its size.

It would be very useful if it were possible to install only the APIs one requires in a project, similar to what is possible with moto:

pip install 'moto[ec2,s3,all]'

For example:

pip install 'cdktf-cdktf-provider-aws[lambda,s3]'

That would reduce the overall bandwidth requirements and speed up unit testing of custom cdktf constructs as well as the synth process.

ansgarm commented 1 year ago

Hi @vladmiller 👋

thank you for raising this – I think this would be something that would need some upstream work in JSII.

That said, we did recently fix the size of the pre-built packages – e.g. the AWS Python package should be 50% smaller now – maybe you can already see an improvement there?

vladmiller commented 1 year ago

Thank you, @ansgarm, for the feedback. I will try and see how it goes. Maybe there is a way to provide an alternative entry point that lazy-loads only requested libraries?
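For illustration, here is a minimal sketch of what such a lazy-loading entry point could look like, using the module-level `__getattr__` hook from PEP 562. This is not how the provider package is generated today. The helper `make_lazy_package` and the name `fake_provider` are hypothetical, and the standard-library module `colorsys` stands in for a real service submodule so the example runs without cdktf installed.

```python
import importlib
import types


def make_lazy_package(name, submodules):
    """Build a module object that imports its submodules on first attribute access.

    `submodules` maps an attribute name to the import path of the real module.
    In a real package the `__getattr__` function below would simply be defined
    at the top level of the package's __init__.py (PEP 562).
    """
    pkg = types.ModuleType(name)

    def __getattr__(attr):
        if attr in submodules:
            module = importlib.import_module(submodules[attr])
            setattr(pkg, attr, module)  # cache so the hook is not hit again
            return module
        raise AttributeError(f"module {name!r} has no attribute {attr!r}")

    # Module attribute lookup falls back to a `__getattr__` found in the
    # module's own namespace when normal lookup fails.
    pkg.__getattr__ = __getattr__
    return pkg


# Hypothetical stand-in: "colors" plays the role of one service submodule.
lazy = make_lazy_package("fake_provider", {"colors": "colorsys"})
loaded_before = "colors" in vars(lazy)  # nothing imported yet
colors = lazy.colors                    # triggers the import
loaded_after = "colors" in vars(lazy)   # now cached on the package
```

With this pattern, nothing is imported until an attribute is first accessed, so a project that touches two services pays the import cost for only those two. Whether jsii's type registration would still work under such a scheme is an open question here.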

rirze commented 1 year ago

The main issue is that the provider library loads all service modules in the root __init__.py file. This is unnecessary as documentation implies loading CDKTF objects via their object path rather than at the root level: from cdktf_cdktf_provider_aws.instance import Instance instead of from cdktf_cdktf_provider_aws import Instance

Simply deleting the root __init__.py shaves 50% off the loading time. Hope that helps.
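For anyone who wants to reproduce a measurement like this, a rough sketch follows. The helper `time_import` is hypothetical, and the number is only meaningful if the module has not already been imported in the current process. The standard-library `json` module is used as a stand-in so the snippet runs without the provider installed.

```python
import importlib
import time


def time_import(module_name):
    """Return the wall-clock seconds spent importing `module_name`.

    If the module is already cached in sys.modules, the import is nearly
    free, so run this in a fresh interpreter for a meaningful number.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


# Stand-in module; swap in "cdktf_cdktf_provider_aws.instance" where the
# provider is installed.
elapsed = time_import("json")
print(f"import took {elapsed:.3f}s")
```

CPython's built-in `python -X importtime -c "import some_module"` gives a per-module breakdown, which can help show how much of the time is spent in the root package versus the submodule actually requested.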

jsteinich commented 1 year ago

The main issue is that the provider library loads all service modules in the root __init__.py file. This is unnecessary as documentation implies loading CDKTF objects via their object path rather than at the root level:

The loading of all modules was added via https://github.com/aws/jsii/pull/3049. It is done to ensure types not directly referenced are loaded correctly. Perhaps an option could be added to not do this as it is currently less relevant for cdktf. Alternatively, a smarter lazy loading solution may be possible.

rirze commented 1 year ago

Just curious, why not use the common typing conditional?:

import typing
if typing.TYPE_CHECKING:
    # load all 200 aws services for 30+ seconds
    ...

That way, synth executions take way less time.

jsteinich commented 1 year ago

Just curious, why not use the common typing conditional?:

The typing is used at runtime to look up how to bridge between the internal JSII runtime and the Python application.
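The following is a small analogy, not jsii's actual mechanism (which is not shown in this thread), for why a `TYPE_CHECKING` guard breaks code that needs type information at runtime. It assumes a consumer that resolves annotations with `typing.get_type_hints`. The functions `price`, `label`, and `hints_resolvable` are hypothetical, and `decimal.Decimal` stands in for a provider type.

```python
import typing

if typing.TYPE_CHECKING:
    # Seen only by static type checkers; never executed at runtime.
    from decimal import Decimal


def price(amount: "Decimal") -> str:
    """Annotated with a type that is imported only under TYPE_CHECKING."""
    return str(amount)


def label(count: int) -> str:
    """Annotated with types that exist at runtime."""
    return f"{count} items"


def hints_resolvable(func) -> bool:
    """Report whether the annotations of `func` can be resolved at runtime."""
    try:
        typing.get_type_hints(func)
        return True
    except NameError:
        # The annotation names something that was never actually imported.
        return False


price_ok = hints_resolvable(price)
label_ok = hints_resolvable(label)
```

Here `label` resolves but `price` does not, because `Decimal` never exists in the module's namespace at runtime. Any runtime that depends on the types being loaded would hit the same kind of gap if the imports were moved under the guard.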