pulumi / pulumi-aws

An Amazon Web Services (AWS) Pulumi resource package, providing multi-language access to AWS
Apache License 2.0
464 stars 155 forks source link

Invalid references in Pulumi package schema #2565

Closed DSoko2 closed 1 week ago

DSoko2 commented 1 year ago

@davidspielmann and I recently started looking deeper into Pulumi package schemas and bumped into some issues with named types/references in the schema of the aws classic provider. Talking about it at PulumiUP, @mnlumi encouraged us to share our insight. Perhaps it can help to get some of the references fixed with the upcoming 6.0 release :)

Background: Pulumi package schemas include "Named Types", which are references (a URI string) to a type specification. Our understanding of how these URIs are resolved from the $ref field is https://github.com/pulumi/pulumi/blob/master/pkg/codegen/schema/schema.go#L1409-L1416 Valid references are the built-in types pulumi.json#/Archive, pulumi.json#/Asset, pulumi.json#/Any, and pulumi.json#/Json. All other references have the format [origin]#/{resources,types}/[urn] where [origin] is the file path or URL of a schema file (empty for in-document references) and [urn] the identifier of the respective type definition under "resources" or "types" in that schema.

Expectation: All references in a package schema can be resolved. However, 168 references (25 URIs) in the schema of aws@5.41.0 cannot be resolved as defined above. These are the non-resolvable URIs we found; some of them are also noted in other GH issues (referenced).

These 77 references (15 URIs) cannot be resolved but are valid resources, i.e., they could be resolved when prefixing with #/resources/:

These 21 references (5 URIs) cannot be resolved but are valid types if treated case-insensitive, i.e., they could be resolved when ignoring the case and prefixing with #/types/:

These 70 references (5 URIs) cannot be resolved and we did not find a way to fix it:

This is the Python notebook we used for this analysis. It also applies the analyis to aws-native@0.65.0, akamai@4.4.0, alicloud@3.38.0, azure@5.44.0, azure-native@1.103.0, azure-native@v2.0.0-beta.1, cloudflare@5.3.0, gcp@6.58.0, google-native@0.30.0, hcloud@1.12.1, kubernetes@3.29.1, random@4.13.2. For most of them, all references are fine, with minimal exceptions (which, ideally, also would be fixed). https://gist.github.com/DSoko2/4630a14c97a9469385dceed0ab07bc1f

We hope this helps to improve this provider's schema, i.e, eliminate workarounds and non-resolvable URIs. If you should have questions or we can help somehow, please let us know.

mikhailshilkov commented 1 year ago

@DSoko2 Thanks a lot for this comprehensive analysis! It sounds like we should start from the bottom and figure out the entirely missing references.

For my understanding, how blocked are you on this? Are you able to ignore them and proceed with other tasks, or is there anything we can do to unblock you quickly?

DSoko2 commented 1 year ago

For now, we are not blocked because we can hack around it with the knowledge we got through the analysis.

Still, I think fixing these issues is important: The longer these inconsistencies consist, the more workarounds people in the ecosystem may implement. It also may question the reliability of the schemas in general given the popularity of this provider.

It sounds like we should start from the bottom and figure out the entirely missing references.

Yes, this would certainly be valuable! However, fixing the other ones (First two lists in the initial post with workarounds) seems to me like it could be done rather soon and would prevent people like us from adding hacky resolution workarounds to their tools. Certainly, fixing them could still impact schema consuming implementations that solely implemented the workaround but I would argue that the default, documented way of resolving URIs should have priority. Eventually, consuming implementations with workarounds that also support the documented URI resolution shouldn't be impacted.

mnlumi commented 1 year ago

@DSoko2 Thank you so much for sharing your findings and opening this issue! Much appreciated!

t0yv0 commented 2 months ago

Quick update, I'm checking up on how we are doing here in the 6x series of AWS. I can confirm that the following references still do not resolve to entries under .types.

    schema_test.go:38: Dangling reference: aws:alb/ipAddressType:IpAddressType (isRes=false)
    schema_test.go:38: Dangling reference: aws:alb/loadBalancerType:LoadBalancerType (isRes=false)
    schema_test.go:38: Dangling reference: aws:apigateway/deployment:Deployment (isRes=true)
    schema_test.go:38: Dangling reference: aws:apigateway/restApi:RestApi (isRes=true)
    schema_test.go:38: Dangling reference: aws:autoscaling/metrics:Metric (isRes=false)
    schema_test.go:38: Dangling reference: aws:autoscaling/notificationType:NotificationType (isRes=false)
    schema_test.go:38: Dangling reference: aws:cloudwatch/logGroup:LogGroup (isRes=true)
    schema_test.go:38: Dangling reference: aws:ec2/launchConfiguration:LaunchConfiguration (isRes=true)
    schema_test.go:38: Dangling reference: aws:ec2/placementGroup:PlacementGroup (isRes=true)
    schema_test.go:38: Dangling reference: aws:ecr/lifecyclePolicyDocument:LifecyclePolicyDocument (isRes=false)
    schema_test.go:38: Dangling reference: aws:elasticbeanstalk/application:Application (isRes=true)
    schema_test.go:38: Dangling reference: aws:elasticbeanstalk/applicationVersion:ApplicationVersion (isRes=true)
    schema_test.go:38: Dangling reference: aws:iam/Role:Role (isRes=false)
    schema_test.go:38: Dangling reference: aws:iam/documents:PolicyDocument (isRes=false)
    schema_test.go:38: Dangling reference: aws:iam/group:Group (isRes=true)
    schema_test.go:38: Dangling reference: aws:iam/instanceProfile:InstanceProfile (isRes=true)
    schema_test.go:38: Dangling reference: aws:iam/role:Role (isRes=true)
    schema_test.go:38: Dangling reference: aws:iam/user:User (isRes=true)
    schema_test.go:38: Dangling reference: aws:index/aRN:ARN (isRes=false)
    schema_test.go:38: Dangling reference: aws:index/region:Region (isRes=false)
    schema_test.go:38: Dangling reference: aws:iot/policy:Policy (isRes=true)
    schema_test.go:38: Dangling reference: aws:lambda/function:Function (isRes=true)
    schema_test.go:38: Dangling reference: aws:rds/engineType:EngineType (isRes=false)
    schema_test.go:38: Dangling reference: aws:s3/bucket:Bucket (isRes=true)
    schema_test.go:38: Dangling reference: aws:s3/routingRules:RoutingRule (isRes=false)
    schema_test.go:38: Dangling reference: aws:sns/topic:Topic (isRes=true)
t0yv0 commented 2 months ago

I've taken stock off the non-resource dangling references. All these originate from Node-only type overlays. There are two ways forward with these:

  1. remove Node-only overlays and support the functionality for every language; breaking change waiting for AWS v7
  2. register the overlay tokens in the schema; unfortunately this is blocked on https://github.com/pulumi/pulumi/issues/17274 at the moment

I suspect our preference is (1) but I'll follow up with the team.

I will also follow up on the resource danging references next.

t0yv0 commented 2 months ago

Notes on resource references. Currently due to limitations in the bridge, resource references are exposed as type references.

For example, looking at "aws:cloudwatch/logSubscriptionFilter:LogSubscriptionFilter":

                "logGroup": {
                    "type": "string",
                    "oneOf": [
                        {
                            "type": "string"
                        },
                        {
                            "type": "string",
                            "$ref": "#/types/aws:cloudwatch/logGroup:LogGroup"
                        }
                    ],
                    "description": "The name of the log group to associate the subscription filter with\n",
                    "willReplaceOnChanges": true
                },

It is possible to change that to use a resource reference which still works:

                "logGroup": {
                    "type": "string",
                    "oneOf": [
                        {
                            "type": "string"
                        },
                        {
                            "type": "string",
                            "$ref": "#/resources/aws:cloudwatch/logGroup:LogGroup"
                        }
                    ],
                    "description": "The name of the log group to associate the subscription filter with\n",
                    "willReplaceOnChanges": true
                },

This change generates no diffs for Node but it does generate slight changes in other languages:

-        public Input<string> LogGroup { get; set; } = null!;
+        public InputUnion<string, Pulumi.Aws.CloudWatch.LogGroup> LogGroup { get; set; } = null!;

-                 log_group: pulumi.Input[str],
+                 log_group: pulumi.Input[Union[str, 'LogGroup']],

Java:

-    private Output<String> logGroup;
+    private Output<Either<String,LogGroup>> logGroup;

-        public Builder logGroup(String logGroup) {
+        public Builder logGroup(Either<String,LogGroup> logGroup) {
             return logGroup(Output.of(logGroup));
         }

+        /**
+         * @param logGroup The name of the log group to associate the subscription filter with
+         *
+         * @return builder
+         *
+         */
+        public Builder logGroup(String logGroup) {
+            return logGroup(Either.ofLeft(logGroup));
+        }
+
+        /**
+         * @param logGroup The name of the log group to associate the subscription filter with
+         *
+         * @return builder
+         *
+         */
+        public Builder logGroup(LogGroup logGroup) {
+            return logGroup(Either.ofRight(logGroup));
+        }
+

I think these changes are not breaking on the input side, but can be technically breaking on the output side (in the Java example above). This is sufficiently well-contained this could be something we can clean up without waiting for a major release.

t0yv0 commented 2 months ago

Further discussion with @mikhailshilkov we are contemplating the possibility to construct two schemas for the moment. One schema similar to the present schema will be used for the purposes of:

  1. having overlays in the registry as in https://www.pulumi.com/registry/packages/aws/api-docs/lambda/callbackfunction/
  2. having overlay references as a way to communicate to the Node SDK generator to avoiding breaking the Node SDK for the users

The second variation of the schema will have all overlay types stripped down, not have any dangling references, and accurately describe non-Node SDKs.

This scheme can unblock use cases until we are able to achieve full unification in v-next.

t0yv0 commented 1 month ago

Since https://github.com/pulumi/pulumi-aws/releases/tag/v6.55.0 the provider includes a feature to serve a schema without any invalid references (see https://github.com/pulumi/pulumi-aws/pull/4587). This is currently under a feature flag. We are unable to make this the official schema yet until the next major release because it involves taking breaking changes in the Node JS SDKs. Please take a look and let us know if this helps your use cases.

zbuchheit commented 1 month ago

@t0yv0 does this work with pulumi package get-schema or could we add support for it?

t0yv0 commented 1 month ago

I was expecting this to work but verifying on latest it seems that it is not working properly. I'll have closer look.

PULUMI_AWS_MINIMAL_SCHEMA=true
anton@anton-mbp-m3> pulumi package get-schema aws  | shasum   
41c5c74551c17d67580e8fb1a4033c8c8bd23fcf  -
anton@anton-mbp-m3> PULUMI_AWS_MINIMAL_SCHEMA=false                                         
anton@anton-mbp-m3> pulumi package get-schema aws  | shasum                                  
41c5c74551c17d67580e8fb1a4033c8c8bd23fcf  -
zbuchheit commented 1 month ago

let me know if it would be helpful if I opened a separate issue tracking that feature

t0yv0 commented 1 month ago

That won't be necessary. Looks like I got the tag references wrong, apologies, https://github.com/pulumi/pulumi-aws/pull/4587 did not make it into the 6.55.0 release yet. We will be releasing with the next minor update (building that now https://github.com/pulumi/pulumi-aws/pull/4638) and I will double check that get-schema works.

mikhailshilkov commented 1 month ago

4587 got shipped in 6.56.0

t0yv0 commented 1 month ago

We're preparing another release which will also include https://github.com/pulumi/pulumi-aws/pull/4640 , I'll comment here once I've confirmed this is working.

t0yv0 commented 1 month ago

With the v6.56.1 release pulumi package get-schema aws respects PULUMI_AWS_MINIMAL_SCHEMA=true, please have a look and let us know if this is helpful. CC @zbuchheit

t0yv0 commented 1 week ago

Closing per conversation with @zbuchheit - thanks everyone.