pulumi / crd2pulumi

Generate typed CustomResources from a Kubernetes CustomResourceDefinition
Apache License 2.0
100 stars 17 forks source link

CRDS Gross type name generated in Python SDK for network policy #107

Closed tusharshahrs closed 1 year ago

tusharshahrs commented 1 year ago

What happened?

CRD has fields named with underscores: For example, the networkpolicy.yaml will have:

  versions:
    - name: v1alpha1
      schema:
        openAPIV3Schema:
          properties:
            spec:
              type: object
              description: ....NetworkPolicySpec is a specification of the network
                entitlements for a pod.
              properties:
                apps_incoming:
                ..
                ..

The main thing is the _ in apps_incoming. We need to translate lower snake case to camel and back dynamically. The fields to be named properly per language, but translate back to the proper names in the provider.

This is what the networkpolicies/pulumi_crds/juice/v1alpha1/_inputs.py will end up as after crd2pulumi is ran:

__all__ = [
    'NetworkPolicySpecApps_incomingArgs',
    'NetworkPolicySpecApps_outgoingArgs',
    'NetworkPolicySpecNamespaces_incomingArgs',
    'NetworkPolicySpecNamespaces_outgoingArgs',
    'NetworkPolicySpecArgs',
]

This is what the networkpolicies/pulumi_crds/juice/v1alpha1/outputs.py will end up as:

__all__ = [
    'NetworkPolicySpecApps_incomingArgs',
    'NetworkPolicySpecApps_outgoingArgs',
    'NetworkPolicySpecNamespaces_incomingArgs',
    'NetworkPolicySpecNamespaces_outgoingArgs',
    'NetworkPolicySpecArgs',
]

Expected Behavior

When using the snippet below

  versions:
    - name: v1alpha1
      schema:
        openAPIV3Schema:
          properties:
            spec:
              type: object
              description: NetworkPolicySpec is a specification of the network
                entitlements for a pod.
              properties:
                apps_incoming:
                ..
                ..

after running crd2pulumi, the python generated code at: networkpolicies/pulumi_crds/juice/v1alpha1/__inputs.py would show up as:

__all__ = [
    'NetworkPolicySpecAppsIncomingArgs',
    'NetworkPolicySpecAppsOutgoingArgs',
    'NetworkPolicySpecNamespacesIncomingArgs',
    'NetworkPolicySpecNamespacesOutgoingArgs',
    'NetworkPolicySpecArgs',
]

There are no _ anymore and now everything is camelCase.

The code for networkpolicies/pulumi_crds/juice/v1alpha1/outputs.py would also be correct:

__all__ = [
    'NetworkPolicySpec',
    'NetworkPolicySpecAppsIncoming',
    'NetworkPolicySpecAppsOutgoing',
    'NetworkPolicySpecNamespacesIncoming',
    'NetworkPolicySpecNamespacesOutgoing',
]

There are no _ anymore and now everything is camelCase.

Steps to reproduce

  1. Download and install crd2pulumi
  2. mkdir crds
  3. cd crds
  4. create a file called networkpolicy.yaml with the following code:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    myinfo: abcdefghijkl
  generation: 4
  labels:
    creator.ac.com: myinfo
  name: networkpolicies.juice.box.com
spec:
  conversion:
    strategy: None
  group: juice.box.com
  names:
    kind: NetworkPolicy
    listKind: NetworkPolicyList
    plural: networkpolicies
    shortNames:
      - anp
      - anps
    singular: networkpolicy
  preserveUnknownFields: true
  scope: Namespaced
  versions:
    - name: v1alpha1
      schema:
        openAPIV3Schema:
          properties:
            spec:
              type: object
              description: NetworkPolicySpec is a specification of the network
                entitlements for a pod.
              properties:
                apps_incoming:
                  description: apps_incoming specifies which applications are permitted
                    to establish a TCP connection to a POD.
                  items:
                    properties:
                      app:
                        pattern: ^(__kubernetes__|plb\.juice-plb\.juice-prod|((([A-Za-z0-9]+[-A-Za-z0-9]?)*[A-Za-z0-9])\.){2}(kk|kube))$
                        type: string
                      cluster:
                        description: cluster that this policy applies to. Defaults to
                          the local cluster. Setting cluster to 'ALL' will match all
                          clusters
                        type: string
                    required:
                      - app
                    type: object
                  type: array
                apps_outgoing:
                  description: apps_outgoing specifies what applications a pod may attempt
                    to make TCP connections to.
                  items:
                    properties:
                      app:
                        pattern: ^(__kubernetes__|plb\.juice-plb\.juice-prod|((([A-Za-z0-9]+[-A-Za-z0-9]?)*[A-Za-z0-9])\.){2}(kk|kube))$
                        type: string
                      cluster:
                        description: cluster that this policy applies to. Defaults to
                          the local cluster. Setting cluster to 'ALL' will match all
                          clusters
                        type: string
                    required:
                      - app
                    type: object
                  type: array
                namespaces_incoming:
                  description: namespaces_incoming specifies which kubernetes namespace
                    are permitted to establish incoming TCP sessions.
                  items:
                    properties:
                      cluster:
                        description: cluster that this policy applies to. Defaults to
                          the local cluster. Setting cluster to 'ALL' will match all
                          clusters
                        type: string
                      namespace:
                        pattern: ^(((([A-Za-z0-9]+[-A-Za-z0-9]?)*[A-Za-z0-9])\.)(kk|kube))$
                        type: string
                    required:
                      - namespace
                    type: object
                  type: array
                namespaces_outgoing:
                  description: namespaces_outgoing specifies which kubernetes namespace
                    are permitted to establish outgoing TCP sessions.
                  items:
                    properties:
                      cluster:
                        description: cluster that this policy applies to. Defaults to
                          the local cluster. Setting cluster to 'ALL' will match all
                          clusters
                        type: string
                      namespace:
                        pattern: ^(((([A-Za-z0-9]+[-A-Za-z0-9]?)*[A-Za-z0-9])\.)(kk|kube))$
                        type: string
                    required:
                      - namespace
                    type: object
                  type: array
                selector:
                  additionalProperties:
                    type: string
                  description: selector is a set of label selectors
                  type: object
              required:
                - selector
      served: true
      storage: true
status:
  acceptedNames:
    kind: NetworkPolicy
    listKind: NetworkPolicyList
    plural: networkpolicies
    shortNames:
      - anp
      - anps
    singular: networkpolicy
  storedVersions:
    - v1alpha1
  1. Run crd2pulumi --pythonPath ./networkpolicies networkpolicy.yaml

  2. Check out the new crds that are created: cd networkpolicies/pulumi_crds/pulumi_crds/juice/v1alpha1

  3. Go to where the _inputs.py outputs.py are:
    cd networkpolicies/pulumi_crds/juice/v1alpha1/

  4. The output for _inputs.py will show up in the all block as:

    ..
    ..
    __all__ = [
    'NetworkPolicySpecApps_incomingArgs',
    'NetworkPolicySpecApps_outgoingArgs',
    'NetworkPolicySpecNamespaces_incomingArgs',
    'NetworkPolicySpecNamespaces_outgoingArgs',
    'NetworkPolicySpecArgs',
    ]
    ..

    Notice that there is a _ and no camelCasing for the first 4 items.

The output from outputs.py will show up as

__all__ = [
    'NetworkPolicySpec',
    'NetworkPolicySpecApps_incoming',
    'NetworkPolicySpecApps_outgoing',
    'NetworkPolicySpecNamespaces_incoming',
    'NetworkPolicySpecNamespaces_outgoing',
]

Notice that there is a _ and no camelCasing for the last 4 items.

Output of pulumi about

CLI          
Version      3.56.0
Go Version   go1.20.1
Go Compiler  gc

Host     
OS       darwin
Version  11.7.4
Arch     x86_64

Backend        
Name           pulumi.com
URL            https://app.pulumi.com/tushar-pulumi-corp
User           tushar-pulumi-corp
Organizations  tushar-pulumi-corp,

Output from ❯ crd2pulumi version

v1.2.3

Additional context

No response

Contributing

Vote on this issue by adding a 👍 reaction. To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

rquitales commented 1 year ago

Found a similar issue, where - is used instead of _. Looks like we'd need this to be fixed in upstream pulumi/pulumi.

justinvp commented 1 year ago

The SDK generators generally expect type and property names in the Pulumi schema to be PascalCase and camelCase respectively.

Type Names

For type names, there's no reason why crd2pulumi couldn't make the names PascalCase, since type names are meaningless at runtime in terms of what is serialized and passed through to the engine to the provider.

Right now the in-memory representation of the schema that crd2pulumi is generating has types named like:

"types": {
    "kubernetes:juice.box.com/v1alpha1:NetworkPolicySpecApps_incoming": {
    }
}

When instead it could be named as:

"types": {
    "kubernetes:juice.box.com/v1alpha1:NetworkPolicySpecAppsIncoming": {
    }
}

Property Names

For property names, I don't think crd2pulumi could change the names to camelCase since the property names are passed along to the k8s provider as-is and I don't think the k8s provider is doing any tweaking of property names at runtime.

So for a property like apps_incoming, crd2pulumi would have to keep it named like that in the Pulumi schema.

For Python, the property will be emitted in the SDK as apps_incoming, which is fine for Python, but for other SDKs like Go and Node, the property name will look out-of-place:

type NetworkPolicySpecArgs struct {
    Apps_incoming NetworkPolicySpecAppsIncomingArrayInput `pulumi:"apps_incoming"`
}
export interface NetworkPolicySpecArgs {
    apps_incoming?: pulumi.Input<pulumi.Input<inputs.juice.v1alpha1.NetworkPolicySpecAppsIncomingArgs>[]>;
}

For Go, we could consider changing the SDK gen to make apps_incoming be emitted as a field named AppsIncoming because there is a struct tag on the field which specifies the name that should be used during marshaling, which in this case could remain apps_incoming. We could similarly change C# SDK gen to do the same.

However, I don't think we can do anything for TypeScript right now since there isn't currently a way to add extra metadata to indicate a different name to use during marshaling, so it'd have to remain apps_incoming there.

kpitzen commented 1 year ago

Thanks for the added context, @justinvp - it seems like we ought to be able to do as much as we can here (in crd2pulumi) and live with the small bit of remaining pain associated with this.

rquitales commented 1 year ago

It looks like there may not be much action item within the crd2pulumi repository in that case. The type names being generated are derived from the Group, Version and Kind of the custom resource in question. To me, it doesn't make sense to convert a snake cased Kind to pascal case if the CRD explicitly defines a CR with snake case. It would only add confusion when users run a Pulumi output and see a mismatch in GVK in the preview, against what is in the live cluster.

For example:

# pulumi preview when we normalise the Pulumi type to pascal case:
pulumi:pulumi:Stack                         template-dev     create     
 +   └─ kubernetes:stable.example.com/v1:CronTab  cronTabInstance  create 

GVK: stable.example.com/v1:cron_tab

Furthermore, official k8s API naming conventions strongly suggest that camel casing should be used for field names.

@kpitzen WDYT would be the next best steps here?

justinvp commented 1 year ago

I think the fix is as simple as this: https://github.com/pulumi/crd2pulumi/commit/f90e016d105d3ed8e11ed0ceb61494878eb2de4c

With that change, I get:

__all__ = [
    'NetworkPolicySpecAppsIncomingArgs',
    'NetworkPolicySpecAppsOutgoingArgs',
    'NetworkPolicySpecNamespacesIncomingArgs',
    'NetworkPolicySpecNamespacesOutgoingArgs',
    'NetworkPolicySpecArgs',
]
justinvp commented 1 year ago

It would only add confusion when users run a Pulumi output and see a mismatch in GVK in the preview, against what is in the live cluster.

This isn't about changing the names of resources. It's about changing the names of nested types, which get erased when the values are serialized and sent to the engine (they essentially turn into a map[string]any). These type names will not be displayed as part of a Pulumi operation.

rquitales commented 1 year ago

@justinvp Ah, gotcha. Yes, I believe that one line fix would do it. I was testing with a simplified CRD, that didn't have nested fields, so there were no nested types to modify. Thanks for the clarification. Do you want to submit a PR for that change?