pulumi / pulumi-azure-native

Azure Native Provider
Apache License 2.0

Stream Analytics 'sku' bug for PostgreSQL Flexible Server Output on v2021-10-01-preview #3122

Open zehndean opened 7 months ago

zehndean commented 7 months ago

What happened?

Using the latest API Version 'v20211001preview' results in an error message from the Azure API, saying that 'sku' is missing from the request.

error: Code="422" Message="The required property 'sku' is missing from the request." Details=[{"code":"422","correlationId":"7390fcac-f601-4189-9ec4-c96032dc012f","message":"The required property 'sku' is missing from the request."

Using pulumi up with highest log verbosity revealed that the structure of the API call looks like this:

__inputs={
    map[
        compatibilityLevel:{1.2}
        contentStoragePolicy:{SystemAccount}
        dataLocale:{en-US}
        inputs:{[
            {map[
                name:{asa-input-name} 
                properties:{
                    map[
                        datasource:{map[
                            type:{Microsoft.EventHub/EventHub}
                        ]}
                    ]
                }
            ]}
        ]}
        jobName:{asa-job-name}
        jobType:{Cloud}
        location:{westeurope}
        outputs:{[]}
        resourceGroupName:{rg-name}
        sku:{
            map[
                name:{Standard}
            ]
        }
        transformation:{
            map[]
        }
    ]
}

Based on investigations with Azure support, 'sku' is expected inside 'properties'. The (somewhat outdated) REST API documentation and the ARM template documentation both confirm this.

This issue is a summarized follow-up to the initial analysis in https://github.com/pulumi/pulumi-azure-native/issues/2835, where the source and cause of the bug were investigated.

Example

import pulumi
import pulumi_azure_native.streamanalytics as stream

# Rebinds the alias above to the preview API version.
import pulumi_azure_native.streamanalytics.v20211001preview as stream

from pulumi import ResourceOptions
from string import Template


def generate_streamanalytics_job(login_env, param_set):
    config = pulumi.Config()
    param_obj_env = config.require_object(login_env)
    param_obj_global = config.require_object('GLOBAL')

    # retrieve general variables
    param_env = param_obj_env.get('env')
    param_default_location = param_obj_global.get('resource_location')
    # retrieve env specific variables
    #param_max_throughput_units = int(param_obj_env.get(f'{param_set}_eh_maximum_throughput_units'))
    time_delay_and_aggregate_sec = 60

    streaming_job = stream.StreamingJob(
        resource_name="streamingJob", # X
        compatibility_level="1.2",
        content_storage_policy="SystemAccount",
        data_locale="en-US",
        events_late_arrival_max_delay_in_seconds=time_delay_and_aggregate_sec,
        events_out_of_order_max_delay_in_seconds=time_delay_and_aggregate_sec,
        events_out_of_order_policy="Adjust",
        job_name="my-test-job", # X
        job_type='Cloud',
        sku=stream.SkuArgs(
            name="Standard",
            #capacity=6
        ),
        location=param_default_location,
        output_error_policy="Drop",
        identity=stream.IdentityArgs(
            type='UserAssigned',
            user_assigned_identities={'/subscriptions/<SUBSCRIPTION-ID>/resourcegroups/<RG-NAME>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<MI-NAME>': {'clientId': '<ID-VALUE>', 'principalId': '<ID-VALUE>'}},
        ),
        resource_group_name="<RG-NAME>",
        inputs=[
            stream.InputArgs(
                name="input-tst",
                properties=stream.StreamInputPropertiesArgs(
                    datasource=stream.EventHubV2StreamInputDataSourceArgs(
                        type='Microsoft.ServiceBus/EventHub',
                        authentication_mode='Msi',
                        consumer_group_name='stream_analytics_consumer',
                        event_hub_name='<EH-NAME>',
                        service_bus_namespace='<EH-NS-NAME>',
                        #shared_access_policy_name=,
                        #shared_access_policy_key=,
                    ),
                    serialization=stream.JsonSerializationArgs(
                        encoding="UTF8",
                        type="Json",
                    ),
                    type="Stream",
                )
            ),
        ],
        outputs=[
            stream.OutputArgs(
                datasource=stream.PostgreSQLOutputDataSourceArgs(
                    type="Microsoft.StreamAnalytics/streamingjobs",
                    authentication_mode="UserToken",
                    database="<DB-NAME>",
                    max_writer_count=1,
                    password="abcd1234!",
                    server="<SRV-NAME>",
                    table="<TBL-NAME>",
                    user="<USER-NAME>"
                )
            )
        ],
        transformation=stream.TransformationArgs(
            name="transformationtest",
            query="Select Id, Name from input-tst",
            streaming_units=1,
        ),
    )
    return streaming_job

Output of pulumi about

CLI
Version      3.108.1
Go Version   go1.22.0
Go Compiler  gc

Plugins
NAME          VERSION
azure         5.62.1
azure-native  2.26.0
azuread       5.47.0
python        unknown

Host
OS       darwin
Version  14.3
Arch     x86_64

Additional context

No response

Contributing

Vote on this issue by adding a 👍 reaction. To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

mjeffryes commented 7 months ago

Looks like according to the spec (https://github.com/Azure/azure-rest-api-specs/blob/14d24d17491d8c2bde24532cb8cc2d663c0ffd9f/specification/streamanalytics/resource-manager/Microsoft.StreamAnalytics/preview/2021-10-01-preview/streamingjobs.json#L698), there's a sku field at the top level and inside the properties. These fields look to be flattened out to a single input and then we're putting the sku field in the top level field location rather than in the nested properties.sku. Looks to be an unfortunate side effect of clashing names during property flattening.
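
The name clash described above can be illustrated with a toy model. This is only a sketch under assumptions: `flatten_paths` and the miniature `schema` dict are hypothetical and are not taken from the provider's code generator. It shows how flattening a nested `properties` object into a single namespace can let a top-level name shadow a nested one with the same name.

```python
def flatten_paths(schema):
    """Map each flattened input name to the path it is written back to.

    Toy model only: real schema flattening is more involved.
    """
    paths = {}
    # Nested properties are lifted to the top level first.
    for name in schema.get("properties", {}):
        paths[name] = ("properties", name)
    # A top-level name then overwrites a nested one of the same name.
    for name in schema:
        if name != "properties":
            paths[name] = (name,)
    return paths


# Miniature stand-in for a spec that defines 'sku' in both places.
schema = {
    "location": {},
    "sku": {},
    "properties": {"sku": {}, "jobType": {}},
}

paths = flatten_paths(schema)
print(paths["sku"])      # ('sku',)
print(paths["jobType"])  # ('properties', 'jobType')
```

In this model the single flattened `sku` input is written to the top-level location, while non-clashing names such as `jobType` still land inside `properties`.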

zehndean commented 7 months ago

Hi @mjeffryes, thanks for looking into this. Do you mean that what Pulumi sends conforms to the Azure specification? I don't quite understand the part about the second occurrence of 'sku' within properties. Where do you see that?

mjeffryes commented 7 months ago

Yes, what Pulumi is sending is valid according to the schema despite not being what the API expects. You can see there's a top-level sku field at line 699 of azure-rest-api-specs/specification/streamanalytics/resource-manager/Microsoft.StreamAnalytics/preview/2021-10-01-preview/streamingjobs.json (at commit 14d24d17491d8c2bde24532cb8cc2d663c0ffd9f in Azure/azure-rest-api-specs), as well as a properties.sku defined at line 718 of the same file.

zehndean commented 7 months ago

So in that case the missing sku within 'properties' is a bug in Pulumi's handling of the v2021-10-01-preview API version, right? Is there a schedule for a fix yet?

stepienpatryk commented 5 months ago

any updates? this bug makes pulumi unusable for stream analytics with user assigned managed identity

zehndean commented 2 months ago

I did some further analysis on this issue.

For both the v2020-03-01 and v2021-10-01-preview versions, the Azure REST API expects "sku" inside the 'properties' object. The definition for v2021-10-01-preview also mentions "sku" a second time at the top level, but that definition appears to be irrelevant in practice. Sending JSON that has the sku only within 'properties' works fine for both versions (the opposite does not):

{
    "$schema": "https://schema.management.azure.com/schemas/2019-08-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "resources":
    [
        {
            "type": "Microsoft.StreamAnalytics/streamingjobs",
            "apiVersion": "2021-10-01-preview",
            "name": "my-asa-job",
            "location": "westeurope",
            "properties":
            {
                "compatibilityLevel": "1.2",
                "contentStoragePolicy": "SystemAccount",
                "dataLocale": "en-US",
                "eventsLateArrivalMaxDelayInSeconds": 5,
                "eventsOutOfOrderMaxDelayInSeconds": 0,
                "eventsOutOfOrderPolicy": "Adjust",
                "functions":
                [],
                "inputs":
                [],
                "jobType": "Cloud",
                "outputErrorPolicy": "stop",
                "outputs":
                [],
                "sku":
                {
                    "capacity": 3,
                    "name": "StandardV2"
                }
            }
        }
    ]
}

(sent through az deployment group create --resource-group ABC --template-file XYZ.json)
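
The difference between the accepted and the rejected request shape can be sketched in Python. The helper `nest_sku` is hypothetical, written only for illustration, and is not part of Pulumi or any Azure SDK; the sample body is a trimmed-down assumption, not a captured request.

```python
import copy
import json


def nest_sku(body):
    """Return a copy of body with a top-level 'sku' moved under 'properties'.

    Hypothetical helper: it only illustrates the shape the API accepts.
    """
    fixed = copy.deepcopy(body)
    sku = fixed.pop("sku", None)
    if sku is not None:
        # Keep an existing properties.sku if one is already present.
        fixed.setdefault("properties", {}).setdefault("sku", sku)
    return fixed


# Shape with 'sku' at the top level (the one reported as rejected).
body = {
    "location": "westeurope",
    "sku": {"name": "Standard"},
    "properties": {"jobType": "Cloud"},
}

fixed = nest_sku(body)
print(json.dumps(fixed, indent=2))
```

The input dict is left unchanged, so both shapes can be compared side by side.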

The referenced REST API schemas confirm that: v2021-10-01-preview & v2020-03-01

Judging from the logs written during 'pulumi up', the request JSON treats 'sku' identically for both API versions (sku appears just once; for v2021-10-01-preview it has an additional "capacity" field). The log was captured with pulumi up --skip-preview --yes --debug -v=9 --logtostderr --logflow > pulumi_log.log:

{
  "location": "westeurope",
  "properties": {
    "compatibilityLevel": "1.2",
    "contentStoragePolicy": "SystemAccount",
    "dataLocale": "en-US",
    "eventsLateArrivalMaxDelayInSeconds": 60,
    "eventsOutOfOrderMaxDelayInSeconds": 60,
    "eventsOutOfOrderPolicy": "Adjust",
    "inputs": [
      {
        "name": "my-input",
        "properties": {
          "datasource": {
            "properties": {
              "authenticationMode": "ConnectionString",
              "consumerGroupName": "reader",
              "eventHubName": "my-eventhub",
              "serviceBusNamespace": "my-eventhub-ns",
              "sharedAccessPolicyKey": "ABC",
              "sharedAccessPolicyName": "stream_analytics_read_access"
            },
            "type": "Microsoft.EventHub/EventHub"
          },
          "serialization": {
            "properties": {
              "encoding": "UTF8"
            },
            "type": "Json"
          },
          "type": "Stream"
        }
      }
    ],
    "jobType": "Cloud",
    "outputErrorPolicy": "Drop",
    "outputs": [],
    "sku": {
      "name": "Standard"
    },
    "transformation": {
      "name": "my-transformation",
      "properties": {
        "query": "SELECT * FROM X",
        "streamingUnits": 3
      }
    }
  },
  "tags": {}
}
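
When checking a logged request body like the one above, it can help to list every location where a key occurs. The following is a minimal sketch; `find_key_paths` is a hypothetical helper, and `logged_body` is a shortened stand-in for the logged JSON rather than the full payload.

```python
def find_key_paths(node, key, path=()):
    """Yield the path to every occurrence of key in nested dicts and lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield path + (k,)
            yield from find_key_paths(v, key, path + (k,))
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from find_key_paths(item, key, path + (i,))


# Shortened stand-in for the request body found in the log.
logged_body = {
    "location": "westeurope",
    "properties": {
        "jobType": "Cloud",
        "outputs": [],
        "sku": {"name": "Standard"},
    },
    "tags": {},
}

sku_paths = list(find_key_paths(logged_body, "sku"))
print(sku_paths)  # [('properties', 'sku')]
```

To run it against a real log, the body would first have to be extracted and parsed with `json.loads`.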

Could it be that the error occurs within "Pulumi CrossCode" (https://github.com/pulumi/pulumi/tree/e0ad694d2efb1e0c9870f6d67f9f44786d3f0432/pkg/codegen), where the generic schema mapping does not expect 'sku' to occur twice in v2021-10-01-preview and only maps it to the (obsolete) top-level definition in the schema?

Unfortunately, I could not quite locate how the mapping is done in the Go code or which schemas are used (those from azure-rest-api-specs or, e.g., azure-resource-manager-schemas).