Azure / azure-sdk-for-rust

This repository is for active development of the *unofficial* Azure SDK for Rust. This repository is *not* supported by the Azure SDK team.
MIT License
692 stars 237 forks source link

WIP: flatten `allOf` models #1524

Open krishanjmistry opened 8 months ago

krishanjmistry commented 8 months ago

Primarily, this is an attempt to get further on #599.

What the solution has ended up looking like is [flattening/merging/composing] the allOf schema properties into the child models - there is no longer a use of #[serde(flatten)]. For starters, I'd say this provides a neater API to the end user - more resemblant of what we see in Bicep and easier to reason about.

It also removes some ambiguities in the models, where a property could be defined multiple times in the created models. For an example of this, see https://github.com/Azure/azure-sdk-for-rust/pull/1524#issuecomment-1867881285

krishanjmistry commented 8 months ago

Example where generated model is previously ambiguous

Using azure-rest-api-specs/specification/vmware/resource-manager/Microsoft.AVS/stable/2020-03-20/vmware.json - note this is an older tag than the latest, but sufficient as an example

Deserializing: Whilst deserializing this response into a Rust model is successful, it's ambiguous where the provisioningState would end up:

#[cfg(feature = "package-2020-03-20")]
#[test]
fn test_cluster_get() -> anyhow::Result<()> {
    use anyhow::{bail, ensure};
    use azure_mgmt_vmware::package_2020_03_20::models::*;
    // copied from specification\vmware\resource-manager\Microsoft.AVS\stable\2023-03-01\examples\Clusters_Get.json
    let json = br#"
    {
        "id": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/group1/providers/Microsoft.AVS/privateClouds/cloud1/clusters/cluster1",
        "name": "cluster1",
        "sku": {
          "name": "AV20"
        },
        "properties": {
          "clusterSize": 4,
          "hosts": [
            "fakehost22.nyc1.kubernetes.center",
            "fakehost23.nyc1.kubernetes.center",
            "fakehost24.nyc1.kubernetes.center",
            "fakehost25.nyc1.kubernetes.center"
          ],
          "provisioningState": "Succeeded"
        },
        "type": "Microsoft.AVS/privateClouds/clusters"
      }
      "#;

    let cluster: Cluster = serde_json::from_slice(json)?;
    println!("{:#?}", cluster);
    Ok(())
}
Cluster {
    resource: Resource {
        id: Some(
            "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/group1/providers/Microsoft.AVS/privateClouds/cloud1/clusters/cluster1",
        ),
        name: Some(
            "cluster1",
        ),
        type_: Some(
            "Microsoft.AVS/privateClouds/clusters",
        ),
    },
    sku: Sku {
        name: "AV20",
    },
    properties: ClusterProperties {
        management_cluster: ManagementCluster {
            cluster_update_properties: ClusterUpdateProperties {
                cluster_size: Some(
                    4,
                ),
            },
            provisioning_state: None,
            cluster_id: None,
            hosts: [
                "fakehost22.nyc1.kubernetes.center",
                "fakehost23.nyc1.kubernetes.center",
                "fakehost24.nyc1.kubernetes.center",
                "fakehost25.nyc1.kubernetes.center",
            ],
        },
        provisioning_state: Some(
            Succeeded,
        ),
    },
}

Serializing:

#[cfg(feature = "package-2020-03-20")]
#[test]
fn serialize_cluster() -> anyhow::Result<()> {
    use anyhow::{bail, ensure};
    use azure_mgmt_vmware::package_2020_03_20::models::*;

    let cluster = Cluster {
        resource: Resource {
            id: Some(
                "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/group1/providers/Microsoft.AVS/privateClouds/cloud1/clusters/cluster1".to_string(),
            ),
            name: Some(
                "cluster1".to_string(),
            ),
            type_: Some(
                "Microsoft.AVS/privateClouds/clusters".to_string(),
            ),
        },
        sku: Sku {
            name: "AV20".to_string(),
        },
        properties: ClusterProperties {
            management_cluster: ManagementCluster {
                cluster_update_properties: ClusterUpdateProperties {
                    cluster_size: Some(
                        4,
                    ),
                },
                provisioning_state: Some(ClusterProvisioningState::Updating),
                cluster_id: None,
                hosts: vec![
                    "fakehost22.nyc1.kubernetes.center".to_string(),
                    "fakehost23.nyc1.kubernetes.center".to_string(),
                    "fakehost24.nyc1.kubernetes.center".to_string(),
                    "fakehost25.nyc1.kubernetes.center".to_string(),
                ],
            },
            provisioning_state: Some(
                ClusterProvisioningState::Succeeded,
            ),
        },
    };

    let json = serde_json::to_string_pretty(&cluster)?;
    println!("{}", json);
    Ok(())
}

This means we end up with two provisioningState properties within the serialized result.

{
  "id": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/group1/providers/Microsoft.AVS/privateClouds/cloud1/clusters/cluster1",
  "name": "cluster1",
  "type": "Microsoft.AVS/privateClouds/clusters",
  "sku": {
    "name": "AV20"
  },
  "properties": {
    "clusterSize": 4,
    "provisioningState": "Updating",
    "hosts": [
      "fakehost22.nyc1.kubernetes.center",
      "fakehost23.nyc1.kubernetes.center",
      "fakehost24.nyc1.kubernetes.center",
      "fakehost25.nyc1.kubernetes.center"
    ],
    "provisioningState": "Succeeded"
  }
}
krishanjmistry commented 8 months ago

Now, the example I've used in https://github.com/Azure/azure-sdk-for-rust/pull/1524#issuecomment-1867881285

serializes as:

#[cfg(feature = "package-2020-03-20")]
#[test]
fn serialize_cluster() -> anyhow::Result<()> {
    use azure_mgmt_vmware::package_2020_03_20::models::*;

    let cluster = Cluster {
        id: Some(
            "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/group1/providers/Microsoft.AVS/privateClouds/cloud1/clusters/cluster1".to_string(),
        ),
        name: Some(
            "cluster1".to_string(),
        ),
        type_: Some(
            "Microsoft.AVS/privateClouds/clusters".to_string(),
        ),
        sku: Sku {
            name: "AV20".to_string(),
        },
        properties: ClusterProperties {
            cluster_size: 4,
            provisioning_state: Some(ClusterProvisioningState::Updating),
            cluster_id: None,
            hosts: vec![
                "fakehost22.nyc1.kubernetes.center".to_string(),
                "fakehost23.nyc1.kubernetes.center".to_string(),
                "fakehost24.nyc1.kubernetes.center".to_string(),
                "fakehost25.nyc1.kubernetes.center".to_string(),
            ],
        },
    };

    let json = serde_json::to_string_pretty(&cluster)?;
    println!("{}", json);
    Ok(())
}

into

{
  "sku": {
    "name": "AV20"
  },
  "properties": {
    "provisioningState": "Updating",
    "hosts": [
      "fakehost22.nyc1.kubernetes.center",
      "fakehost23.nyc1.kubernetes.center",
      "fakehost24.nyc1.kubernetes.center",
      "fakehost25.nyc1.kubernetes.center"
    ],
    "clusterSize": 4
  },
  "id": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/group1/providers/Microsoft.AVS/privateClouds/cloud1/clusters/cluster1",
  "name": "cluster1",
  "type": "Microsoft.AVS/privateClouds/clusters"
}

deserializes as:

#[cfg(feature = "package-2020-03-20")]
#[test]
fn test_cluster_get() -> anyhow::Result<()> {
    use anyhow::ensure;
    use azure_mgmt_vmware::package_2020_03_20::models::*;
    // copied from specification\vmware\resource-manager\Microsoft.AVS\stable\2023-03-01\examples\Clusters_Get.json
    let json = br#"
    {
        "id": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/group1/providers/Microsoft.AVS/privateClouds/cloud1/clusters/cluster1",
        "name": "cluster1",
        "sku": {
          "name": "AV20"
        },
        "properties": {
          "clusterSize": 4,
          "hosts": [
            "fakehost22.nyc1.kubernetes.center",
            "fakehost23.nyc1.kubernetes.center",
            "fakehost24.nyc1.kubernetes.center",
            "fakehost25.nyc1.kubernetes.center"
          ],
          "provisioningState": "Succeeded"
        },
        "type": "Microsoft.AVS/privateClouds/clusters"
      }
      "#;

    let cluster: Cluster = serde_json::from_slice(json)?;
    println!("{:#?}", cluster);
    ensure!(cluster.name == Some("cluster1".to_string()));
    ensure!(cluster.sku.name == "AV20".to_string());
    ensure!(cluster.properties.cluster_size == 4);
    ensure!(cluster.properties.hosts.len() == 4);
    ensure!(cluster.properties.provisioning_state == Some(ClusterProvisioningState::Succeeded));
    Ok(())
}
cataggar commented 8 months ago

Example where generated model is previously ambiguous

Using azure-rest-api-specs/specification/vmware/resource-manager/Microsoft.AVS/stable/2020-03-20/vmware.json - note this is an older tag than the latest, but sufficient as an example

Hi @krishanjmistry, I appreciate using the vmware spec as the example, since that I the one I manage. Having duplicate entries in the spec like this is an error. I am not sure it is something that we should support. In this particular case, I fixed it in the next version 2021-06-01.

cataggar commented 8 months ago

For starters, I'd say this provides a neater API to the end user - more resemblant of what we see in Bicep and easier to reason about.

I think there is room for improvement here. I think there are some advantages of keeping the structs to match the definitions. There are ways to improve the API to the end user without changing the structs. For #601, I brought up using the builder pattern. @demoray, then mentioned:

I think having a middle ground of providing something akin to the setters! macro would be sufficient for many (most?) uses.

krishanjmistry commented 8 months ago

@cataggar, thanks for taking a look.

Hi @krishanjmistry, I appreciate using the vmware spec as the example, since that I the one I manage. Having duplicate entries in the spec like this is an error. I am not sure it is something that we should support. In this particular case, I fixed it in the next version 2021-06-01.

I did notice that I was picking on an old tag! More generally, the main thing that I believe would influence whether we support this case is how prevalent this is in the existing specs. I think it's worth noting that I don't think such a thing could be described in TypeSpec (which I believe is where the specs are heading).

Likewise, I also don't think the scenario in #599 could be described in TypeSpec. In TypeSpec, it seems a property can either be defined required or optional in its base model. In other words, it should be possible to inherit the required properties, but not overwrite them.

I think there is room for improvement here. I think there are some advantages of keeping the structs to match the definitions. There are ways to improve the API to the end user without changing the structs.

My main concern here would be, in the case of if we do want to support a scenario like #599 - the models won't reflect what gets [de]serialized. I do like the suggestion of a builder pattern/setters! macro.

It seems there's two questions, on both I'll try and get some stats:

  1. how prevalent are duplicated properties in models that include allOf?
  2. how prevalent is the issue in #599, where a required property is not set in the base class?
krishanjmistry commented 7 months ago

Answering the questions I left at the end of the last comment.

  1. how prevalent are duplicated properties in models that include allOf?

    • 39 resource manager crates have schemas where the property is defined in both the schema, and the allOf it takes.
    • Among those 39 resource manager crates, there's 435 cases in total.
    • Here's a CSV containing all the places: all_of_issues.csv
  2. how prevalent is it that a property is mentioned to be required outside of the base class?

    • 46 resource manager crates
    • 4 data plane crates
    • 1193 cases across both sets of crates
    • Here's a CSV containing all the places: required_issues.csv
heaths commented 7 months ago

FWIW - and I don't know if/how this factors in here - but for context, all the other language code generators have traditionally treated the first allOf as derivation (if that even applies in this case) and any subsequent allOfs as flattened members. Also worth keeping in mind when we start supporting a TypeSpec-compiled object model.