cedar-policy / cedar

Implementation of the Cedar Policy Language
https://www.cedarpolicy.com
Apache License 2.0
807 stars 68 forks source link

Converting a JSON-syntax schema to human syntax and back can change the target of an entity type reference #1063

Open cdisselkoen opened 1 month ago

cdisselkoen commented 1 month ago

Before opening, please confirm:

Bug Category

Schemas and Validation

Describe the bug

For the following schema:

{
  "": {
    "entityTypes": {
      "User": {}
    },
    "actions": {}
  },
  "NS": {
    "commonTypes": {
      "User": { "type": "String" }
    },
    "entityTypes": {
      "Foo": {
        "shape": {
          "type": "Record",
          "attributes": {
            "owner": { "type": "Entity", "name": "User" }
          }
        }
      }
    },
    "actions": {}
  }
}

When converted to natural/human syntax, we get

namespace NS {
  type User = __cedar::String;
  entity Foo = {"owner": User};
}
entity User;

(I added whitespace for clarity.)

When converting that back to JSON syntax, we get

{
  "NS": {
    "commonTypes": {
      "User": { "type":"String" }
    },
    "entityTypes": {
      "Foo": {
        "shape": {
          "type": "Record",
          "attributes": {
            "owner": { "type": "User" }
          }
        }
      }
    },
    "actions": {}
  },
  "": {
    "entityTypes": {
      "User": {}
    },
    "actions": {}
  }
}

(again, I added whitespace for clarity.)

This is not the same as the original schema: In the original schema, Foo.owner had the type User, an entity type in the empty namespace. In the final schema, Foo.owner has the type NS::User, a common type which resolves to String.

Due to the under-the-hood implementation where natural-syntax schemas are first translated into the JSON structures before processing, I'm pretty sure that in 3.2.1, the behavior of the natural-syntax schema is the same as that of the final schema. An important question is whether this is the correct behavior for the natural-syntax schema, according to the spec. I believe it is (even after #579 is fixed), because priority is given to definitions in the current namespace over definitions in the empty namespace. (RFC 24 also specifies that common-type definitions take priority over entity-type definitions, but I interpret that to be "in the same namespace", so it doesn't apply here. I also apply that interpretation in #1060.)

Assuming that the 3.2.1 interpretation of this natural-syntax schema is correct, the problem is in the first step above (converting from JSON to natural syntax), so probably in json_schema_to_custom_schema_str() in human_schema/fmt.rs. This means the bug only affects the translation from JSON to human/natural syntax, and not the interpretation of any JSON or natural-syntax schemas in 3.2.1.

Expected behavior

Converting the JSON schema to natural/human syntax and back should yield the same schema, or at least an equivalent one.

Reproduction steps

  1. I reproduced this with the 3.2.1 CLI. cargo install cedar-policy-cli@3.2.1
  2. Save the initial schema above as cedarschema.json
  3. cedar translate-schema --direction json-to-human --schema cedarschema.json > cedarschema
  4. cedarschema should match what's written above, modulo whitespace
  5. cedar translate-schema --direction human-to-json --schema cedarschema > cedarschema_roundtripped.json
  6. cedarschema_roundtripped.json should match the final schema above, modulo whitespace

Code Snippet

No response

Log output

No response

Additional configuration

Reproduced on the 3.2.1 CLI

Operating System

No response

Additional information and screenshots

No response

cdisselkoen commented 1 month ago

Noting that this issue is moot with RFC 70, as all three of the schemas above are invalid after RFC 70. However, we should still consider whether we need to fix this in 3.x (or any other versions that are released before RFC 70 ships, assuming it does indeed eventually ship).

shaobo-he-aws commented 1 week ago

I think a proper fix is essentially RFC 70.

cdisselkoen commented 1 week ago

Correct, the only question is do we want a (different, backwards-compatible) fix for 3.x