Modularizing schema and policies

jackwhelpton commented 6 months ago

Describe the feature you'd like to request

For a large application with numerous domains, it would be handy to be able to define separate schemata for each domain and have them merged and applied when validating policy files.

For example, we might define our users and roles in identity.cedarschema.json, and then have entities defined only at the domain level: orders.cedarschema.json, products.cedarschema.json.

orders.cedar would then contain policies that need to know about entities defined in both orders.cedarschema.json and identity.cedarschema.json (users/groups that operate across domains).

Describe alternatives you've considered

Defining the entire schema in one file, but this rapidly becomes unwieldy for large applications and runs the risk of merge conflicts, etc.

Additional context

No response

Is this something that you'd be interested in working on?

[ ] 👋 I may be able to implement this feature request
[ ] ⚠️ This feature might incur a breaking change

hakanson commented 6 months ago

@jackwhelpton - This is an interesting topic and broader than the VS Code extension.

This would make a good discussion on the Cedar Slack. There is also a Request For Comments repo for Cedar, where RFC24 proposes a custom syntax for Cedar schemas.

@cdisselkoen and @mwhicks1 may have additional insight here.

john-h-kastner-aws commented 6 months ago

This should be doable without additional Cedar library support. There's already an API for composing a schema split across multiple files, e.g.:

```rust
fn main() {
    // Parse each JSON schema file into its own fragment.
    let schema1: cedar_policy::SchemaFragment =
        include_str!("../schema1.cedarschema.json").parse().unwrap();
    let schema2: cedar_policy::SchemaFragment =
        include_str!("../schema2.cedarschema.json").parse().unwrap();
    // Combine the fragments into a single schema, so that `Schema1` can
    // refer to `Schema2::Resource`.
    let schema = cedar_policy::Schema::from_schema_fragments([schema1, schema2]).unwrap();
    let validator = cedar_policy::Validator::new(schema);

    let policies: cedar_policy::PolicySet = include_str!("../policies.cedar").parse().unwrap();

    // Validate the policies against the combined schema.
    let result = validator.validate(&policies, cedar_policy::ValidationMode::Strict);
    assert!(result.validation_passed());
}
```

```json
{
    "Schema1": {
        "entityTypes": {
            "User": {}
        },
        "actions": {
            "act": {
                "appliesTo": {
                    "principalTypes": [
                        "User"
                    ],
                    "resourceTypes": [
                        "Schema2::Resource"
                    ]
                }
            }
        }
    }
}
```

```json
{
    "Schema2": {
        "entityTypes": {
            "Resource": {}
        },
        "actions": {}
    }
}
```
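
Why the fragments have to be combined before validation can be seen in a toy model. The sketch below is not the cedar_policy implementation: `Fragment` and `unresolved` are hypothetical names, and it only tracks fully qualified entity type names using the standard library. Checked on its own, the Schema1 fragment would have a dangling reference to `Schema2::Resource`; with both fragments considered together, nothing is left unresolved.

```rust
use std::collections::BTreeSet;

/// Toy stand-in for a schema fragment (hypothetical, not the cedar_policy type):
/// the fully qualified entity types it declares and the ones its actions refer to.
pub struct Fragment {
    pub declares: Vec<&'static str>,
    pub references: Vec<&'static str>,
}

/// Returns, in sorted order, every referenced type that no fragment declares.
pub fn unresolved(fragments: &[Fragment]) -> Vec<&'static str> {
    // Everything declared by any fragment in the set.
    let declared: BTreeSet<&'static str> = fragments
        .iter()
        .flat_map(|f| f.declares.iter().copied())
        .collect();
    // References that are not covered by those declarations.
    let missing: BTreeSet<&'static str> = fragments
        .iter()
        .flat_map(|f| f.references.iter().copied())
        .filter(|r| !declared.contains(r))
        .collect();
    missing.into_iter().collect()
}
```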

hakanson commented 5 months ago

Some comments for discussion:

It seems interesting for the extension to automatically merge all *.cedarschema.json files into a single logical Cedar schema JSON file for validation. The extension would need to be smarter about "Go to Definition" features.
What are thoughts on how these would be merged for non-VS Code use? Would you have a CI/CD pipeline merge them into a physical/on-disk file for deployment and run any integration tests based on that? Would you want/expect similar behavior from the Cedar CLI?
Or, is auto-magic less preferred to some config setting / file that indicates specifically which *.cedarschema.json files to use?
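
For the auto-merge option, the discovery step might look like the following sketch. It assumes fragments are recognised purely by the `.cedarschema.json` suffix and that only a single directory is scanned; `select_schema_files` and `discover_schema_files` are hypothetical helpers, not part of the extension or the Cedar CLI.

```rust
use std::path::{Path, PathBuf};

/// Suffix taken from the file names used in this thread; matching on it
/// alone is an assumption of this sketch.
const SCHEMA_SUFFIX: &str = ".cedarschema.json";

/// Keeps only the paths whose file name ends in `.cedarschema.json`,
/// sorted so that the merge order is deterministic.
pub fn select_schema_files<I, P>(candidates: I) -> Vec<PathBuf>
where
    I: IntoIterator<Item = P>,
    P: AsRef<Path>,
{
    let mut found: Vec<PathBuf> = candidates
        .into_iter()
        .map(|p| p.as_ref().to_path_buf())
        .filter(|p| {
            p.file_name()
                .and_then(|name| name.to_str())
                .map_or(false, |name| name.ends_with(SCHEMA_SUFFIX))
        })
        .collect();
    found.sort();
    found
}

/// Lists one directory (non-recursively) and keeps only schema fragments.
pub fn discover_schema_files(dir: &Path) -> std::io::Result<Vec<PathBuf>> {
    let mut entries = Vec::new();
    for entry in std::fs::read_dir(dir)? {
        entries.push(entry?.path());
    }
    Ok(select_schema_files(entries))
}
```

The resulting files could then each be parsed into a `SchemaFragment` and combined with `Schema::from_schema_fragments`, whether that happens in the extension or in a CI/CD step.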

jackwhelpton commented 5 months ago

> What are thoughts on how these would be merged for non-VS Code use? Would you have a CI/CD pipeline merge them into a physical/on-disk file for deployment and run any integration tests based on that? Would you want/expect similar behavior from the Cedar CLI?

I'll admit to being very early in my investigation and voluminous in my ignorance of how the various pieces connect. My aim is to have a sidecar container with access to the policy files (keeping them up-to-date by periodic pulls or, ideally, event-based), and a companion app which submits language-agnostic (but context-dependent) representations of resources for validation against the available policies.

I've yet to map that to the Cedar CLI in detail, but my assumption is that we've moved past schema at this point: the schema files will be used during authoring and (as you say) integration tests for the policy files themselves.

> Or, is auto-magic less preferred to some config setting / file that indicates specifically which *.cedarschema.json files to use?

I think this would be my preferred approach, rather than performing an implicit merge... ideally a policy file would be able to define which schemata should be applied to it, so I could easily include the definitions from other domains where needed, but I wouldn't want to assume every domain should be automatically combined.

For example, we might have an "identity" domain that includes all of our definitions for Users, Roles, etc., but perhaps other concepts like "Task" might be defined very differently in different places.
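
One way to picture the explicit approach is a small manifest that lists, per policy file, the schemata to apply. The format below (`<policy file>: <schema file>, ...`, with `#` comments) is invented purely for illustration, as is the `parse_manifest` helper; nothing in Cedar or the extension defines it.

```rust
use std::collections::BTreeMap;

/// Parses a hypothetical manifest in which each non-empty line reads
/// `<policy file>: <schema file>, <schema file>, ...`.
/// Lines starting with `#` are treated as comments.
pub fn parse_manifest(text: &str) -> Result<BTreeMap<String, Vec<String>>, String> {
    let mut map = BTreeMap::new();
    for (idx, raw) in text.lines().enumerate() {
        let line = raw.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Split the policy file name from its comma-separated schema list.
        let (policy, schemas) = line
            .split_once(':')
            .ok_or_else(|| format!("line {}: expected `policy: schema, ...`", idx + 1))?;
        let schemas: Vec<String> = schemas
            .split(',')
            .map(str::trim)
            .filter(|s| !s.is_empty())
            .map(String::from)
            .collect();
        if schemas.is_empty() {
            return Err(format!("line {}: no schema files listed", idx + 1));
        }
        map.insert(policy.trim().to_string(), schemas);
    }
    Ok(map)
}
```

With such a manifest, orders.cedar could opt in to the identity schema while two domains that define "Task" differently are never combined.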

If we wanted something simpler and a bit more automagical, we could probably make this work using dot-separated naming or subfolders and conventions, so that X.cedar loads schema from cedarschema.json and X.cedarschema.json, and then ensure that User, Role etc. were all defined in that "shared" schema.
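
The naming convention could be sketched roughly as follows. `schema_candidates` is a hypothetical helper, and the sketch assumes the shared cedarschema.json sits in the same folder as the policy file; a subfolder-based layout would need a different lookup.

```rust
use std::path::{Path, PathBuf};

/// For `dir/X.cedar`, returns `dir/cedarschema.json` (shared definitions)
/// followed by `dir/X.cedarschema.json` (domain definitions).
/// Returns `None` when the path does not name a `.cedar` file.
pub fn schema_candidates(policy: &Path) -> Option<[PathBuf; 2]> {
    if policy.extension()?.to_str()? != "cedar" {
        return None;
    }
    let stem = policy.file_stem()?.to_str()?;
    // A bare file name has an empty parent, so joins stay relative.
    let dir = policy.parent().unwrap_or(Path::new(""));
    Some([
        dir.join("cedarschema.json"),
        dir.join(format!("{}.cedarschema.json", stem)),
    ])
}
```

The function only computes candidate paths; whether each file exists, and what to do when the domain schema is missing, is left to the caller.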

john-h-kastner-aws commented 3 weeks ago

We've written an RFC proposing a new mechanism for modular schema libraries: cedar-policy/rfcs#69. If you still want this feature, you can comment there to let us know if it satisfies your use case.

cedar-policy / vscode-cedar