npinaeva opened this pull request 7 months ago
Name | Link |
---|---|
Latest commit | e6e58bc24d2e1f0200881470fa39389077440de8 |
Latest deploy log | https://app.netlify.com/sites/kubernetes-sigs-network-policy-api/deploys/660e9eced212e200081953bc |
Deploy Preview | https://deploy-preview-178--kubernetes-sigs-network-policy-api.netlify.app |
Since this NPEP is now "implementable", let's make sure SIG Network API approvers have a look as well.
/assign @khenidak
/assign @thockin
OK, try on this idea:
There are exactly 2 tenancy use cases: "soft" tenancy, where inter-tenant traffic is denied by default but can be overridden (like a BANP deny), and "hard" tenancy, where inter-tenant traffic is strictly denied (like an ANP deny).
An API that allows exactly those two options and nothing else satisfies all of 4.1-4.4, right?
Going back to what I was saying above about "cluster default policy", what about:
```go
type ClusterDefaultPolicySpec struct {
	// tenancyLabels defines the labels used for dividing namespaces up into
	// tenants. Namespaces that have the same values for all of the labels in
	// tenancyLabels are part of the same tenant. Namespaces that lack at least
	// one of the labels are not part of any tenant. If this is empty then
	// tenancy-based matching is not part of the cluster default policy.
	TenancyLabels []string

	// userNamespaces defines what constitutes a "user namespace" (as
	// opposed to a "system namespace"). If this is not set then
	// user-namespace-based matching is not part of the cluster default policy.
	UserNamespaces *metav1.LabelSelector

	// ingress defines the cluster default policy for pod-ingress traffic
	Ingress ClusterDefaultPolicyRules

	// egress defines the cluster default policy for pod-egress traffic
	Egress ClusterDefaultPolicyRules
}
```
```go
type ClusterDefaultPolicyRules struct {
	// withinUserNamespace defines the default policy for pod-to-pod
	// traffic within a user namespace. This must be nil if the cluster default
	// policy does not define userNamespaces.
	WithinUserNamespace *ClusterDefaultPolicyRule

	// withinTenant defines the default policy for pod-to-pod traffic
	// within a tenant (which was not matched by a previous rule). This
	// must be nil if the cluster default policy does not define
	// tenancyLabels.
	WithinTenant *ClusterDefaultPolicyRule

	// withinPodNetwork defines the default policy for pod-to-pod traffic
	// within the pod network (which was not matched by a previous rule).
	// Note that with some network implementations, it may be inefficient
	// to specify a "withinPodNetwork" rule (rather than just letting the
	// pod network case fall through to "default").
	WithinPodNetwork *ClusterDefaultPolicyRule

	// default defines the default policy for all traffic to a pod (for ingress)
	// or from a pod (for egress) which was not matched by a previous rule.
	// If not specified, this defaults to "DefaultAllow".
	Default *ClusterDefaultPolicyRule
}
```
```go
type ClusterDefaultPolicyRule string

const (
	// DefaultAllow indicates that traffic should be allowed by default, but may
	// be denied by a BaselineAdminNetworkPolicy, NetworkPolicy, or
	// AdminNetworkPolicy. (This is the Kubernetes default for traffic which
	// does not have any other policy specified.)
	ClusterDefaultPolicyDefaultAllow ClusterDefaultPolicyRule = "DefaultAllow"

	// Allow indicates that traffic should be allowed by default, but may
	// be denied by an AdminNetworkPolicy. (Attempting to deny it with a
	// BaselineAdminNetworkPolicy or NetworkPolicy will have no effect.)
	ClusterDefaultPolicyAllow ClusterDefaultPolicyRule = "Allow"

	// DefaultDeny indicates that traffic should be denied by default, but may
	// be allowed by a BaselineAdminNetworkPolicy, NetworkPolicy, or
	// AdminNetworkPolicy.
	ClusterDefaultPolicyDefaultDeny ClusterDefaultPolicyRule = "DefaultDeny"

	// Deny indicates that traffic should be denied by default, but may
	// be allowed by an AdminNetworkPolicy. (Attempting to allow it with a
	// BaselineAdminNetworkPolicy or NetworkPolicy will have no effect.)
	ClusterDefaultPolicyDeny ClusterDefaultPolicyRule = "Deny"
)
```
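The cross-field constraints stated in the doc comments above (withinTenant requires tenancyLabels; withinUserNamespace requires userNamespaces) could be checked mechanically. A minimal sketch with stand-in types — `Spec`, `Rules`, and the `HasUserNamespaces` flag are illustrative, not proposed API:

```go
package main

import (
	"errors"
	"fmt"
)

// Minimal stand-ins for the proposed types (sketch only, not the real API).
type Rule string

type Rules struct {
	WithinUserNamespace *Rule
	WithinTenant        *Rule
	WithinPodNetwork    *Rule
	Default             *Rule
}

type Spec struct {
	TenancyLabels     []string
	HasUserNamespaces bool // stands in for "userNamespaces selector is set"
	Ingress, Egress   Rules
}

// validate enforces the "must be nil if the cluster default policy does not
// define ..." constraints from the doc comments.
func validate(s Spec) error {
	for _, r := range []Rules{s.Ingress, s.Egress} {
		if r.WithinTenant != nil && len(s.TenancyLabels) == 0 {
			return errors.New("withinTenant requires tenancyLabels")
		}
		if r.WithinUserNamespace != nil && !s.HasUserNamespaces {
			return errors.New("withinUserNamespace requires userNamespaces")
		}
	}
	return nil
}

func main() {
	allow := Rule("DefaultAllow")
	bad := Spec{Ingress: Rules{WithinTenant: &allow}}
	ok := Spec{TenancyLabels: []string{"user"}, Ingress: Rules{WithinTenant: &allow}}
	fmt.Println(validate(bad) != nil, validate(ok) == nil)
}
```

In the real API these checks would presumably be CEL validation rules rather than controller code.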
So then, "overridable isolation" is:
```yaml
kind: ClusterDefaultPolicy
name: default
spec:
  tenancyLabels:
  - "user"
  ingress:
    withinTenant: DefaultAllow
    withinPodNetwork: DefaultDeny
  egress:
    withinTenant: DefaultAllow
    withinPodNetwork: DefaultDeny
```
strict isolation is:
```yaml
kind: ClusterDefaultPolicy
name: default
spec:
  tenancyLabels:
  - "user"
  ingress:
    withinTenant: DefaultAllow
    withinPodNetwork: Deny
  egress:
    withinTenant: DefaultAllow
    withinPodNetwork: Deny
```
"I don't care about the traffic within tenants, but I want to make sure traffic is denied to/from everywhere else (including N/S)" is
```yaml
kind: ClusterDefaultPolicy
name: default
spec:
  tenancyLabels:
  - "user"
  ingress:
    withinTenant: DefaultAllow
    default: DefaultDeny
  egress:
    withinTenant: DefaultAllow
    default: DefaultDeny
```
"I don't care about within-cluster traffic, but cluster egress should be blocked by default" is
```yaml
kind: ClusterDefaultPolicy
name: default
spec:
  tenancyLabels:
  - "user"
  ingress:
    # (doesn't need to be specified; this is the default)
    default: DefaultAllow
  egress:
    withinPodNetwork: DefaultAllow
    default: DefaultDeny
```
Though this API isn't quite right because you probably always want to DefaultAllow traffic in system namespaces, and this doesn't really give any way to express that...
Going back to what I was saying above about "cluster default policy", what about:
I find the fact that this struct introduces invisible priorities the most confusing part, because every ClusterDefaultPolicyRules action is applied somewhere between ANP, NP, and BANP (with the priority implied by the action type), but it also has "which was not matched by a previous rule" semantics, which is even more confusing. (But maybe we can make it somehow more explicit.)
"I don't care about the traffic within tenants, but I want to make sure traffic is denied to/from everywhere else (including N/S)" is
```yaml
name: default
spec:
  tenancyLabels:
  - "user"
  ingress:
    withinTenant: DefaultAllow
    default: DefaultDeny
  egress:
    withinTenant: DefaultAllow
    default: DefaultDeny
```
Don't we need Deny instead of DefaultDeny here? I thought this use case wants strict isolation for N/S.
I also didn't quite get the usage/meaning of UserNamespaces, since there were no examples.
@danwinship @astoycos @Dyanngg I tried to sum up provided comments/ideas in a new section called "Another approach". Please take another look and let me know what you think
I find the fact that this struct introduces invisible priorities the most confusing part, because every ClusterDefaultPolicyRules action is applied somewhere between ANP, NP, and BANP (with the priority implied by the action type), but it also has "which was not matched by a previous rule" semantics, which is even more confusing. (But maybe we can make it somehow more explicit.)
Huge +1 here. If we are going to introduce new CRD types here (which it seems like we are), interoperation between tenancy-related objects and AdminNetworkPolicy objects is key. Any tenancy rules should have a clear and explicit priority, so that when there are ANPs on top of these tenancy rules, it is clear which would take effect.
Additionally, I am still not on board with creating a new NetworkTenancy object..... (And I think generally most of the others are of the same opinion.)
+1 to this from my side
I am still not on board with creating a new NetworkTenancy object..... (And I think generally most of the others are of the same opinion)
I agree we should not have tenant-specific objects that mirror a large percentage of the functionality of ANP/BANP. But having an object to just define what tenancy means, which would then be referenced by ANPs/BANPs, would make sense. (Individual ANPs don't define which pods are in which namespaces themselves. Why should they define which namespaces are in which tenants themselves?) Or alternatively, having an object with a very restricted set of tenant policy-setting functionality could make sense.
At the base of this we need an easy way to apply rules to a set of namespaces with the ability to change priority of those rules.
Our user stories do not require the ability to set arbitrary priorities on tenancy-related rules.
ANP + BANP already provide this, if the existing semantics of the API don't work let's add a new subject type AND/OR new peer types if needed.
We already tried putting tenancy into the subject/peer framework, and we didn't like how it came out. Any tenancy system that does not involve any new object types seems like it is going to end up being basically equivalent to what we had in the original API with SameLabels/NotSameLabels. It will be a weird peer type that doesn't behave like any of the other peer types, and it will have the problem that different policies can have different definitions of tenancy, even though that makes the implementation more complicated and there are no user stories for that functionality.
and it will have the problem that different policies can have different definitions of tenancy, even though that makes the implementation more complicated and there are no user stories for that functionality.
In particular... this NPEP started when Nadia pointed out that SameLabels/NotSameLabels tend to result in an explosion of "implementation rules" (i.e. OVN ACLs, iptables rules, etc.), because you need to turn the tenancy-based ANP rule into a set of namespace-based implementation rules.
But if we say that there is only a single definition of tenancy for the entire cluster, then that makes it much easier to represent tenancy as a concept within the implementation rules, allowing for more-optimized tenancy matches. Instead of changing "subject and peer are in the same tenant" into "subject's namespace is A or B and peer's namespace is A or B; or subject's namespace is C and peer's namespace is C; or ...", you can just represent it as "subject's tenant == peer's tenant".
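That "subject's tenant == peer's tenant" representation can be made concrete with a tiny sketch: with one cluster-wide tenancyLabels list, every namespace maps to at most one tenant key, and the tenancy match becomes a single equality check. (`tenantKey` and `sameTenant` are illustrative names, not part of any proposal.)

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// tenantKey derives a single cluster-wide tenant identifier from the values
// of the configured tenancy labels. A namespace missing any tenancy label is
// not part of any tenant (ok == false). Illustrative sketch only.
func tenantKey(tenancyLabels []string, nsLabels map[string]string) (key string, ok bool) {
	keys := append([]string(nil), tenancyLabels...)
	sort.Strings(keys) // deterministic key regardless of declaration order
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		v, found := nsLabels[k]
		if !found {
			return "", false
		}
		parts = append(parts, k+"="+v)
	}
	return strings.Join(parts, ","), true
}

// sameTenant is then just "subject's tenant == peer's tenant", instead of
// expanding into a per-namespace cross-product of rules.
func sameTenant(tenancyLabels []string, subjNS, peerNS map[string]string) bool {
	a, aok := tenantKey(tenancyLabels, subjNS)
	b, bok := tenantKey(tenancyLabels, peerNS)
	return aok && bok && a == b
}

func main() {
	labels := []string{"user"}
	nsA := map[string]string{"user": "alice", "team": "x"}
	nsB := map[string]string{"user": "alice"}
	nsC := map[string]string{"team": "x"} // no tenancy label: no tenant
	fmt.Println(sameTenant(labels, nsA, nsB), sameTenant(labels, nsA, nsC))
}
```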
To give two examples pertaining to non-existent ANP implementations ( :smile: ):

In openshift-sdn, the ID of the source pod's namespace is in reg0 and the ID of the destination pod's namespace is in reg1, so that namespaceSelector-based NetworkPolicies can be implemented by OVS flows that just check reg0 and reg1 rather than needing to be based on pod IPs. The same approach could be used with "tenant IDs". (In fact, the "namespace IDs" originally were "tenant IDs", in openshift-sdn's "multi-tenant" mode.)

An nftables-based implementation could keep a set of the source/destination IP pairs that belong to the same tenant, and implement the tenancy match with a single rule checking { ip saddr . ip daddr } in the tenancy set.

(OK, so actually the nftables example could be done even with "every policy creates its own definition of tenancy", by just creating a separate set for each SameLabels/NotSameLabels expression. But you might be less inclined to implement it that way, because those sets would be big. The openshift-sdn example really requires a single cluster-wide definition of tenancy though, since there are only so many registers, so you can't represent "this pod is part of tenant 12 with respect to ANP X, and part of tenant 378 with respect to ANP Y, and ...")
Let me try to address the main point here: using (B)ANP rules for tenancy.
Current (B)ANP semantics is: subject selects the pods to which all ingress and egress rules are applied.
Tenancy can't be used in this semantics, because it has its own "subject" defined by the tenancy labels.
It doesn't make a lot of sense to allow something like
```yaml
apiVersion: policy.networking.k8s.io/v1alpha1
kind: AdminNetworkPolicy
spec:
  priority: 10
  subject:
    namespaces:
      matchLabels:
        kubernetes.io/metadata.name: sensitive-ns
  ingress:
  - action: Deny
    from:
    - namespaces:
        namespaceSelector: {}
  - action: Allow
    tenantLabels: ["user"]
```
because tenancy doesn't have anything to do with the given subject; it just tries to be injected into the existing priority system.
Therefore, we can only use tenancy in (B)ANP at the same level as subject.
```yaml
kind: AdminNetworkPolicy
spec:
  priority: 10
  # EITHER
  subject: <...>
  ingress: <...>
  egress: <...>
  # OR
  tenant:
    # example contents of tenant, the format is not defined yet
    tenancyLabels:
    - "user"
    ingress: Allow
    egress: Allow
```
This brings up a question about the priority between tenant and ingress/egress. We can either define it (say, tenancy always has higher priority) or only allow setting one part (see the EITHER/OR above). I prefer the first option, but it will require re-prioritization in existing implementations (since we need to reserve 1 extra priority before the ingress and egress rules that wasn't needed before).
If you have other API options that make sense, please let me know, because ^ is the best I can do without adding a new object.
Right. So I'm leaning toward either:
```yaml
kind: TenantDefinition
metadata:
  name: default
spec:
  labels: ["user"]
```
and then (based on your example above):
```yaml
apiVersion: policy.networking.k8s.io/v1alpha1
kind: AdminNetworkPolicy
spec:
  priority: 10
  subject:
    namespaces:
      matchLabels:
        kubernetes.io/metadata.name: sensitive-ns
  ingress:
  - action: Allow
    from:
    - tenant: Same
  - action: Deny
    from:
    - namespaces:
        namespaceSelector: {}
```

(Note the Allow rule has to come before the Deny rule, since ANP rules are evaluated in order.)
which I think is coherent: it says "pods in sensitive namespaces allow ingress from pods in their tenant, but not from pods in other tenants".
```yaml
kind: TenantDefinition
metadata:
  name: default
spec:
  labels: ["user"]
  tenancyType: Soft
```
as per https://github.com/kubernetes-sigs/network-policy-api/pull/178#issuecomment-1847426288, where your only options are "hard tenancy" (inter-tenant traffic is denied like with an ANP) and "soft tenancy" (inter-tenant traffic is denied like with a BANP).
(I still like the "ClusterDefaultNetworkPolicy" idea, but I'm giving up on trying to push bigger changes through.)
We could also start with the second approach, and then if it seems too limited, deprecate the tenancyType field in the next alpha and flip to the first approach instead.
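To make the hard/soft distinction concrete against the existing precedence chain, here is a rough sketch of where each tenancyType's implicit cross-tenant deny would sit, under the semantics described above (hard = denied like an ANP rule, soft = denied like a BANP rule); `resolve` and the Verdict tiers are illustrative only, not an API:

```go
package main

import "fmt"

// Verdict of the highest-precedence matching rule at each policy tier.
type Verdict int

const (
	NoMatch Verdict = iota
	Allow
	Deny
)

// resolve sketches the ANP > NP > BANP precedence chain with the assumed
// tenancyType semantics spliced in: "Hard" denies cross-tenant traffic at
// ANP level (a NetworkPolicy cannot override it), while "Soft" denies it
// at BANP level (a NetworkPolicy can override it).
func resolve(anp, np, banp Verdict, crossTenant bool, tenancyType string) Verdict {
	if anp != NoMatch {
		return anp
	}
	if crossTenant && tenancyType == "Hard" {
		return Deny
	}
	if np != NoMatch {
		return np
	}
	if crossTenant && tenancyType == "Soft" {
		return Deny
	}
	if banp != NoMatch {
		return banp
	}
	return Allow // cluster default: allow
}

func main() {
	// A namespace NetworkPolicy allow overrides Soft tenancy but not Hard:
	fmt.Println(resolve(NoMatch, Allow, NoMatch, true, "Soft") == Allow,
		resolve(NoMatch, Allow, NoMatch, true, "Hard") == Deny)
}
```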
I'd personally vote for External tenant definition, arbitrary rules in ANPs/BANPs
as I don't feel like tenancy definition should be overloaded with priority.
On top of that, I think the above proposal still does not solve the original issue we had with sameLabels: the namespaces selected in the subject field can 1. fall exactly into the tenancy definition, as all of them have the tenancy label(s); 2. intersect with the tenancy definition, with not all of them having the tenancy label(s); or 3. be disjoint from the tenancy definition. We had a hard time agreeing on what we should do with policy subjects when they don't have the tenancy labels (either no-op for these specific namespaces, or group these into a "default" tenant, but none of the solutions were explicit and ideal).
If we are open to tenant definition as separate object, I will propose something like:
```yaml
kind: TenantDefinition
metadata:
  name: by-user
spec:
  labels: ["user"]
```
```yaml
apiVersion: policy.networking.k8s.io/v1alpha1
kind: AdminNetworkPolicy
spec:
  priority: 10
  subject:
    tenantNamespaces: by-user # select namespaces by the label list in the by-user tenant definition
                              # in this case, this is equivalent to {namespaces where label "user" exists}
  ingress:
  - action: Allow
    from:
    - tenant: Same # 'same' and 'notSame' is well-defined here as subjects must have all the label keys
  - action: Deny
    from:
    - namespaces:
        namespaceSelector: {}
```
In this case, the only thing we would need to document is how subject: tenantNamespaces works, which should be pretty straightforward to explain.
We had a hard time agreeing on what we should do to policy subjects when they don't have the tenancy labels (either no-op for these specific namespaces or group these into a "default" tenant, but none of solutions were explicit and ideal).
I'm pretty sure we need to answer this regardless of what approach we go with.
The options are:

- Tenancy rules only apply when the subject namespaces have the tenancy labels, so the example above only works if the "sensitive-ns" namespace has a "user" label. You'd have to change the subject to have matchExpression: [ { key: "kubernetes.io/metadata.name", operator: "In", values: ["sensitive-ns"] }, { key: "user", operator: "Exists" } ].
- Namespaces without the tenancy labels are not part of any tenant, and neither sameTenant nor notSameTenant rules will apply to them (SQL NULL / floating-point NaN semantics). In the example, if "sensitive-ns" did not have a "user" label, then the "Deny" rule would still apply, but the "Allow" rule would not.
- A policy with tenancy rules is simply inert when its subject lacks the tenancy labels: if "sensitive-ns" did not have a "user" label, then the entire ANP would be a no-op.
- Some namespaces aren't part of any tenant, but for tenancy rule purposes, all such namespaces form a tenant with each other (the "nil"/"default" tenant).

On top of that, I think the above proposal still does not solve the original issue we had with sameLabels: the namespaces selected in the subject field can 1. fall exactly into the tenancy definition, as all of them have the tenancy label(s); 2. intersect with the tenancy definition, with not all of them having the tenancy label(s); or 3. be disjoint from the tenancy definition.
Note that this is more of a problem with some of the options than it is with others.
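For illustration, the SQL-NULL/NaN option would behave like this in an implementation — a sketch assuming a single tenancy label for brevity; the helper names are hypothetical:

```go
package main

import "fmt"

// tenantOf returns the tenant value of a namespace, or ok == false when the
// namespace lacks the tenancy label (it is then part of no tenant).
func tenantOf(tenancyLabel string, nsLabels map[string]string) (string, bool) {
	v, ok := nsLabels[tenancyLabel]
	return v, ok
}

// With SQL-NULL-style semantics, a peer matches NEITHER sameTenant nor
// notSameTenant when either namespace has no tenant.
func matchesSameTenant(label string, subj, peer map[string]string) bool {
	a, aok := tenantOf(label, subj)
	b, bok := tenantOf(label, peer)
	return aok && bok && a == b
}

func matchesNotSameTenant(label string, subj, peer map[string]string) bool {
	a, aok := tenantOf(label, subj)
	b, bok := tenantOf(label, peer)
	return aok && bok && a != b
}

func main() {
	sensitive := map[string]string{} // "sensitive-ns" without a "user" label
	peer := map[string]string{"user": "alice"}
	// Neither tenancy rule applies, so in the example above a
	// namespace-based Deny would still match, but the tenant-based
	// Allow would not:
	fmt.Println(matchesSameTenant("user", sensitive, peer),
		matchesNotSameTenant("user", sensitive, peer))
}
```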
@danwinship If you look at my proposal, it is actually trying to add a new field in the AdminNetworkPolicy subject, so that there is no chance that namespaces which aren't part of any tenant are selected in tenancy-rule-based ANPs. Also, tenancy definitions do not have to be singletons IMO. I think @astoycos commented with the exact same things in https://github.com/kubernetes-sigs/network-policy-api/pull/178/files#r1408291322
Edit: a caveat with this approach is that some CEL rules are needed to make sure that tenant rules can only be present in ingress/egress when the subject is specified by tenantNamespaces.
I think we agreed some time ago that we want to simplify tenancy subject selection as much as possible, which ended up unifying tenancy labels with subject selector (because we didn't have other use cases), so we wanted to avoid "pods in sensitive namespaces allow ingress from pods in their tenant, but not from pods in other tenants". We could have something like
```yaml
apiVersion: policy.networking.k8s.io/v1alpha1
kind: AdminNetworkPolicy
spec:
  priority: 10
  tenancySubject:
    labels: ["user"]
  ingress:
  - action: Allow
    from:
    - tenant: Same # 'same' and 'notSame' is well-defined here as subjects must have all the label keys
  - action: Deny
    from:
    - namespaces:
        namespaceSelector: {}
```
and either make tenancySubject and subject mutually exclusive, and only allow tenancy gress rules with the former, or allow both at the same time and say that tenancy rules have their own subject. The second option is closer to @danwinship's suggestion, but clearer because tenancy has its own subject without extra selectors, and the first one is closer to @Dyanngg's option.
Any option has this either/or switch to turn tenancy mode on, which is a bit ugly, but may be worth it to avoid extra CRDs.
It could also be
```yaml
kind: AdminNetworkPolicy
spec:
  subject:
    # EITHER
    tenantNamespaces:
      labels: ["user"]
      # Enables sameTenant rules
    # OR
    podSelector/namespaceSelector:
```
I am pretty sure if we don't want tenancy to have its own priority, then we can avoid creating an extra CRD at all
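That either/or subject could be modeled as a union with a cross-field check, along these lines (a sketch; the field names are assumed, not the real API):

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-ins for the sketched union subject (field names assumed).
type TenantNamespaces struct{ Labels []string }
type NamespaceSelector struct{ MatchLabels map[string]string }

type Subject struct {
	TenantNamespaces  *TenantNamespaces
	NamespaceSelector *NamespaceSelector
}

// Exactly one union member must be set, and tenant peers (sameTenant etc.)
// are only legal when the subject is tenantNamespaces.
func (s Subject) validate(hasTenantPeers bool) error {
	set := 0
	if s.TenantNamespaces != nil {
		set++
	}
	if s.NamespaceSelector != nil {
		set++
	}
	if set != 1 {
		return errors.New("exactly one of tenantNamespaces or namespaceSelector must be set")
	}
	if hasTenantPeers && s.TenantNamespaces == nil {
		return errors.New("tenant peers require a tenantNamespaces subject")
	}
	return nil
}

func main() {
	plain := Subject{NamespaceSelector: &NamespaceSelector{}}
	tenanted := Subject{TenantNamespaces: &TenantNamespaces{Labels: []string{"user"}}}
	fmt.Println(plain.validate(true) != nil, tenanted.validate(true) == nil)
}
```

In the actual CRD this would likely be expressed as CEL validation rules on the subject rather than Go code.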
I'd personally vote for External tenant definition, arbitrary rules in ANPs/BANPs as I don't feel like tenancy definition should be overloaded with priority.
I'm fairly aligned with @Dyanngg and @danwinship here, but I also don't hate what @npinaeva just presented; as per usual, the smaller API change should be "easier" to get done, as long as we can express the user stories with it.
Additionally, as it pertains to the "pods not explicitly in a tenant" suggestions that @danwinship presented above, I'd lean towards
- Some namespaces aren't part of any tenant, but for tenancy rule purposes, all such namespaces form a tenant with each other (the "nil"/"default" tenant)
IMO one of the main use cases that will be used/desired will be "Let's build a tenant and stop it from communicating with ALL other non-same tenants/pods", and I think this option would be the easiest to parse/understand for users wanting to build tenants.
So I would like
```yaml
apiVersion: policy.networking.k8s.io/v1alpha1
kind: AdminNetworkPolicy
spec:
  priority: 10
  subject:
    tenantNamespaces: by-user # select namespaces by the label list in the by-user tenant definition
                              # in this case, this is equivalent to {namespaces where label "user" exists}
  ingress:
  - action: Allow
    from:
    - tenant: Same # 'same' and 'notSame' is well-defined here as subjects must have all the label keys
  - action: Deny
    from:
    - namespaces:
        namespaceSelector: {}
```
To mean the same as
```yaml
apiVersion: policy.networking.k8s.io/v1alpha1
kind: AdminNetworkPolicy
spec:
  priority: 10
  subject:
    tenantNamespaces: by-user # select namespaces by the label list in the by-user tenant definition
                              # in this case, this is equivalent to {namespaces where label "user" exists}
  ingress:
  - action: Allow
    from:
    - tenant: Same # 'same' and 'notSame' is well-defined here as subjects must have all the label keys
  - action: Deny
    from:
    - tenant: NotSame
```
Otherwise I don't want to muddle up the conversation, as I feel like you've all presented really great opinions and points, and I think we're close/headed in the right direction :) It seems like we're zeroing in on something like Dan's
- A single external tenant definition, arbitrary rules in ANPs/BANPs:
Can we try and decide on a way forward before Friday? I'd love to start working on an actual implementation PR
Any option has this either/or switch to turn tenancy mode on, which is a bit ugly, but may be worth it to avoid extra CRDs.
Well, we were avoiding extra CRDs because that was ugly, so if we agree that the "either/or switch" is ugly then we should ask ourselves if extra CRDs are less ugly...
I updated the NPEP; it seems like most suggestions fall into one of the mentioned categories in the "Priorities" subsection. From what I understood from the meeting feedback, creating or not creating a new CRD is the biggest dilemma. Here are the main cons for both options:
- If we DON'T create a new CRD, we will have some conditional "switch" in the (B)ANP that enables tenant peers only if the subject has the tenancy type.
- If we DO create a new CRD, we will need to inject a new priority into existing implementations. It can be done either by "reserving" a priority in the (B)ANP stack for tenancy (probably 1 priority before ANP and BANP), or by re-using the same priority as (B)ANP for the tenancy object, thus injecting it into the existing priority range.
@npinaeva If we go with @danwinship's
1. A single external tenant definition, arbitrary rules in ANPs/BANPs:
^^ https://github.com/kubernetes-sigs/network-policy-api/pull/178#issuecomment-1930447439
Why do we need to "inject a new priority"? Wouldn't priority just be governed by the ANP which references the TenantDefinition?
@npinaeva If we go with @danwinship's 1. A single external tenant definition, arbitrary rules in ANPs/BANPs: ^^ #178 (comment)
Why do we need to "inject a new priority"? Wouldn't priority just be governed by the ANP which references the TenantDefinition?
We discussed the disadvantages of this approach in the following comments, reflected here https://github.com/kubernetes-sigs/network-policy-api/pull/178/files#diff-4bfb2f198b356812cfea399edc57d4232a6ec7cefcdc221b2fb8531a03360d87R284-R285. It is almost the same as what we have right now, and has the same problems we are trying to solve here.
OK thanks for clarifying @npinaeva :)
Latest update based on the feedback from the SIG meeting on Feb 13.
Pros and cons for every priority option are updated; check the "Final suggestion" section to discuss the fields of the final CRD version.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: astoycos, npinaeva
The full list of commands accepted by this bot can be found here.
The pull request process is described here
OK, Nadia and I met to talk about this.
I don't really like the current "Final suggestion" as proposed, because it makes the common cases complicated because you need both a Tenancy object and a separate ANP or BANP. But as Nadia points out, my version of the proposal doesn't implement all of the user stories we previously agreed upon. But if we do actually agree on all of those user stories, then the API I had proposed wasn't a good API and we should do it differently. :slightly_frowning_face:
Part of the problem is that the existing user stories aren't entirely clear (eg, there were several places where they talk about "allowing" traffic, which I interpreted as "don't interfere with", but which in Nadia's examples was interpreted as "action: Allow"). And, eg, Nadia's example for 4.3 used an ANP-level Allow rule, which seemed bizarre to me, but it makes a lot more sense if it was supposed to be BANP-level (which is a use case that Nadia brought up several times in our call: basically, "I want default-deny for everything except within-tenant traffic"). (But even with that interpretation it's still not implementable with my proposed API.)
So maybe we need to revisit the user stories again, at least to clarify the wording. It might also help to explicitly indicate the expected results for every interesting pair of pods in each case:
In particular, I feel (though Nadia does not) that some of the user stories lead to "bad" results that probably aren't what any administrator would actually want. (eg, "I want default-deny for everything except within-tenant traffic" means that pod to pod in a tenant namespace is allowed by default, but pod to pod in a non-tenant namespace is denied by default. Nadia says they would just have to create NPs in the non-tenant namespaces in that case, but again, to me that seems like a failure of API design on our part; we should have figured out some way to give them the behavior they wanted in both tenanted and non-tenanted namespaces all at once...)
Anyway, I'm on PTO next week so feel free to merge anything you want while I'm not around to object. :slightly_smiling_face:
we should have figured out some way to give them the behavior they wanted in both tenanted and non-tenanted namespaces all at once...
This can be done by assigning every namespace that is not part of a tenant a unique tenancy label, making it a one-namespace tenant. Potentially, an always-allow-same-namespace policy could be expressed as tenancy with a kubernetes.io/metadata.name tenancy label, but if it needs to be combined with another tenancy... well, we need 2 tenancies then (and they would need to be allow-based, not deny-based).
Tracking issue https://github.com/kubernetes-sigs/network-policy-api/issues/122