Closed davidhadas closed 11 months ago
As an operator why would I choose minimal
over enabled
?
As an operator why would I choose
minimal
overenabled
?
Minimal
equals what we have had as "internal-encryption" (which we deprecated). This is the first TLS support we will be able to release. Once we complete the work for Enabled (all ingress types etc.), enabled
will be recommended to replace it.
You will add dataplane-trust = "identity"
as well, won't you?
You will add dataplane-trust = "identity" as well, won't you?
The code also seems to support what we need for a future dataplane-trust = "identity" since identity adds on top of mutual trust. The main difference between the two appears at the ingress. However, the focus now is on TLS and verifying identities at each hop.
The suggested wip #13969 treats:
So identity and mutual are the same at this point when we make changes to the activator and queue
Minimal equals what we have had as "internal-encryption" (which we deprecated).
Given internal encryption is 'alpha' I wonder if we skip supporting Minimal
in our redux. It might be better to just remove it now.
@nak3 do you have input on this feature's usage
Yeah, I think it is alright to replace current internal-encryption with enabled
if the new configuration will be implemented smoothly.
@nak3
Yeah, I think it is alright to replace current internal-encryption with enabled if the new configuration will be implemented smoothly.
I can't see a way to ensure that the new configuration will be implemented smoothly at this point.
Minimal
is our only option for enabling support without mandating an activator on-route for all requests.
Enabled
without mandating an activator on-route is not something I expect to be available any time soon.
Moving the original internal-encryption
forward (now renamed as Minimal
) will allow us to introduce end-to-end encryption and move it from Alpha to Beta as soon as the PRs for activator and queue are approved. The other trust levels will evolve at a later time.
If the new configuration will not be implemented smoothly, I think we should keep Minimal
.
With said, for moving it from Alpha to Beta, we will add Enabled
level configuration as per https://github.com/knative/serving/issues/11906 ? We will need to re-discuss about Beta criteria if we change the criteria.
With said, for moving it from Alpha to Beta, we will add
Enabled
level configuration as per #11906 ? We will need to re-discuss about Beta criteria if we change the criteria.
internal-encryption
which is now called Minimal
.Minimal
Support for Minimal
is therefore in line with the defined criteria, removing "Minimal" means we no longer meet the criteria indicated in #11906.
Sorry I referred to this specific comment https://github.com/knative/serving/issues/11906#issuecomment-1415654516
With said, for moving it from Alpha to Beta, we will add
Enabled
level configuration as per #11906 ? We will need to re-discuss about Beta criteria if we change the criteria.
When you say Enabled
level configuration, do you mean enabling encryption by default? I think that's a good target for Beta, though some KIngress implementations (gateway API) may need to disable for now.
Sorry I misunderstood about Implement SNI in activator to avoid activator always in path
. David explained about it and it is clear now.
We talked a bit about this in the security WG meeting. To summarize, we believe that all the following are true:
minimal
version, which uses the same certificate across all namespacse. We may want to pick a different name that's slighly less confusing, something like shared
or bootstrap
.
target-burst-capacity
. Activator bypass is currently disabled when TLS is enabled.enabled
and high-volume
, we either need ingresses to support requesting multiple subjects, or to implement SNI on the activator
(or to generate a huge number of subjects on the activator cert).Given that we're currently publishing support for the minimal
level, we need a path forward with compatibility for any existing adopters. The security WG is recommending that we do the following upgrade path, probably in separate releases:
minimal -> enabled -> high-volume
Once enabled
has been rolled out, we should be able to retire the minimal
feature in the following release, possibly coordinated with the release of the high-volume
feature.
Does that make sense? Is there other data that we're missing?
To clarify, the recommendation for 2023, coming out of the recent Security WG meeting is:
Minimal
while supporting high-volume
as our default GA - making this the default for Serving is a huge step forward in the security offered to the Knative community. Trust=Minimal
is only slightly less secure than Trust=Enabled
and on the other hand, offers a significant advantage over the current Non-TLS support. high-volume
, we recommend to also support Trust=Enabled
as a configuration option. When used, this option should disable high-volume
support.
Both the recommended default (Trust=Minimal
with high-volume
), and the more secure option (Trust=Enabled
without high-volume
) are within reach in the near future and can be GAed in 2023. As a future plan, we recommend to continue making an effort to support Trust=Enabled
with high-volume
support. Such work may depend on support from Ingress GW API and/or other external dependencies which are unlikely to be resolved at any near future. Hence such support may or may not be available in 2024/5/6/... Even without such work, the move to end to end encryption with Trust=Minimal
is a significant step forward in the security offered to the community.
Given that we're currently publishing support for the minimal level, we need a path forward with compatibility for any existing adopters.
This is where I disagree. Only Kourier supports the internal encryption feature. Contour has code but we have zero test coverage. This feature is alpha and we have affordances to forgo the current implementation and skip this complex migration. Additionally, there is no documentation on the website about this feature.
high-volume
I'm confused is this a configuration option the user sets explicitly?
This seems unnecessary for Istio and Kourier because they can support multiple SANs for a backend the activator should just move in and out of the data path when Trust=enabled
For GatewayAPI & Contour we should default the activator to always be in the data path when trust=Enabled
, document this clearly on the website, and then enable the activator movement transparently for the user when those implementations support multiple SAN in the future.
I'm not convinced that we should support trust=Minimal
in order to have 'high-volume' support - we have two ingress implementations end-users can use to achieve their security & volume requirements.
As indicated above the Security WG opted to make TLS our default sooner rather than later (and for very good reasons). From our analysis, the path for getting to TLS as the Serving default is using Minimal
. Also, we see the need to support Minimal
for many years to come and do not see an option to have any other TLS default for Serving.
@dprotaso (and please correct me if I am misinterpreted your thoughts) clearly prefers to drop Minimal
and have TLS supported with Enabled
. @dprotaso sees Minimal
as a transient phase that is confusing and need to be dropped.
I think we all agree on the facts (I listed everything I could collect below), and that we have two options:
Minimal
Enabled
Minimal
Minimal
as our default in Serving for many years to comeEnabled
I believe we all agree on the many items below (please speak out if not):
Minimal
is dramatically more secure than non-TLS option. It offers data encryption in-transit and prevents MITM for an attacker that can manipulate the cluster network but have no access to any of the cluster secrets.Enabled
is more secure than Minimal
as it prevents a certain attack vector where the attacker has access to one secret in one namespace and can manipulate the the cluster network to perform MITM attack. (With Enabled
the attacker needs access to the specific secret on the namespace being attacked).Minimal
is doable in a relatively short time, and is a manageable effort - (seems as if we could have been there already if it was not for the disagreement as for the path forward). Minimal
supports high-volume
for all ingresses - The term high-volume
refers to the case where we do not mandate activator in path for all requests, and instead support direct communication between the ingress and queue. Minimal
to GA and make it the default for Serving. Enabled
with high-volume
is apparently doable at present in Kourier
and Istio
but we did not POC it and it represent another gap we need to close to reach GA with Enabled
Enabled
with high-volume
is apparently not doable in Contour
, Ingress GW
any time soon. Instead what we will be required to do is to enforce all requests to go via the activator for these ingresses. Enabled
as our default before we can solve how we can support high-volume
for all ingresses - (not an option I expect we will have in the next 1,2,3... years). This means we will continue to use non-TLS as our default. Minimal
, Enabled
, Mutual
, ... and merged a number of PRs in a couple of packages. We also created #14063 and #13969 for Serving for that purpose. Still, although we seem to all agree on the above list, we disagree on the question of what should we do.
I am opting for a decision (one way or another) such that I can clean my desk form this effort as I am gradually focusing more and more on Zero-Trust aspects of K8s and Knative.
@evankanderson, @dprotaso, @nainaz @ReToCode @skonto @nak3, @rhuss, @psschwei
We do not identify anything preventing us to move Minimal to GA and make it the default for Serving. ... I am opting for a decision (one way or another) such that I can clean my desk form this effort as I am gradually focusing more and more on Zero-Trust aspects of K8s and Knative.
I've been pretty clear that we should drop Minimal
for the past few months now. I don't understand why we're ignoring my feedback and revisiting this discussion every two weeks.
Closing the gaps to reach a GA with Minimal is doable in a relatively short time, and is a manageable effort - (seems as if we could have been there already if it was not for the disagreement as for the path forward).
Like I stated above - Enabled
should be fairly easy to support today since multiple upstream SAN are supported by Kourier & Istio. Users would then have two options to run Knative securely and where the activator doesn't need to be in the data plane (aka. high-volume).
For Contour/Gateway API we can wait until their API can support such a change. It would be great if folks could engage/discuss/contribute in those respective projects and communities.
I've been pretty clear that we should drop Minimal for the past few months now. I don't understand why we're ignoring my feedback and revisiting this discussion every two weeks.
@dprotaso - No-one ignores your feedback. In parallel, no-one seem to be able to make sense as for why Option A is your choice over Option B - I have talked to almost anyone related to this effort to try to figure this out.
Since this is going nowhere, lets close the 2 respective PRs, consider it a POC that others can use as reference future Serving TLS support.
no-one seem to be able to make sense as for why Option A is your choice over Option B
What's not clear about my position?
I've stated previously:
Enabled
is not suitable to be used as Serving default before all ingresses can support Enabled
with high-volume
.
It is very possible it will never be suitable as the default (e.g. if Gateway API will not support it). This is why Security WG chose after much deliberation to keep the Minimal
option.
Minimal
in many ways was designed as a subset of Enabled
, same as Enabled
being a subset of Mutual
and Mutual
being subset of Identity
etc. - the claim about the extra work/testing/maintenance to support Minimal
is not accurate and at the same time one can make similar arguments to remove Enabled
and make Mutual
the default, or maybe remove Mutual
also and make Identity
(which is not yet designed) to be the default.
It boils down to 'A Bird in the Hand is Worth Two in the Bush'.
This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen
. Mark the issue as
fresh by adding the comment /remove-lifecycle stale
.
See serving #11906 See https://docs.google.com/document/d/1XE7UzgQlVVtAb7ULSqOyKCaIHtm8zMF35ainp1JmwyY/
This issue focuses on adding
DataPlan-Trust
support for Activator and Queue including options for: dataplane-trust = "minimal" (common names for all namespaces) dataplane-trust = "enabled" (per namespace) dataplane-trust = "mutual" mTLSIt includes the necessary changes needed for: