Open DaveCTurner opened 5 months ago
Pinging @elastic/es-security (Team:Security)
Previously we did have an allowlist for actions on the server side, and we decided to remove it in #96936 because we considered it unnecessary given the available security controls. See also ES-6295. I personally don't see a strong need for it. That said, if the newly introduced `RemoteClusterActionType` class can be relied upon for checking whether an action is cross-cluster worthy, I don't mind adding such a check on the server side for extra filtering, since it seems to be much lower effort than maintaining an allowlist of individual actions.
++ to what @ywangd said. An allowlist comes with a maintenance burden, which is why we removed it. If we can simply check for `RemoteClusterActionType` though, the added layer of protection against unsupported actions could be worth it.
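To make the suggested check concrete, here is a minimal sketch of a server-side filter that rejects any action not marked as remote-cluster capable. The class and method names are hypothetical, and the set of names is assumed to be collected from the `RemoteClusterActionType` instances at registration time rather than maintained by hand; how that collection would actually be wired up in Elasticsearch is not shown.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Elasticsearch.
class RemoteActionFilterSketch {
    // Assumption: populated once per remote-cluster action type at registration time.
    private final Set<String> remoteClusterActionNames = new HashSet<>();

    void markRemoteClusterAction(String actionName) {
        remoteClusterActionNames.add(actionName);
    }

    // True only for actions that were explicitly marked as remote-cluster capable.
    boolean isAllowedFromRemoteCluster(String actionName) {
        return remoteClusterActionNames.contains(actionName);
    }

    // Rejects a request arriving on the remote-cluster interface for any other action.
    void ensureAllowed(String actionName) {
        if (!isAllowedFromRemoteCluster(actionName)) {
            throw new IllegalArgumentException(
                "action [" + actionName + "] is not available on the remote-cluster interface");
        }
    }

    public static void main(String[] args) {
        RemoteActionFilterSketch filter = new RemoteActionFilterSketch();
        filter.markRemoteClusterAction("example:remote/ok"); // made-up action name
        filter.ensureAllowed("example:remote/ok");           // passes silently
        System.out.println(filter.isAllowedFromRemoteCluster("example:local/only"));
    }
}
```

Because the set is derived from the action types themselves, there would be no separate list for anyone to keep in sync, which is the point of preferring this over the removed allowlist.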
I was thinking that since we now have a clean distinction between remote-cluster and local-cluster actions, we could reasonably create an entirely separate `org.elasticsearch.transport.Transport.RequestHandlers` for remote-cluster requests. That way the maintenance burden doesn't fall on the security team; it's on the developer adding a new remote-cluster action to register the handler appropriately.
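As a rough sketch of the separate-registry idea: the class below is a simplified, hypothetical stand-in and not the real `Transport.RequestHandlers`. It also assumes that a remote-cluster action stays invocable from within the cluster, and that the transport layer can tell which interface a request arrived on.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in: handlers are modelled as String -> String functions.
class SplitRequestHandlersSketch {
    private final Map<String, Function<String, String>> localHandlers = new ConcurrentHashMap<>();
    private final Map<String, Function<String, String>> remoteClusterHandlers = new ConcurrentHashMap<>();

    // Ordinary actions are only reachable from nodes in the same cluster.
    void registerLocal(String action, Function<String, String> handler) {
        localHandlers.put(action, handler);
    }

    // Assumption: remote-cluster actions are reachable on both interfaces.
    void registerRemoteCluster(String action, Function<String, String> handler) {
        localHandlers.put(action, handler);
        remoteClusterHandlers.put(action, handler);
    }

    // Looks up the handler in the registry that matches the interface the request arrived on.
    Optional<Function<String, String>> resolve(String action, boolean fromRemoteClusterInterface) {
        Map<String, Function<String, String>> registry =
            fromRemoteClusterInterface ? remoteClusterHandlers : localHandlers;
        return Optional.ofNullable(registry.get(action));
    }

    public static void main(String[] args) {
        SplitRequestHandlersSketch handlers = new SplitRequestHandlersSketch();
        handlers.registerLocal("example:local/only", req -> "local:" + req);     // made-up names
        handlers.registerRemoteCluster("example:remote/ok", req -> "remote:" + req);
        System.out.println(handlers.resolve("example:local/only", true).isPresent());
        System.out.println(handlers.resolve("example:remote/ok", true).isPresent());
    }
}
```

With this shape, an action that was never registered for the remote-cluster interface simply has no handler there, so the decision is made by whoever registers the action rather than by a separately maintained list.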
Just as another point here, the cost of authorization on every inter-node request is not totally trivial, but it's necessary today because we cannot tell whether the request is coming from within the cluster or outside. If we could be certain that sub-requests (shard-level bulks and individual search phases) were only happening from other nodes in the same cluster, and therefore already properly authorized, we could potentially skip that authn workload.
> If we could be certain that sub-requests (shard-level bulks and individual search phases) were only happening from other nodes in the same cluster, and therefore already properly authorized, we could potentially skip that authn workload.
We already have such an optimization, but only for the search action, where we are sure that sub-actions only access the subset of indices that were already authorized. See https://github.com/elastic/elasticsearch/pull/91886
Interesting. I opened https://github.com/elastic/elasticsearch/issues/107195.
Today we expose all 732[^1] registered transport actions on the RCS 2.0 interface. However in practice there are only 24[^2] actions which a real Elasticsearch node will invoke on a remote cluster via this interface. Although the usual security model applies to all these unused actions, this is still an unnecessarily large surface to expose to a somewhat-untrustworthy remote cluster which could in principle be modified to invoke an unexpected action. In the spirit of defense-in-depth we should restrict the actions available over RCS 2.0 to just the ones that are needed.
[^1]: At time of writing.
[^2]: At time of writing; see calls to the constructor of `org.elasticsearch.action.RemoteClusterActionType`.