Open andreclaro opened 1 year ago
Internal-frontend is for use by the server worker role and history/matching roles only. All calls from clients should go through the normal frontend for proper authorization. In particular, there are no internal calls to RegisterNamespace, so that just shouldn't be happening.
I see the migration workflow (running on server worker) uses UpdateNamespace, which would trigger this. Are you using the migration workflow?
It looks like this is only a problem if archival is enabled on the namespace.
All calls from clients should go through the normal frontend for proper authorization.
In this case, I think André read the patch notes here about the internal frontend role and thought that it meant you could use it to bypass auth for admin requests as well. However, it seems like this is something we didn't intend to support, because this was originally just for simplifying inter-node auth.
I think I see two ways we can go about this:
To me, this originally looked like a bug because we were propagating the "wrong" service name around, but if that's because we have, as a policy, that no one should be calling these APIs on the internal frontend, then it's not a bug. I think we should have a clearer error message, though. Maybe we can do something like check the request metadata to see if calls to the internal-frontend are from another service, and, if not, return an error message? Not as a method of protection, because it could obviously be spoofed, but as a way to detect misuse and inform the user.
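A minimal sketch of that misuse check, to make the idea concrete. It assumes callers identify themselves through a metadata header; the header name `caller-service`, the role list, and the function name are hypothetical and not taken from the Temporal codebase. The metadata is modeled as a plain `map[string][]string`, which is the same shape as gRPC metadata, so the example runs with the standard library alone.

```go
package main

import (
	"errors"
	"fmt"
)

// callerHeader is a hypothetical metadata key naming the calling service.
const callerHeader = "caller-service"

// internalCallers lists the roles that are expected to use internal-frontend
// (assumed names, based on the roles mentioned in this thread).
var internalCallers = map[string]bool{
	"worker":   true,
	"history":  true,
	"matching": true,
}

var errNotInternalCaller = errors.New(
	"internal-frontend is for server roles only; send client requests to the regular frontend")

// checkInternalCaller inspects request metadata and returns a descriptive
// error when the caller does not identify itself as an internal role.
// This is a misuse hint, not a security control: the header can be spoofed.
func checkInternalCaller(md map[string][]string) error {
	for _, v := range md[callerHeader] {
		if internalCallers[v] {
			return nil
		}
	}
	return errNotInternalCaller
}

func main() {
	// A call that claims to come from the worker role passes the check.
	fmt.Println(checkInternalCaller(map[string][]string{callerHeader: {"worker"}}))
	// A call with no caller header gets the explanatory error.
	fmt.Println(checkInternalCaller(map[string][]string{}))
}
```

In a real server this check would presumably sit in a request interceptor on the internal-frontend only, so the regular frontend is unaffected.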
In addition, it seems like no other nodes are calling RegisterNamespace on the internal-frontend (which makes sense because this is not something I see us automating). However, if we do in the future, it might make sense to fix this service name issue now (separately from the internal-frontend misuse error message change).
So, @dnr, I think we should do this:
Also, @andreclaro, would you mind elaborating on your use case? Why don't you want to use the frontend service here?
The point about the migration workflow using UpdateNamespace means we do need to fix this.
But to be clear, external clients must not be making calls to internal-frontend. That must be prevented (with network restrictions or mTLS or both), or else there's no point to it.
First of all, thank you for the quick response!
We are using TLS (mTLS to be enabled later), network policies and we are now enabling authorization. We also have archival enabled for both history and visibility.
I totally understand that we shouldn't allow external services to access the internal-frontend.
My initial idea was to use the internal-frontend for administration tasks whenever required (only accessible by administrators who can exec into the internal-frontend service). However, we are planning to build a service that performs those tasks by getting a JWT token from our authorization service.
The other new use case is to use the temporal-operator to manage namespaces, but it seems that this operator does not currently support JWT tokens.
I see. I don't think we'd go out of our way to break calling internal-frontend with tctl/temporal cli, but it's not what it's intended for, in the same way that you generally wouldn't do RPC calls directly to history or matching services. I'd recommend setting up proper authorization and doing administrative tasks through the regular frontend.
Yes, that makes sense. What about point 2 ("It looks this is only a problem if archival is enabled on the namespace.")? Are you going to fix this? Thanks!
Expected Behavior
Register/Update Namespace and other methods should be allowed when connecting to the internal-frontend.
Actual Behavior
Currently, several actions can only be done through the Frontend service. If we connect to the internal-frontend and try to perform those actions, such as Register/Update Namespace, we get the following error:
unable to find bootstrap container for the given service name
As @MichaelSnowden mentioned:
Full error logs:
Steps to Reproduce the Problem
Specifications