Closed petemiron closed 6 months ago
Hi @petemiron, do you have any plan to implement this requirement?
Hi @wenzheng, do you have feedback on this? Many of the ideas stem from your Pull Request, but we've tried to balance performance with the flexibility in your suggestion in these requirements. If you agree and would like to modify your PR to suit these requirements, we'd happily review it. We do think this is a great idea, but our core team doesn't have the bandwidth to implement at this point. We just wanted to make sure to capture the suggestions as a set of requirements.
Hi @petemiron
I see our colleague @firebook commented on the previous PR #429. Yes, I think it would be possible for us to modify the PR to suit the requirements. I will talk to our team and see when we can make this happen.
👍 for this... This would enable us to provide NATS as a brokered service to our apps running on Cloudfoundry. In fact I could live with a very minimal implementation as in #428.
Hi, what is the current status of this issue? We are going to use NATS to connect >50k devices from "outer space" and are not happy with updating/reloading config files.
@petemiron @derekcollison It's clear that performance and reliability are top priorities when reviewing an external auth feature. But what about having a user auth service internally connected to NATS with system subscriptions? Have you discussed this possibility already? Are there any blockers to implementing it? The idea is to have NATS client(s) serve as the auth provider. I picture it as additional subscriptions and/or maybe a protocol message, so that auth provider(s) can announce when they are ready and be registered by gnatsd.
I believe reloading configuration files, a WIP, will solve the majority of needs here.
@derekcollison At least for our use-case (automatic provisioning of nats subject subtrees to apps on Cloudfoundry & Kubernetes via a service-broker) it feels awkward and error prone to generate nats configs on multiple nodes and then to rely on config reloads.
I would imagine that the process would be automated, where config files are properly generated, updated, and distributed securely, and the servers are signalled properly to reload the configuration.
It's certainly doable, but something like #428 would be so much simpler and less error prone (and without timing issues - the state of the authn/authz can always authoritatively be answered by the external entity and is not "in flux" during the regeneration/reload of the config).
I think the idea has merit, however it is not complete. For instance, when a user is removed, or permissions updated, these cases are covered by configuration reload, but are currently not accounted for in #428. The synchronization issues are the same in both cases IMO as well as error handling and exceptions.
A removed user or updated permission could be tracked with an auth TTL (with some configured period): if the authorization is no longer available or has changed, close the connection to force the client to reconnect. If one doesn't want a TTL because of the additional traffic, there should be a way to receive a message from the auth provider and kill the connection. It could be a webhook, or a subscription if the auth provider is a NATS client.
Can you clarify on the timeline? I can only chime in, that this is much sought after.
something like this would be great, and I imagine quite simple: https://github.com/rabbitmq/rabbitmq-auth-backend-http
any progress?
We are moving forward with this through our Nkey and JWT work. Will keep everyone posted as best we can. Look for something before end of year.
Maybe you have some testing code in a branch? I'd really like to check this out.
Here is where we are right now. We have nkey support and account isolation and sharing for a single server. Next up is adding account support to clusters, which begins this week. My target is to be done by end of week. After that is non-server defined configuration, which will involve work that affects this issue.
FWIW we do the following in BOSH https://github.com/bosh-dep-forks/gnatsd/blob/bosh-1.3.0/auth/certificate_auth.go. We use the cert itself for auth as we need to solve a catch-22 for auth. It works well for us.
We would move to the new pluggable interface when it's completed, thanks for your work on it.
Hello, I was following this issue a year ago, but at that time I managed to avoid requiring authorization in gnatsd. I know this can sound annoying and I apologize for that, but when could we expect this in the master branch? I'm not sure whether to start working on some wrapper for auth or wait for this (ideally I would want to use this implementation).
You can do custom authentication now. We do not have a call out option yet. We have added nkeys and are adding decentralized management of JWTs based on nkeys. Will be adding pulling user from x509 cert as well. Which specific problem are you looking to solve?
I need per user control to specify which subjects it can subscribe and publish to.
That already exists today.
Sorry, I forgot to mention that user management is dynamic, they can come and go. I am aware of the file-based configuration (https://www.nats.io/documentation/server/gnatsd-authorization/). Is there an API available (embedded version would work as well as my backend is written in Go and I can start the server internally) on the server which I could use to add/remove users?
Not yet, but one may show up soon. Also note that changes to a config file can be reloaded without server restart with gnatsd -sl reload
Maybe it's just me, but I read all of rusenask's comments in the context of external authentication and authorization, and the replies didn't seem to directly address that.
I've been following this issue for a while, hoping I could transition to NATS once there's a solution for fine-grained, dynamic run-time authorization. Is there no concrete plan to work towards this functionality, either via call-out or some internal system?
As I understand it, in order for me to make gnatsd -sl reload work for this, I'd need a system to manage client permissions with my own semantic structure, generate a format consumable by gnatsd, distribute that file to members of the cluster, then run reload on them. Is there an official way of doing this?
Currently there is not a way to do fine grained permissions by having the server do a callout. We will be releasing code that makes the current process work in a kubernetes environment by using the internal methodology of gnatsd as it is today. That will just be abstracted out and you will see a REST gateway to add/remove/change users and their permissions.
With Nkeys and JWTs, these are managed outside of the server and do not require server configuration changes or restarts to add or remove users or update their permissions.
is it going to be a sidecar that keeps the file synced? :)
For kubernetes yes most likely. Will automatically handle updating the config and synching across pods via secrets and having servers reload the file automatically.
Ah, makes sense. I initially thought of writing a similar sidecar, but one that would subscribe directly to a gnatsd channel for updates, with the gateway service publishing to that special control channel. Would sidecars in your implementation create a watcher from the k8s client-go for the secrets, or would they get updates through some other mechanism?
Adding in @wallyqs who can shed more light on details.
@rusenask yes it could be done with a sidecar to update a shared secret with the NATS server and then trigger the reload in the servers. There is also some support for dynamic users with Kubernetes Service Accounts using a similar approach in the nats-operator using the service account bound token alpha feature.
In my use case I would want to have thousands of dynamic users/tokens with specified permissions. Ideally I wouldn't want to rely on any Kubernetes features even though I am running inside it :/ Would your proposed approach still work for such a use case?
I have checked that existing PR with the remote auth and from the thread it seems like CustomClientAuthentication might be the best option for me. I have already got code for token creation/authentication in my backend that I would have used anyway. I guess it's time to do some experiments : )
I think CustomClientAuthentication may be the way to go for your use case.
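As a rough illustration of the backend side of this, here is a small Go sketch of an in-memory token store that could sit behind a connect-time check. The names (`tokenStore`, `permissions`, `Check`) are hypothetical, and how the check is wired into CustomClientAuthentication is deliberately left out because that depends on the server version; treat this as a sketch of the lookup only.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// permissions is a hypothetical per-token permission set.
type permissions struct {
	Publish, Subscribe []string
}

// tokenStore maps SHA-256 digests of tokens to permissions, so raw tokens are
// not kept in memory and lookups stay O(1) with thousands of entries.
type tokenStore struct {
	mu     sync.RWMutex
	byHash map[[sha256.Size]byte]permissions
}

func newTokenStore() *tokenStore {
	return &tokenStore{byHash: make(map[[sha256.Size]byte]permissions)}
}

// Add registers or replaces a token at run time.
func (s *tokenStore) Add(token string, p permissions) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.byHash[sha256.Sum256([]byte(token))] = p
}

// Remove forgets a token; later connects using it are rejected.
func (s *tokenStore) Remove(token string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.byHash, sha256.Sum256([]byte(token)))
}

// Check is what a connect-time hook would call with the client's token.
func (s *tokenStore) Check(token string) (permissions, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	p, ok := s.byHash[sha256.Sum256([]byte(token))]
	return p, ok
}

func main() {
	store := newTokenStore()
	store.Add("device-token-1", permissions{
		Publish:   []string{"devices.1.>"},
		Subscribe: []string{"commands.1.>"},
	})

	if p, ok := store.Check("device-token-1"); ok {
		fmt.Println("allowed, publish:", p.Publish, "subscribe:", p.Subscribe)
	}

	store.Remove("device-token-1")
	_, ok := store.Check("device-token-1")
	fmt.Println("allowed after removal:", ok)
}
```

Note that removing a token here only affects new connects; already-connected clients would still need to be disconnected by some other means.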
I wanted to check in with this group since it has been a while since we launched NATS 2.0 with decentralized auth via JWTs etc. We also have account isolation for true multi-tenancy, but we also use accounts for system accounts to let nats-servers talk amongst themselves, provide analytics, stream events etc.
We have been chatting about authorization again in the context of NATS 2.0 and things we did right and use cases where we still could improve. Any input appreciated here, and happy to jump on a call if needed.
@derekcollison We would like to use OPA to manage policy. There is currently no way to interface with an external policy engine. I can hop on a call and detail our use case if it helps.
@ripienaar has been looking into that so will let him chime in here.
Reading through the comments here I think the NATS 2.0 work does indeed address the bulk of concerns here, if not I'd be keen to hear what areas need attention.
The remaining part is about different ways to express the authorization rules, and OPA is an option. I like OPA, and while internally we've only bounced this around over drinks, so to speak, we thought it might be worth embedding OPA policies into the JWT, so the account owner has the option to be much more flexible than the simple allow/deny rules, but of course also supporting the more traditional OPA agent that can run next to the NATS Server.
@colek42 I'd be interested in hearing more about how in your mind the OPA integration would look, it's something I've been keen to plumb in myself
@ripienaar OPA has a great blog post about doing this with Kafka, https://www.openpolicyagent.org/docs/latest/kafka-authorization/
We are using SPIFFE for node and workload attestation. (I have another issue to use the SVID as identity). We are using OPA/Envoy to manage policy across our restful services, we would like to use the same tool for our message based systems.
This issue is related for our use case. ref: https://github.com/nats-io/nats-server/issues/1325
OK, thanks
The problem with the OPA / Kafka model is that they call OPA on every operation. Benchmarks are hard to come by, but here is another OPA plugin for Kafka that does have some benchmarks... and wow, 170 operations/second with latencies measured in milliseconds? That includes a caching layer of what looks like 3600 seconds. This won't work for NATS.
Kafka does very different things from NATS, so that's not a NATS v Kafka thing and I am not comparing the tools. It's just a rate of request thing. We can't call a 3rd party service on every message.
Earlier in this thread there is a suggestion that we have something we call at login time, that'd be fine but that's not the OPA model - OPA is generally used to return boolean authorization outcomes.
OPA packages used inside a Go app without calling the daemon are quite fast, and we could speed things up a lot with some internal caching too, but I don't think we will realistically take on OPA as a compiled-in dependency.
Fast path per message checks to OPA make no sense. If we could query OPA and then have it push any changes in realtime to the servers, maybe even via NATS, that may be something to look into with their team.
That’s just not really how OPA works. It’s policy as code that is designed to return a Boolean: allow it or not.
So it can’t send us updates. It has to evaluate each time. In the Kafka examples they cache decisions for an hour, but even that is meh since people often do time-based policies.
Yes I know, but at Apcera we had a push based policy system. You essentially registered for interest in a certain policy (think of this as a way to encapsulate into a subject) and changes would be pushed in realtime. Was very scalable. We could potentially work with OPA on something like this. We discussed this with them at the time as well.
OPA can return documents so we could use it for this but would be a bit of a massive downgrade in what people typically do :(
Anyway will POC something up and see
Let's you and I brainstorm a bit and then come up with a plan. IMO low priority but could be interesting. Then we could schedule another call with the OPA team.
any news about it?
Still on our list but not super high priority within the ecosystem and our user/customer base.
We will get to it, but if it's super important and critical we can try to discuss how we may be able to prioritize it over other items.
Hello, I am new to NATS and find it very appealing for my application (real time 3D collaborative app over the web).
The only missing (but critical) feature is customizable authentication. As an example, we have users that require OAuth2 authentication (at login time only) and I do not see how to implement it within NATS (given that users may require a specific way to validate the OAuth token, it requires some kind of server-side scripting/component to implement it).
For now I use wamp/crossbar, which is really fine, but its static configuration of realms (the equivalent of NATS accounts) is going to be an issue.
@bbdb68 I have solved this problem with a custom-written NATS relay that parses the protocol and checks tokens in the payload.
You can customize permissions with NATS in operator mode using claim JWTs and private nkeys (or designate the JWT to be a bearer token).
Do you own the Oauth2 domain?
Requirements
{ user: 'optional', permissions: { publish: ['foo.*'], subscribe: ['foo.*', 'bar.*'] } }
2.2. The credentials must be checked during client CONNECT.
2.3. The external auth provider may return a Time-to-Live (TTL) for authz data.
2.4. If a TTL is returned, the server should respect the TTL and re-request authn for the user on any new message sent to or received from that user after TTL expiration.
2.5. The external auth provider must provide a means for failover (e.g. DNS round-robin, or multiple addresses in the configuration).
Plugin Interface Mockup
For discussion, here is a mockup of a plugin interface that passes a context around. This pushes locking responsibility into the plugin itself. It is not complete by any stretch of the imagination.
Related Issues
#428
#429
#369