Open jeffgdotorg opened 6 months ago
Add a configuration flag which, when set, completely disables all elements of the default form-based authentication flow. Specifics of proposed behavior:
I am all for it, however a couple of points:
When flag is enabled, we never render the form-based login elements in the web app
We likely need some management commands to:
- When flag is enabled, we remove the route to the
/login/
POST target in the web app (I'm assuming it's not needed for any OIDC flows; if that's incorrect I'll revise)
You would still need this as part of the SAML2/OIDC flow, unless you also use this flag to modify the login button behaviour and force the login to instead redirect you to your OIDC connector directly when there is an authentication "exception" (not logged in, etc)
I do think this is an excellent idea, and it might be better to also wrap it with a way to add more customization to the login as a whole (collapse the form login, allow a corp logo, etc. This would need to be a separate FR of course.)
It would also be great to add a consent banner as well. Like “by logging in you agree to the company policies…”
Add a configuration flag which, when set, completely disables all elements of the default form-based authentication flow.
This has been proposed and rejected in the past for the simple reason that it makes NetBox authentication completely dependent on an external service. An administrator must still have some way to authenticate to NetBox should the external authentication service become unavailable. (This is the same reason network operators configure a backup local account on infrastructure equipment even though centralized AAA is typically in use.)
For this to be accepted in the core product, it must include some mechanism to ensure local authentication is still an option when the external service is not available, at least for administrators. It's not sufficient to simply wrap this with a configuration flag: If you can't authenticate to NetBox because the remote service is unavailable, altering the configuration would require logging into the local shell, modifying the configuration file by hand, and restarting the relevant services.
An administrator must still have some way to authenticate to NetBox should the external authentication service become unavailable.
My view on this is going to be slightly different. The app is down, and requires manual intervention. I don't think it is a huge burden if they need to enter the app to fix authentication to be forced to manually enable it.
This is the same reason network operators configure a backup local account on infrastructure equipment even though centralized AAA is typically in use
While I still do this for certain systems, non-client facing systems that we run actually don't have this. Some of Cisco's gear offers alternative mechanisms for authentication. Those systems fail to the local auth, not have local auth available 100% of the time.
For this to be accepted in the core product, it must include some mechanism to ensure local authentication is still an option when the external service is not available, at least for administrators. It's not sufficient to simply wrap this with a configuration flag: If you can't authenticate to NetBox because the remote service is unavailable, altering the configuration would require logging into the local shell, modifying the configuration file by hand, and restarting the relevant services.
All remote auth is handled in the config file now anyways, so it requires this if everything is down to begin with.
A thought though, and this would require a bit of a re-architecture of the authentication system but what if we did this:
While this requires manual intervention on the part of a system administrator if stuff goes down but it is less burdensome then restarting the app. Also allows for dynamic configuration of the authentication.
This allows:
My view on this is going to be slightly different. The app is down, and requires manual intervention. I don't think it is a huge burden if they need to enter the app to fix authentication to be forced to manually enable it.
Having admin privileges to the NetBox application does not necessarily imply having access to the underlying operating system; modifying the local configuration may require collaboration with another party.
I'll restate my take above as a simple assertion: It must always be possible for a NetBox administrator to access the application, even if a remote authentication service is unreachable.
While I still do this for certain systems, non-client facing systems that we run actually don't have this. Some of Cisco's gear offers alternative mechanisms for authentication. Those systems fail to the local auth, not have local auth available 100% of the time.
That would be an acceptable approach, however the proposal above does not allow for that in its current form. (It would explicitly disable the local login form entirely.)
All remote auth is handled in the config file now anyways, so it requires this if everything is down to begin with.
I don't follow. In the event remote authentication becomes unavailable, local authentication should work automatically.
Add a configuration flag which, when set, completely disables all elements of the default form-based authentication flow.
This has been proposed and rejected in the past for the simple reason that it makes NetBox authentication completely dependent on an external service. An administrator must still have some way to authenticate to NetBox should the external authentication service become unavailable. (This is the same reason network operators configure a backup local account on infrastructure equipment even though centralized AAA is typically in use.)
I disagree with this approach and it doesn't align with other services operators are using. There is an expectation that auth fails if an external auth provider is unavailable - if configured to do so.
There is an expectation that auth fails if an external auth provider is unavailable - if configured to do so.
That may be true for some portion of the user base, but there is likewise an expectation from others (myself included) that the failure of a remote authentication service should not totally preclude the use of the application. And when dealing with two competing use cases, we opt for the least restrictive option. Those who wish to further restrict the availability of the system - and in turn fully accept the liabilities of doing so - are free to modify the application code as they see fit.
That would be an acceptable approach, however the proposal above does not allow for that in its current form. (It would explicitly disable the local login form entirely.)
There generally isn't an easy way to health check an IdP and if there is there may be other security policies preventing it (no outbound access)
That may be true for some portion of the user base, but there is likewise an expectation from others (myself included) that the failure of a remote authentication service should not totally preclude the use of the application.
In my day-to, I work pretty closely with our app team. We do have some systems where our IdP being down means the app is 100% un-usable until it is brought back on by our App administrators.
And when dealing with two competing use cases, we opt for the least restrictive option. Those who wish to further restrict the availability of the system - and in turn fully accept the liabilities of doing so - are free to modify the application code as they see fit.
I think this would apply if it was not a tuneable option, but given that it is a tuneable option I don't think it is unreasonable for there to be a big warning wrapped around this saying "Auth will not work without system adminsitrator level intervention if you turn this on". Anyone turning it on understands and accepts the liability associated with it.
As mentioned, a reasonable compromise might be to move some of the auth configuration to the database and add some management commands to enable them, reducing the need for a restart of the service and manual editing.
Your remote auth being broke generally means you need to edit the config anyways, so you are already engaging support of some kind and as such they could temporarily turn on local auth while troubleshooting. But this may mean they also have to provision user accounts for normal users.
There generally isn't an easy way to health check an IdP and if there is there may be other security policies preventing it (no outbound access)
We would simply need to detect & log failed requests (failures where no response is received, not rejected authentications).
I think this would apply if it was not a tuneable option, but given that it is a tuneable option I don't think it is unreasonable for there to be a big warning wrapped around this saying "Auth will not work without system adminsitrator level intervention if you turn this on". Anyone turning it on understands and accepts the liability associated with it.
A warning is not sufficient. As I've said, I'm not going to accept the implementation of any configuration option which would render NetBox entirely dependent on the accessibility of a third-party system. Nor is there any reason to: An administrator who wishes to disable local authentication can do so today by disabling or deleting all local accounts.
You could even go so far as to say "hide the local login form if there are no active accounts." But outright disabling it asking for trouble and I'm not going to accept that burden of support as a maintainer.
After discussing this FR with Jeremy and other stakeholders, I've edited the issue body to describe a compromise solution that Jeremy identified.
Definitely like the new proposed solution.
POST, PUT, and PATCH requests to /api/users/users result in a 400-series response which, in the GET case, incorporates Warning Copy
Where are groups/permissions added/removed in the API? Is it against this API endpoint or the group/permission one? I feel like we might need to leave part of this open in order to assign permissions but I haven't looked into it deeply.
The new proposed solution did not suit all stakeholders, so we'll need to reorient. I'm leaving this issue open in the meantime.
I like this proposal. When our IdP goes down, most services grind to a halt and that's a good time to go get a coffee. If I need to get Netbox up in an emergency, I can login to the shell and create local users.
As this would be an optional feature, Netbox admins who don't have access to the OS can leave it disabled.
NetBox version
v3.7.6
Feature type
Change to existing functionality
Proposed functionality
Alter the web app so that having no local users with a password configured completely disables all elements of the default form-based authentication flow. Specifics of proposed behavior:
Base behavior
/admin/
in NetBox 3.7)/login/
target in the web app either return a 400-series error (if no session is present) or are redirected with a 300-series response (if a session is present)/admin/login/
target in the web app either return a 400-series error (if no session is present) or are redirected with a 300-series response (if a session is present). Although the Django admin UI is disabled by default in 4.0, it can be re-enabled and we must account for this fact./users/users/add
result in a 400-series response which, in the GET case, incorporates Warning Copy/api/users/users
result in a 400-series response which, in the GET case, incorporates Warning CopyWarning provisions
Documentation coverage
Use case
To clear security compliance audits commonly required in key verticals (e.g. SOC2 and ISO-27001), some NetBox users must use OIDC or SAMLv2 exclusively to authenticate users, and must disable form-based authentication completely. Deleting all local users is not sufficient for some of these use cases; removing the POST targets used by form-based login is key in those environments as auditing tools may consider its presence as an unacceptable risk in a layered defense posture.
Database changes
None
External dependencies
None