SUSE / velum

Dashboard for CaaS Platform clusters (v1, v2 and v3)
https://www.suse.com/
Apache License 2.0
54 stars 30 forks source link

OIDC Connector support #665

Closed dannysauer closed 5 years ago

dannysauer commented 5 years ago

This set of changes implements external OIDC connector support within Velum. It's based upon the LDAP connector, and should incorporate the majority of the relevant lessons learned from that implementation.

vitoravelino commented 5 years ago

Does it make sense to have this Validate button if the validation happens for both Save and Validate submissions? The only difference is that when clicking on Save besides validating it also tries to save the data in the database. Why do we need this additional step if we are not doing anything special?

dannysauer commented 5 years ago

Does it make sense to have this Validate button if the validation happens for both Save and Validate submissions? The only difference is that when clicking on Save besides validating it also tries to save the data in the database. Why do we need this additional step if we are not doing anything special?

The button is primarily there for consistency with the LDAP provider. Personally, I kind of like that it gives the user some more explicit indication that we are going to ensure that their values make sense before saving. But it is technically redundant.

Long term, we discussed potentially adding an additional authentication step to validation, where we would connect to the OIDC Provider using the provided client information and attempt to do a real authentication with a real user. Right now, minimal validation only verifies that the issuer matches the provider URL. In that possible future state, we would want to be able to isolate the interactive authentication from the database validation. Keeping the validate button in place gives the user a consistent experience (normally, they won't know that "save" actually revalidates since they can't click the save button until data is valid) while making it easier to integrate this enhanced validation later.

dannysauer commented 5 years ago

What's the right place to record the desired change to the LDAP connector to match 048a640 (there are three hr+h4 instances)? Would that be an issue opened in Github?

vitoravelino commented 5 years ago

Long term, we discussed potentially adding an additional authentication step to validation, where we would connect to the OIDC Provider using the provided client information and attempt to do a real authentication with a real user. Right now, minimal validation only verifies that the issuer matches the provider URL. In that possible future state, we would want to be able to isolate the interactive authentication from the database validation. Keeping the validate button in place gives the user a consistent experience (normally, they won't know that "save" actually revalidates since they can't click the save button until data is valid) while making it easier to integrate this enhanced validation later.

If in the short term this will be just a redundancy, I'm afraid that this will be confusing more than helping. If nothing special is going on at the moment, my suggestion is to remove it and keep it simple. Whenever we add something more advanced that would require the need of a validate button, then we can add it back.

vitoravelino commented 5 years ago

What's the right place to record the desired change to the LDAP connector to match 048a640 (there are three hr+h4 instances)? Would that be an issue opened in Github?

A github issue is fine I guess. Just to keep up aware of that.

dannysauer commented 5 years ago

The validate button is a technical redundancy, to be sure, but it maintains user-visible workflow consistency with the LDAP interface already deployed. As such, I'm pretty strongly inclined to at least initially make new stuff look the same as existing.

I'd suggest that integrating validation into the save process from a UI perspective is probably better done in a future change which updates both interfaces simultaneously, rather than mixing new and old approaches in the same build. Maybe at the same time as those hrs are removed from LDAP. :)

vitoravelino commented 5 years ago

The validate button is a technical redundancy, to be sure, but it maintains user-visible workflow consistency with the LDAP interface already deployed. As such, I'm pretty strongly inclined to at least initially make new stuff look the same as existing.

The consistency you are talking about is both pages having a button that validates something, right? Because both buttons do different things. In the LDAP, it doesn't refresh the page and validates the connection based on the input fields. In the OIDC, it refreshes the whole page (you probably don't notice because it's localhost and your resolution is large and the form small), it validates the reachability of the OIDC provider based on a single field and also validates any data from the model that will be saved in the database.

Are you still sure that there's consistency between both pages?

If you really want to have a button to validate/check the provider reachability, I would move it next to the provider url input field and rename it to Check reachability. When the user clicks on it, the check would happen without refreshing the page just like the LDAP. Otherwise I would just remove it and rely on the Save button as it is. I think that makes much more sense but it's probably out of the scope for this PR

I'd suggest that integrating validation into the save process from a UI perspective is probably better done in a future change which updates both interfaces simultaneously, rather than mixing new and old approaches in the same build. Maybe at the same time as those hrs are removed from LDAP. :)

You mean "integrating validation into the save process" for the LDAP page, right? Because for OIDC it's already done and we should stick with that IMO.

dannysauer commented 5 years ago

The consistency you are talking about is both pages having a button that validates something, right?

Correct: I'm referring to consistency in user workflow. Fill out form, click validate, click save. Granted the OIDC page does reload, but I'm inclined to think the different buttons are more likely to cause a user to think "why isn't this tested like LDAP is?" while they're less likely to notice (or at least think much about) a reload.

Speaking of which, perhaps the validate button on OIDC should be changed to "Test Connection" to match LDAP. Whoops. :)

You mean "integrating validation into the save process" for the LDAP page, right? Because for OIDC it's already done and we should stick with that IMO.

Yes - I personally think validating server-side is the right decision in both cases. Honestly, moreso with LDAP, since there's no reason that the web client needs to reach the LDAP server to begin with; it's within reason that an LDAP auth server might be firewalled and not accessible by desktops. With OIDC, the browser will almost certainly also have to reach the provider at some point so the user can auth.

To that end, when that future PR which moves LDAP validation server-side happens, it would be trivial to both remove this one's enabled attribute from the save button & remove the validate button entirely at the same time, keeping both pieces visually congruent with minimal effort.

dannysauer commented 5 years ago

PR #647 seems to have introduced a CodeClimate failure, which slipped in here when I rebased on current master. So those commits (084b072 and the two following) have been un-merged from this branch to keep things clean.

vitoravelino commented 5 years ago

PR #647 seems to have introduced a CodeClimate failure, which slipped in here when I rebased on current master. So those commits (084b072 and the two following) have been un-merged from this branch to keep things clean.

IIRC rebasing won't show past commits unless the target branch doesn't have them. What I usually do to rebase is:

# supposing `upstream` is the name of the main repository
$ git fetch upstream
$ git rebase upstream/master
$ git push origin my-branch --force

Hope it helps! :wink:

dannysauer commented 5 years ago

IIRC rebasing won't show past commits unless the target branch doesn't have them. What I usually do to rebase is:

That's precisely what I did. :) The relevant commit was merged into master after this PR was opened, so they weren't originally in my branch. Since they merged (and were rebase -i drop'd) cleanly, I decided to leave them removed from this branch for now to prevent muddying the water re: whether the checks fail because of this changeset or something else. I figure I'll (re)rebase on master at the end when I squash the commits before merge.

vitoravelino commented 5 years ago

When I fill an invalid provider url I get duplicated error message

screenshot-20180928174538-246x156

dannysauer commented 5 years ago

The duplicated error message is because the model was using the http_url validator, and the OIDC connection test also validates that it's a http url first. So it was validating twice, causing the message to be added twice. I saw that in testing but thought I remembered fixing it. Guess not. :) Thanks.

Fixed (in 8e02aa8) by removing the explicit http_url validator on the provider URL's field in the model.

dannysauer commented 5 years ago

So, what remains to get this merged in? Just remove the branch references in the OIDC testing document and collapse all the commits?

vitoravelino commented 5 years ago

A validators folder exists, just realized that the validators created here are not there. Any special reason for that?

I'm still not agreeing with the unnecessary test/validate button workflow and the VCR line wrapping the DatabaseCleaner call.

dannysauer commented 5 years ago

A validators folder exists

Not sure about the validators; the few examples I looked at said that custom database validators go in the concerns folder, so I didn't ask questions. :) Looking at https://www.sitepoint.com/a-quick-study-of-the-rails-directory-structure/ and a few other similar sites, the controllers/concerns directory is well-known for this general sort of purpose (locating common code used by controllers). On the other hand, validators not only isn't something I can find documented outside of one now-deleted blog post, it's located away from the data controllers that use it. I guess the better question might be "why is there a validators directory?" :)

I don't have a huge problem with moving the files, as the name "concerns" doesn't make a lot of sense to me, but it'd be nice if there was some definitive document or similar guidance on the subject.

dannysauer commented 5 years ago

I'm still not agreeing with the unnecessary test/validate button workflow

I'm in agreement with ultimately changing this to only "validate on save", but I'd really rather see that in a separate PR where the LDAP connector is changed the same way, and making that LDAP change concurrently would be outside the scope of this PR.

dannysauer commented 5 years ago

and the VCR line wrapping the DatabaseCleaner call.

I started to change that, but the only way I can see to make that happen is to disable database validation on the factory by default. I decided not to use that solution, because anyone overriding any parameter in a factory call elsewhere would also need to be aware of the "not validated by default" nature. Normally, one would expect the database validation to happen when using a database record-generating factory, I think, so that would be a bad solution.

The other option that I thought of is to explicitly pick the dex_connector_oidc factory out of the list of factories and override how it's called in the cleaner definition. But that, to me, would be quite a bit uglier than the VCR wrapper. I'm very much open to discussing any other solution. I don't really like the way it is, but it's the best I've come up with so far.

vitoravelino commented 5 years ago

I'm in agreement with ultimately changing this to only "validate on save", but I'd really rather see that in a separate PR where the LDAP connector is changed the same way, and making that LDAP change concurrently would be outside the scope of this PR.

So, if I understood correct, you prefer something that is not better just because of something that is already there? It's just a matter of removing stuff at this point, things will become even simpler.

And because of that I'm still in favor of delivering a better solution right away and avoid adding more technical debt.

vitoravelino commented 5 years ago

Not sure about the validators; the few examples I looked at said that custom database validators go in the concerns folder, so I didn't ask questions. :) Looking at https://www.sitepoint.com/a-quick-study-of-the-rails-directory-structure/ and a few other similar sites, the controllers/concerns directory is well-known for this general sort of purpose (locating common code used by controllers). On the other hand, validators not only isn't something I can find documented outside of one now-deleted blog post, it's located away from the data controllers that use it. I guess the better question might be "why is there a validators directory?" :)

Because we want to group all the validators in one place. There are other approaches of grouping but the one we've been using is based on functionality.

I don't have a huge problem with moving the files, as the name "concerns" doesn't make a lot of sense to me, but it'd be nice if there was some definitive document or similar guidance on the subject.

Explanation below extracted from [0]:

All subdirectories of app in the application and engines present at boot time. For example, app/controllers. They do not need to be the default ones, any custom directories like app/workers belong automatically to autoload_paths.

[0] https://guides.rubyonrails.org/autoloading_and_reloading_constants.html#autoload-paths-and-eager-load-paths

dannysauer commented 5 years ago

They do not need to be the default ones, any custom directories

Ok, that information is what I was looking for. Thanks. :) Moved to validators in 628f0b9

dannysauer commented 5 years ago

I'm in agreement with ultimately changing this to only "validate on save", but I'd really rather see that in a separate PR where the LDAP connector is changed the same way, and making that LDAP change concurrently would be outside the scope of this PR.

So, if I understood correct, you prefer something that is not better just because of something that is already there? It's just a matter of removing stuff at this point, things will become even simpler.

In the case of UI/UX, maybe. :) I'm not sure that having the separate validation button is actually not better. There's no disputing that it's technically redundant, but the end user doesn't see that. If the button is simply removed, then an end user who's seen the LDAP connector may wonder if the OIDC connector is also validated. The only way that validation will be communicated is if they enter a bad value and see the failure on the form. If they enter good information and hit save -- which is hopefully the common case -- they can't be sure. Including an explicit, visible validation step removes that doubt.

An alternative which addresses the same concern might be adding a note either in the documentation or on the form itself indicating that the connector will be tested before being saved, and errors will be flagged.

To be fair, I really wouldn't be sad if it's removed nor happy if it's kept; it's minimal work either way. It seems like the main sticking point here is whether or not the redundant button is justified by a desire for consistency. Do we have a user interface expert or team who has a more vested interest in the user experience, and who can contribute how important it actually is to maintain that visual consistency? I think that relative weighting between technical simplicity and user interface is what this discussion is missing.

vitoravelino commented 5 years ago

I started to change that, but the only way I can see to make that happen is to disable database validation on the factory by default. I decided not to use that solution, because anyone overriding any parameter in a factory call elsewhere would also need to be aware of the "not validated by default" nature. Normally, one would expect the database validation to happen when using a database record-generating factory, I think, so that would be a bad solution.

Indeed that wouldn't be a good solution.

My guts says that having a validation that reaches an external service in a model is not good. And what makes me think that is the dependency created whenever the model is used. Disabling the validation on the factory during the creation improved a lot that problem but was not enough IMO.

The LDAP page will need to also solve this problem when we move to the "validate on save" approach. Maybe we could address this issue when LDAP "validate on save" is done, as long as the solution is different from the one proposed in this PR unless we find an alternative in the meantime.

dannysauer commented 5 years ago

My guts says that having a validation that reaches an external service in a model is not good. And what makes me think that is the dependency created whenever the model is used.

In this particular case, it's the database structure and the larger system which I think justifies it. As-is, any value in the database gets provided to Dex during orchestration, and Dex verifies by actually connecting on startup. If the issuer doesn't match what the external OIDC Provider returns (or if the OP isn't reachable), Dex fails to start, and orchestration fails. It's been suggested that we could alter the database structure to add a "validated" field which indicates whether or not a connector has been verified (or maybe when it was validated) and is thus safe to deploy. At that point, it would be safe to reduce the validation to only format checks, and the "second-stage validation" could make the external calls.

If this second-stage validation was initiated when orchestration happened, all connectors could be revalidated each time. That would also help ensure that Dex didn't fail to restart if something changes between initial validation and a later reorchestration. While I don't love the idea of potentially "bad" data in the database, I do like the potential increase in robustness that would provide.

Moving the candidate connectors to a separate table and replacing the current tables with a read-only view showing only "validated" data would allow this to be mostly transparent to the data consumers, and would allow the second-stage validation to happen entirely in Velum. I think.

The LDAP page will need to also solve this problem when we move to the "validate on save" approach. Maybe we could address this issue when LDAP "validate on save" is done, as long as the solution is different from the one proposed in this PR unless we find an alternative in the meantime.

Yes, it would make sense to combine the back-end validation rework with the front-end interface appearance, since they're fairly interdependent changes.

dannysauer commented 5 years ago

Aside from:

Are we in good shape now?

dannysauer commented 5 years ago

Rebased on current master yesterday.

Walked through the full config again. Build and auto-bootstrap using KVM build 454, then follow the OIDC testing instructions to set up a test dex and log in with that OIDC provider by visiting /oidc on the admin node. This results in the kubeconfig download page. I did not use kubectl with that configuration file, but the contents look reasonable.

suseclee commented 5 years ago

I verified this velum code with salt (https://github.com/dannysauer/salt.git) and updated master automation repo (11/5). It worked as it is expected.

After I created ,surely using kubconfig from admin at first ,clusterrolebinding such as "kubectl create clusterrolebinding oidc-test --namespace=default --clusterrole=edit --user=", I could use kubectl from the oidc user for the default namespace

Great work!!! David and Danny.

Everything looks good to me.