OSC / ondemand

Supercomputing. Seamlessly. Open, Interactive HPC Via the Web
https://openondemand.org/
MIT License

Default auth config in 1.8 causes Apache 500 error #649

Closed: whorka closed this issue 4 years ago

whorka commented 4 years ago

Hi,

I noticed that the default authentication configuration in 1.8 was changed from Basic to "Dex" (openid-connect) in https://github.com/OSC/ondemand/commit/5915cbbb8659cb188da80cb5d54551a8ca9944f7 (and the corresponding config file comments were updated in https://github.com/OSC/ondemand/commit/a698fac0bf4bcc984348aa77814f913bc2ec6a31).

As a result, the portal now throws an HTTP 500 error in the default configuration, and logs:

[Mon Aug 17 15:00:31.380929 2020] [authn_core:error] [pid 16] [client 172.20.0.1:42288] AH01796: AuthType openid-connect configured without corresponding module

For example, when using the OOD Docker image source, which installs the OOD packages from the ondemand-release-web-latest repo (currently 1.8.11):

docker system prune -a
docker volume prune
cd ~/projects # or wherever
git clone https://github.com/OSC/ood-images.git ood-images-official
cd ood-images-official/docker-with-slurm
docker-compose up -d --build

Going to http://localhost:8080 shows "Internal Server Error".

I was able to reproduce this error using the Open OnDemand 1.8.11 packages from the "latest" yum repo, as well as earlier 1.8 packages. Only downgrading to 1.7 provided a working default configuration.

Also, of secondary concern, the update in https://github.com/OSC/ondemand/commit/a698fac0bf4bcc984348aa77814f913bc2ec6a31 was incomplete. The comment above the modified line still says that the default auth type is Basic.

msquee commented 4 years ago

@whorka Enter the ondemand Docker container and run systemctl start ondemand-dex

whorka commented 4 years ago

Hello @msquee ,

Thank you for the quick reply.

Unfortunately, the Docker container is missing sufficient privileges to complete the systemctl start ondemand-dex operation, and also appears to be missing the ondemand-dex service completely:

$ docker-compose exec ood /bin/bash 
[root@ood /]# systemctl start ondemand-dex
Failed to get D-Bus connection: Operation not permitted
[root@ood /]# systemctl list-unit-files |fgrep ondemand
[root@ood /]# exit

I tried installing it and starting it manually, but there was no change.

[root@ood /]# yum -y install ondemand-dex
[root@ood /]# systemctl list-unit-files |fgrep ondemand
ondemand-dex.service                   disabled
[root@ood /]# systemctl enable ondemand-dex
Created symlink /etc/systemd/system/multi-user.target.wants/ondemand-dex.service, pointing to /usr/lib/systemd/system/ondemand-dex.service.
[root@ood /]# systemctl list-unit-files |fgrep ondemand
ondemand-dex.service                   enabled 
[root@ood /]# systemctl start ondemand-dex
Failed to get D-Bus connection: Operation not permitted
[root@ood /]# fgrep ExecStart /usr/lib/systemd/system/ondemand-dex.service
ExecStart=/usr/sbin/ondemand-dex serve /etc/ood/dex/config.yaml
[root@ood /]# /usr/sbin/ondemand-dex serve /etc/ood/dex/config.yaml &
[1] 6992
[root@ood /]# time="2020-08-17T16:38:44Z" level=info msg="config issuer: http://ood:5556"
time="2020-08-17T16:38:44Z" level=info msg="config storage: sqlite3"
time="2020-08-17T16:38:44Z" level=info msg="config static client: OnDemand"
time="2020-08-17T16:38:44Z" level=info msg="config connector: local passwords enabled"
time="2020-08-17T16:38:44Z" level=info msg="config skipping approval screen"
time="2020-08-17T16:38:44Z" level=info msg="listening (http/telemetry) on 0.0.0.0:5558"
time="2020-08-17T16:38:44Z" level=info msg="listening (http) on 0.0.0.0:5556"
[root@ood /]# ps -ef |fgrep dex
root      6992  6972  0 16:38 pts/0    00:00:00 /usr/sbin/ondemand-dex serve /etc/ood/dex/config.yaml

I also did a kill -HUP on the parent httpd process. Now the error is the same (HTTP 500 Internal Server Error), but the log message is different:

[Mon Aug 17 16:42:46.696567 2020] [auth_openidc:error] [pid 7098] [client 172.21.0.1:51988] oidc_authenticate_user: the URL hostname (ood) of the configured OIDCRedirectURI does not match the URL hostname of the URL being accessed (localhost): the "state" and "session" cookies will not be shared between the two!

What else would you advise?

msquee commented 4 years ago

@whorka Can you run cat /var/log/httpd24/localhost_error.log and paste the output?

whorka commented 4 years ago

https://gist.github.com/whorka/1ed39240a939e0f6f4c9a9206f8d434a

msquee commented 4 years ago

[Mon Aug 17 16:38:39.808057 2020] [auth_openidc:error] [pid 6903] [client 172.21.0.1:51944] oidc_util_http_call: curl_easy_perform() failed on: http://ood:5556/.well-known/openid-configuration (Failed connect to ood:5556; Connection refused)
[Mon Aug 17 16:38:39.808161 2020] [auth_openidc:error] [pid 6903] [client 172.21.0.1:51944] oidc_provider_static_config: could not retrieve metadata from url: http://ood:5556/.well-known/openid-configuration

Here is the error: ondemand-dex wasn't started.
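
The "Connection refused" lines above come from Apache failing to fetch Dex's discovery document. A minimal sketch of the same check from a shell inside the container follows. The helper names `discovery_url` and `dex_reachable` are hypothetical, and the issuer `http://ood:5556` is simply the one that appears in the log:

```shell
# Build the OIDC discovery URL for an issuer (hypothetical helper).
discovery_url() {
  # Strip one trailing slash so the result never contains "//.well-known".
  printf '%s/.well-known/openid-configuration\n' "${1%/}"
}

# Return 0 if the issuer's discovery document can be fetched, non-zero otherwise.
dex_reachable() {
  curl -fsS --max-time 5 -o /dev/null "$(discovery_url "$1")"
}

# Example, run inside the container:
#   dex_reachable http://ood:5556 && echo "Dex is up" || echo "Dex is not reachable"
```

If `dex_reachable` fails with a connection error, mod_auth_openidc will most likely fail the same way when it tries to load the provider metadata.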

I've hit the "Failed to get D-Bus connection: Operation not permitted" error with Docker too. I'll dig around for a solution and reply back.

treydock commented 4 years ago

Use the dex branch of ood-images: https://github.com/OSC/ood-images/tree/dex. It has the fixes for Dex, and I just pushed a fix so that Docker starts the necessary services. You can't use systemd in Docker unless PID 1 is systemd's init, so you have to run the service's actual start command instead of going through systemctl.
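
As a rough sketch of that point (not the actual fix in the dex branch), the choice of start command can be keyed on the name of PID 1. The helper `dex_start_command` is hypothetical; the direct command is the `ExecStart` line from the unit file quoted earlier in this thread:

```shell
# Print the command that should start Dex, given the name of PID 1.
# Hypothetical helper; the direct command is copied from the unit's ExecStart line.
dex_start_command() {
  case "$1" in
    systemd) echo "systemctl start ondemand-dex" ;;
    *)       echo "/usr/sbin/ondemand-dex serve /etc/ood/dex/config.yaml" ;;
  esac
}

# Example: read PID 1's name from /proc and show the matching command.
#   dex_start_command "$(cat /proc/1/comm)"
```

In a typical container PID 1 is the entrypoint process rather than systemd, so the second branch applies and Dex has to be launched directly.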

treydock commented 4 years ago

When using Docker, the hostname used outside the container has to match the hostname inside the container. You therefore have to set servername to localhost in ood_portal.yml for the OIDC redirects to work. Making it work without localhost would require many overrides.
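
The mismatch that mod_auth_openidc complained about earlier (redirect URI host `ood` versus accessed host `localhost`) can be illustrated with a small comparison. This is only a sketch: `url_host` and `same_host` are hypothetical helpers, the URLs in the example are illustrative values, and the parsing ignores userinfo and IPv6 literals:

```shell
# Extract the host part of an http(s) URL (no userinfo or IPv6 handling).
url_host() {
  rest="${1#*://}"             # drop the scheme
  rest="${rest%%/*}"           # drop the path
  printf '%s\n' "${rest%%:*}"  # drop the port
}

# Return 0 if both URLs share the same host name.
same_host() {
  [ "$(url_host "$1")" = "$(url_host "$2")" ]
}

# Example with illustrative URLs:
#   same_host http://ood/oidc http://localhost:8080/ || echo "cookies will not be shared"
```

With servername set to localhost, the redirect URI and the URL typed into the browser share a host, so the state and session cookies are sent back on the redirect.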

whorka commented 4 years ago

Thanks for the tip about the dex branch of ood-images, @treydock ! I am able to get to the login page using that branch, but I am still getting an HTTP 500 error after I enter the login credentials documented in the updated README.md. The error messages in /var/log/httpd24/localhost_error.log are:

[Mon Aug 17 18:34:16.449766 2020] [auth_openidc:warn] [pid 15] oidc_check_config_openid_openidc: the URL scheme (http) of the configured OIDCProviderMetadataURL SHOULD be "https" for security reasons!
[Mon Aug 17 18:34:16.449788 2020] [auth_openidc:warn] [pid 15] oidc_check_config_openid_openidc: the URL scheme (http) of the configured OIDCRedirectURI SHOULD be "https" for security reasons (moreover: some Providers may reject non-HTTPS URLs)
[Mon Aug 17 18:37:00.369212 2020] [auth_openidc:error] [pid 37] [client 172.22.0.1:58190] oidc_restore_proto_state: no "mod_auth_openidc_state_g0PdIli84RlUGZNPc6f8j3u_jNE" state cookie found, referer: http://localhost:5556/
[Mon Aug 17 18:37:00.369331 2020] [auth_openidc:error] [pid 37] [client 172.22.0.1:58190] oidc_unsolicited_proto_state: could not parse JWT from state: invalid unsolicited response: [src/jose.c:755: oidc_jwt_parse]: cjose_jws_import failed: invalid argument [file: jws.c, function: cjose_jws_import, line: 781], referer: http://localhost:5556/
[Mon Aug 17 18:37:00.369352 2020] [auth_openidc:error] [pid 37] [client 172.22.0.1:58190] oidc_authorization_response_match_state: unable to restore state, referer: http://localhost:5556/
[Mon Aug 17 18:37:00.369366 2020] [auth_openidc:error] [pid 37] [client 172.22.0.1:58190] oidc_handle_authorization_response: invalid authorization response state and no default SSO URL is set, sending an error..., referer: http://localhost:5556/

treydock commented 4 years ago

Are you using Chrome? Try a different browser. The error you're getting is because there is a state cookie in your cache that is invalid and causing problems. I normally use Chrome for my day-to-day and when I saw this error I just switched to Firefox for this particular access and things worked fine. I've not yet had a chance to investigate if this is an issue with Chrome or just an issue with conflicting cache/cookie data that is fixed by clearing all cookie/cache data.

treydock commented 4 years ago

Also, the password for the ood@localhost user is actually going to be "password". I need to update the README for Docker.

whorka commented 4 years ago

Login works in Firefox and Safari, but it is persistently broken in Chromium, even after clearing cookies/cache and in Incognito sessions.

Do you want me to open a new issue for this? I realize it has strayed a bit from the original topic of the default auth config in 1.8.

treydock commented 4 years ago

A new issue is fine. I can reproduce this on my end and am experimenting with solutions. We've deployed the Dex auth backed by OSC LDAP to a VM at OSC and do not have issues with Chrome. It seems that something about localhost, Docker, or the configs specific to ood-images is causing the problems with Chrome.

whorka commented 4 years ago

I opened https://github.com/OSC/ood-images/issues/18 to address the login bug in Chrome when running OOD in Docker.

Regarding this current issue: I initially thought the problem was in the update_ood_portal script due to the misleading comment in ood_portal_example.yml. Could this please be updated so that it correctly states that the default authentication type is "openid-connect" and not "basic"?

Thank you for helping to get the Docker image working again!

treydock commented 4 years ago

Comment updated in #650. If that doesn't help, let me know.

whorka commented 4 years ago

LGTM. Thanks!