ory / hydra

The most scalable and customizable OpenID Certified™ OpenID Connect and OAuth Provider on the market. Become an OpenID Connect and OAuth2 Provider over night. Broad support for related RFCs. Written in Go, cloud native, headless, API-first. Available as a service on Ory Network and for self-hosters.
https://www.ory.sh/?utm_source=github&utm_medium=banner&utm_campaign=hydra
Apache License 2.0
15.59k stars 1.5k forks source link

sql: Scan error on column index 13, name \"login_challenge\": unsupported Scan, storing driver.Value type <nil> into type *string #1240

Closed es-lab closed 5 years ago

es-lab commented 5 years ago

Describe the bug OAuth flow breaks after a while. Currently unknown how to reproduce. But for some reason it breaks for a user & client specifically. What happens visually is:

To Reproduce Still do not know.

Expected behavior To not break.

Screenshots Logs of hydra of a complete OAuth flow:

time="2018-12-15T10:14:48Z" level=info msg="started handling request" method=GET remote=remote_ip_here request="/oauth2/auth?client_id=client_id_here&scope=offline&response_type=code&redirect_uri=https://client_host_here/oauth/redirect&state=23b00222-ef52-40df-aa07-4dc19330ebe0&ui_locales=en"

time="2018-12-15T10:14:48Z" level=info msg="completed handling request" measure#https://hydra_host_here/.latency=5613845 method=GET remote=remote_ip_here request="/oauth2/auth?client_id=client_id_here&scope=offline&response_type=code&redirect_uri=https://client_host_here/oauth/redirect&state=23b00222-ef52-40df-aa07-4dc19330ebe0&ui_locales=en" 
status=302 text_status=Found took=5.613845ms

time="2018-12-15T10:14:48Z" level=info msg="started handling request" method=GET remote="172.18.0.13:43830" request=/oauth2/auth/requests/login/cf015261732444469bc3670643235896

time="2018-12-15T10:14:48Z" level=info msg="completed handling request" measure#https://hydra_host_here/.latency=4867655 method=GET remote="172.18.0.13:43830" request=/oauth2/auth/requests/login/cf015261732444469bc3670643235896 status=200 text_status=OK took=4.867655ms

time="2018-12-15T10:15:03Z" level=info msg="started handling request" method=GET remote="172.18.0.13:43830" request=/oauth2/auth/requests/login/cf015261732444469bc3670643235896

time="2018-12-15T10:15:03Z" level=info msg="completed handling request" measure#https://hydra_host_here/.latency=4808050 method=GET remote="172.18.0.13:43830" request=/oauth2/auth/requests/login/cf015261732444469bc3670643235896 status=200 text_status=OK took=4.80805ms

time="2018-12-15T10:15:03Z" level=info msg="started handling request" method=PUT remote="172.18.0.13:43830" request=/oauth2/auth/requests/login/cf015261732444469bc3670643235896/accept

time="2018-12-15T10:15:03Z" level=info msg="completed handling request" measure#https://hydra_host_here/.latency=6392958 method=PUT remote="172.18.0.13:43830" request=/oauth2/auth/requests/login/cf015261732444469bc3670643235896/accept status=200 text_status=OK took=6.392958ms

time="2018-12-15T10:15:03Z" level=info msg="started handling request" method=GET remote=remote_ip_here request="/oauth2/auth?client_id=client_id_here&login_verifier=797c33b6f64b452facead47aca386127&redirect_uri=https%3A%2F%2Fclient_host_here%2Foauth%2Fredirect&response_type=code&scope=offline&state=23b00222-ef52-40df-aa07-4dc19330ebe0&ui_locales=en"

time="2018-12-15T10:15:03Z" level=error msg="An error occurred" error="sql: Scan error on column index 13, name \"login_challenge\": unsupported Scan, storing driver.Value type <nil> into type *string"

time="2018-12-15T10:15:03Z" level=info msg="completed handling request" measure#https://hydra_host_here/.latency=14863933 method=GET remote=remote_ip_here request="/oauth2/auth?client_id=client_id_here&login_verifier=797c33b6f64b452facead47aca386127&redirect_uri=https%3A%2F%2Fclient_host_here%2Foauth%2Fredirect&response_type=code&scope=offline&state=23b00222-ef52-40df-aa07-4dc19330ebe0&ui_locales=en" status=302 text_status=Found took=14.863933ms

Version:

Additional context

Thought that the problem might be that I did not handled correctly 409 conflicts of login challenges, but even after proper logging and errors handling I still get the issue.

Tried to delete the data from tables with tokens and all the others excluding clients and migrations tables, everything started working properly, and then again it did break for some user/client. For the same user but other clients it still works.

To mention that the clients are first party apps that do not request audiences and scopes (it does request offline scope though), the consent logic authorizes the client with all its pre-defined scopes and audiences. May this be the problem? Do I need to allow to the client just its requested scopes and audiences?

Client details

{
    "audience": [
        "some_audience_here"
    ],
    "client_id": "client_id_here",
    "grant_types": [
        "authorization_code",
        "refresh_token"
    ],
    "jwks": {},
    "redirect_uris": [
        "https://client_host_here/oauth/redirect",
        "http://another_client_host_here/oauth/redirect"
    ],
    "response_types": [
        "code"
    ],
    "scope": "offline oauth_auto_consent",
    "subject_type": "public",
    "token_endpoint_auth_method": "none",
    "userinfo_signed_response_alg": "none"
}

oauth_auto_consent is used by the consent provider to check if it should skip consent and allow automatically for some clients. In this case it allows automatically for this clients. But the same happens for other clients and it works perfectly.

THIS NEEDS A SOLUTION ASAP

aeneasr commented 5 years ago

Did you upgrade recently? Maybe missed SQL migrations?

On 16. Dec 2018, at 23:22, es-lab notifications@github.com wrote:

Describe the bug OAuth flow breaks after a while. Currently unknown how to reproduce. But for some reason it breaks for a user & client specifically. What happens visually is:

the client requests authorization hydra redirects on login provider login provider gets skip true and accepts login request and redirect on redirectTo value received from hydra on accessing the link to hydra it gets the error *sql: Scan error on column index 13, name "login_challenge": unsupported Scan, storing driver.Value type into type string and redirects to client with error The error is unrecognizable To Reproduce Still do not know.

Expected behavior To not break.

Screenshots Logs of hydra of a complete OAuth flow:

time="2018-12-15T10:14:48Z" level=info msg="started handling request" method=GET remote=remote_ip_here request="/oauth2/auth?client_id=client_id_here&scope=offline&response_type=code&redirect_uri=https://client_host_here/oauth/redirect&state=23b00222-ef52-40df-aa07-4dc19330ebe0&ui_locales=en"

time="2018-12-15T10:14:48Z" level=info msg="completed handling request" measure#https://hydra_host_here/.latency=5613845 method=GET remote=remote_ip_here request="/oauth2/auth?client_id=client_id_here&scope=offline&response_type=code&redirect_uri=https://client_host_here/oauth/redirect&state=23b00222-ef52-40df-aa07-4dc19330ebe0&ui_locales=en" status=302 text_status=Found took=5.613845ms

time="2018-12-15T10:14:48Z" level=info msg="started handling request" method=GET remote="172.18.0.13:43830" request=/oauth2/auth/requests/login/cf015261732444469bc3670643235896

time="2018-12-15T10:14:48Z" level=info msg="completed handling request" measure#https://hydra_host_here/.latency=4867655 method=GET remote="172.18.0.13:43830" request=/oauth2/auth/requests/login/cf015261732444469bc3670643235896 status=200 text_status=OK took=4.867655ms

time="2018-12-15T10:15:03Z" level=info msg="started handling request" method=GET remote="172.18.0.13:43830" request=/oauth2/auth/requests/login/cf015261732444469bc3670643235896

time="2018-12-15T10:15:03Z" level=info msg="completed handling request" measure#https://hydra_host_here/.latency=4808050 method=GET remote="172.18.0.13:43830" request=/oauth2/auth/requests/login/cf015261732444469bc3670643235896 status=200 text_status=OK took=4.80805ms

time="2018-12-15T10:15:03Z" level=info msg="started handling request" method=PUT remote="172.18.0.13:43830" request=/oauth2/auth/requests/login/cf015261732444469bc3670643235896/accept

time="2018-12-15T10:15:03Z" level=info msg="completed handling request" measure#https://hydra_host_here/.latency=6392958 method=PUT remote="172.18.0.13:43830" request=/oauth2/auth/requests/login/cf015261732444469bc3670643235896/accept status=200 text_status=OK took=6.392958ms

time="2018-12-15T10:15:03Z" level=info msg="started handling request" method=GET remote=remote_ip_here request="/oauth2/auth?client_id=client_id_here&login_verifier=797c33b6f64b452facead47aca386127&redirect_uri=https%3A%2F%2Fclient_host_here%2Foauth%2Fredirect&response_type=code&scope=offline&state=23b00222-ef52-40df-aa07-4dc19330ebe0&ui_locales=en"

time="2018-12-15T10:15:03Z" level=error msg="An error occurred" error="sql: Scan error on column index 13, name \"login_challenge\": unsupported Scan, storing driver.Value type into type *string"

time="2018-12-15T10:15:03Z" level=info msg="completed handling request" measure#https://hydra_host_here/.latency=14863933 method=GET remote=remote_ip_here request="/oauth2/auth?client_id=client_id_here&login_verifier=797c33b6f64b452facead47aca386127&redirect_uri=https%3A%2F%2Fclient_host_here%2Foauth%2Fredirect&response_type=code&scope=offline&state=23b00222-ef52-40df-aa07-4dc19330ebe0&ui_locales=en" status=302 text_status=Found took=14.863933ms Version:

Environment: Docker Compose Version oryd/hydra:v1.0.0-rc.5_oryOS.10 (happend on oryd/hydra:v1.0.0-rc.2_oryOS.9 too) Access Token type: jwt Additional context

Thought that the problem might be that I did not handled correctly 409 conflicts of login challenges, but even after proper logging and errors handling I still get the issue.

Tried to delete the data from tables with tokens and all the others excluding clients and migrations tables, everything started working properly, and then again it did break for some user/client. For the same user but other clients it still works.

To mention that the clients are first party apps that do not request audiences and scopes (it does request offline scope though), the consent logic authorizes the client with all its pre-defined scopes and audiences. May this be the problem? Do I need to allow to the client just its requested scopes and audiences?

Client details

{ "audience": [ "some_audience_here" ], "client_id": "client_id_here", "grant_types": [ "authorization_code", "refresh_token" ], "jwks": {}, "redirect_uris": [ "https://client_host_here/oauth/redirect", "http://another_client_host_here/oauth/redirect" ], "response_types": [ "code" ], "scope": "offline oauth_auto_consent", "subject_type": "public", "token_endpoint_auth_method": "none", "userinfo_signed_response_alg": "none" } oauth_auto_consent is used by the consent provider to check if it should skip consent and allow automatically for some clients. In this case it allows automatically for this clients. But the same happens for other clients and it works perfectly.

THIS NEEDS A SOLUTION ASAP

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

es-lab commented 5 years ago

@aeneasr The last 3 weeks it was just integration of hydra in an existing system. I started with version oryd/hydra:v1.0.0-rc.2_oryOS.9 initially, sql migration did ran successfully from the first time.

Then came a lot of clients create/update/delete/recreate because of development/prototyping of different use-cases.

There was also a change to the Access Token type from the time I started. Could this change the way migrations run, and that now that it's jwt it expects some different schema/indexes?

Also, I still can not reproduce this on my localhost, the current issues happens on a staging (https) environment.

I did the upgrade to oryd/hydra:v1.0.0-rc.5_oryOS.10 3 days ago, just though it may solve the issue, but it did not. SQL migrations did run successfully though.

What I did not try to do is a clean install of hydra db. I will try this one and hope to come back with more useful info.

aeneasr commented 5 years ago

What database are you using? Exact version please.

The JWT strategy is, as noted, in beta and not the primary/recommended way, but it should obviously not break stuff.

The error is proper weird because is actually assignable to string. That’s the whole point of nil 🤔

On 17. Dec 2018, at 07:43, es-lab notifications@github.com wrote:

@aeneasr The last 3 weeks it was just integration of hydra in an existing system. I started with version oryd/hydra:v1.0.0-rc.2_oryOS.9 initially, sql migration did ran successfully from the first time.

Then came a lot of clients create/update/delete/recreate because of development/prototyping of different use-cases.

There was also a change to the Access Token type from the time I started. Could this change the way migrations run, and that now that it's jwt it expects some different schema/indexes?

Also, I still can not reproduce this on my localhost, the current issues happens on a staging (https) environment.

I did the upgrade to oryd/hydra:v1.0.0-rc.5_oryOS.10 3 days ago, just though it may solve the issue, but it did not. SQL migrations did run successfully though.

What I did not try to do is a clean install of hydra. I will try this one and hope to come back with more useful info.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

es-lab commented 5 years ago

I'm using postgres 9.6

es-lab commented 5 years ago

I've just done a clean db installation. Here are the outputs:

Migration output

Applying `jwk` SQL migrations...
Applied 4 `jwk` SQL migrations.
Applying `client` SQL migrations...
Applied 12 `client` SQL migrations.
Applying `consent` SQL migrations...
Applied 7 `consent` SQL migrations.
Applying `oauth2` SQL migrations...
Applied 9 `oauth2` SQL migrations.
Migration successful! Applied a total of 32 SQL migrations.
Migration successful!

First deploy after migration output

Thank you for using ORY Hydra v1.0.0-rc.5+oryOS.10!

Take security seriously and subscribe to the ORY Security Newsletter. Stay on top of new patches and security insights.                                                                                                

>> Subscribe now: http://eepurl.com/di390P <<
time="2018-12-17T09:13:10Z" level=info msg="Setting up tracing middleware"
time="2018-12-17T09:13:10Z" level=info msg="No tracer configured - skipping tracing setup"
time="2018-12-17T09:13:10Z" level=info msg="Connecting with postgres://*:*@iam_oauth_db:5432/hydra?sslmode=disable"
time="2018-12-17T09:13:10Z" level=info msg="Connected to SQL!"
time="2018-12-17T09:13:10Z" level=info msg="JSON Web Key Set hydra.openid.id-token does not exist yet, generating new key pair..."
time="2018-12-17T09:13:11Z" level=info msg="JSON Web Key Set hydra.jwt.access-token does not exist yet, generating new key pair..."
time="2018-12-17T09:13:13Z" level=info msg="Enabled CORS"
time="2018-12-17T09:13:13Z" level=warning msg="Could not parse login and consent request lifespan value (). Defaulting to 15m"
time="2018-12-17T09:13:13Z" level=info msg="Enabled CORS"
time="2018-12-17T09:13:13Z" level=info msg="Setting up Prometheus middleware"
time="2018-12-17T09:13:13Z" level=info msg="JSON Web Key Set hydra.https-tls does not exist yet, generating new key pair..."
time="2018-12-17T09:13:21Z" level=info msg="Enabled CORS"
time="2018-12-17T09:13:21Z" level=info msg="Setting up http server on :4445"
time="2018-12-17T09:13:21Z" level=warning msg="HTTPS disabled. Never do this in production."
time="2018-12-17T09:13:21Z" level=info msg="Setting up http server on :4444"
time="2018-12-17T09:13:21Z" level=warning msg="HTTPS disabled. Never do this in production."

Command used to create clients

hydra clients create --id client_id_here --grant-types authorization_code,refresh_token --response-types code --endpoint http://localhost:4445 --scope offline,oauth_auto_consent --callbacks https://client_host_here/oauth/redirect,http://another_client_host_here/oauth/redirect --token-endpoint-auth-method none --audience some_audience_here

It works now, but as I said, I don't know how to reproduce and don't know if/when it will reappear now.

es-lab commented 5 years ago

@aeneasr I've reproduced it!! Finally!!

Steps to reproduce: 1) Create 2 clients (public clients (SPAs) in my case):

7) Now go to clientY page, you will be redirected on hydra/oauth2/auth?client_id=clientY&grant_type=code&... 8) Hydra will redirect on login, but now the login request has skip=true, and it gets accepted automatically, here the login challenge is added to the database with a reference to the login session id 9) Then the user is redirected by hydra to the consent provider and the consent request is accepted automatically with remember true (as these are first party apps), to note here that the consent is added into the database with the reference to login session id 10) Now the user is redirected back to the client with the authorization code and everything is fine

11) Now when the user logs out: here the user is redirected to hydra/oauth2/auth/sessions/login/revoke to clean the hydra session from the browser, the login session is removed from the database, and with it also disappears the login request that had referenced the login session, now the interesting part is that the consent that had referenced the login session id is not deleted, but the column with login challenge is NULL as its login challenge was deleted. 12) The user tries to login on clientY again 13) the login request is accepted, but, when redirecting to hydra the database query fails because the login_challenge is NULL and it can not scan a NULL value form postgres.

Now this user can never authorize the clientY again. It gets redirected back to the client with /oauth/redirect?error=error&error_description=The+error+is+unrecognizable.

If you delete the consent for the user/clientY from the table the client will be authorized successfully and it will work fine.

es-lab commented 5 years ago

@aeneasr I think there are some other ways to get to the issue, but I've found these way to get to it consistently, I've also noticed that this does not happen if in the database already exists a consent for the user/client, it somehow does not query the one with the NULL login_challenge and it does not fail.

aeneasr commented 5 years ago

Thank you for reproducing this! I’ll push & publish a fix tomorrow

On 17. Dec 2018, at 18:36, es-lab notifications@github.com wrote:

@aeneasr I think there are some other ways to get to the issue, but I've found these way to get to it consistently, I've also noticed that this does not happen if in the database already exists a consent for the user/client, it somehow does not query the one with the NULL login_challenge and it does not fail.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

aeneasr commented 5 years ago

This should work now, could you also check to make sure? :)

aeneasr commented 5 years ago

Currently unreleased, on master, if you need help with that LMK

es-lab commented 5 years ago

@aeneasr It works! Thank you! I'll stay with the image built & pushed into my repo for now. Will this be released with the RC6 or the 1.0.0?

aeneasr commented 5 years ago

Yes, this will be rc.6, I won't be able to release it today as there are a couple more things that need to be solved, but it should be out by latest end of week.