PhilanthropyDataCommons / auth

PDC related extensions that were made for the keycloak auth service

Two-factor SMS authentication with keycloak and twilio #2

Closed bickelj closed 1 year ago

bickelj commented 1 year ago

To provide a second factor of authentication via SMS, we'll find or write a keycloak provider/extension to send a one-time passcode via SMS to the authenticating user's mobile phone.

The workflow will begin with a keycloak authentication attempt and our extension will call twilio.
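As a rough illustration of the one-time passcode idea described above, the sketch below issues a short numeric code with an expiry and checks a submitted code against it. The thread does not say how the code is generated or how long it lives, so the class name `SmsOtpSketch`, the six-digit length, and the five-minute lifetime are all assumptions made for the example, not details of the extension.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch of issuing and checking a numeric one-time passcode. */
final class SmsOtpSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** A code together with the moment it stops being valid. */
    static final class IssuedCode {
        final String code;
        final Instant expiresAt;

        IssuedCode(String code, Instant expiresAt) {
            this.code = code;
            this.expiresAt = expiresAt;
        }
    }

    /** Generates a zero-padded numeric code with 1 to 9 digits. */
    static String generateCode(int digits) {
        if (digits < 1 || digits > 9) {
            throw new IllegalArgumentException("digits must be between 1 and 9");
        }
        int bound = 1;
        for (int i = 0; i < digits; i++) {
            bound *= 10;
        }
        return String.format("%0" + digits + "d", RANDOM.nextInt(bound));
    }

    /** Issues a code that is valid from now until now + ttl. */
    static IssuedCode issue(int digits, Duration ttl, Instant now) {
        return new IssuedCode(generateCode(digits), now.plus(ttl));
    }

    /** Accepts the submitted code only if it matches and has not expired. */
    static boolean verify(IssuedCode issued, String submitted, Instant now) {
        if (submitted == null || !now.isBefore(issued.expiresAt)) {
            return false;
        }
        // MessageDigest.isEqual avoids leaking the match position through timing.
        return MessageDigest.isEqual(
                issued.code.getBytes(StandardCharsets.UTF_8),
                submitted.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        IssuedCode issued = issue(6, Duration.ofMinutes(5), now); // assumed length and lifetime
        System.out.println("fresh code accepted: " + verify(issued, issued.code, now.plusSeconds(30)));
        System.out.println("expired code accepted: " + verify(issued, issued.code, now.plusSeconds(600)));
    }
}
```

In a real flow the issued code and its expiry would be stored with the authentication session, and the code itself would be handed to whatever sends the SMS.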

Tasks involved:

    • [X] Send an SMS with Twilio from a small Java program.
    • [X] Create a Java jar that should log a message from within keycloak the way keycloak expects, using keycloak APIs.
    • [X] Configure keycloak to use the jar.
    • [X] Set up a local means to test the above keycloak configuration.
    • [x] Integrate the Twilio program into the keycloak extension (i.e. merge the results from steps 1 and 2) and use it in the basic authentication flow.
    • [x] Create the gradle build script for the keycloak theme jar.
    • [x] Integrate the jar into keycloak delivery, e.g. new docker image, copy the jar, .env file and README in deploy project.
    • [ ] Document exact steps to create the pdc realm, configuration of SMS OTP extension, clients, etc., or figure out how to export and import (automatically or from source) the configuration after doing it locally.
    • [x] Actual deployment of the service, auth with extension to production.
    • [ ] Allow use of the same keycloak extension from (5) in the password reset flow (optional for now).

There may be more tasks as well.
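For the first task, a minimal sketch of what "send an SMS with Twilio from a small Java program" can look like is shown here. It is not the code from this repository: it builds the request for Twilio's REST Messages endpoint with the JDK's own HTTP client instead of the Twilio helper library that later comments put on keycloak's classpath. The class name, the placeholder credentials, and the phone numbers are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical sketch: build (but do not send) a Twilio "create message" request. */
final class TwilioSmsRequestSketch {

    private static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    /** Form-encodes the To, From, and Body parameters of the Messages endpoint. */
    static String formBody(String to, String from, String text) {
        return "To=" + enc(to) + "&From=" + enc(from) + "&Body=" + enc(text);
    }

    /** Builds the POST request; Twilio uses HTTP Basic auth with account SID and auth token. */
    static HttpRequest buildRequest(String accountSid, String authToken,
                                    String to, String from, String text) {
        String credentials = Base64.getEncoder().encodeToString(
                (accountSid + ":" + authToken).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder()
                .uri(URI.create("https://api.twilio.com/2010-04-01/Accounts/"
                        + accountSid + "/Messages.json"))
                .header("Authorization", "Basic " + credentials)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(to, from, text)))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder values; real ones would come from configuration such as an .env file.
        HttpRequest request = buildRequest("ACXXXXXXXX", "example-token",
                "+15555550100", "+15555550199", "Your code is 123456");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending would then be a matter of passing the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. Building the request separately from sending it keeps the sketch testable without network access or real credentials.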

kfogel commented 1 year ago

See also PR #124 in the service repository.

kfogel commented 1 year ago

@jim-mcgowan asked some questions, to which @bickelj and I now have answers:

Q: Do I create my own user account, or is it created for me?

A: The PDC team creates accounts, at least for now, and we do it in Keycloak, since Keycloak is our user management system. Keycloak is also where we add the phone numbers that Twilio will send texts to.

Q: If a user forgets their password, what do they do? Do they email the PDC team for help, or can the system help them?

A: The system can walk them through a password reset. They click "I forgot my password", and this causes two things to happen: an email with a one-time password reset link gets sent to the email address they supply (which must match the email address we have on record), and a text message with a code gets sent to their phone number of record. They go to the one-time link, and in order to reset their password there they of course must enter the code that was texted to them.

jasonaowen commented 1 year ago

I'm not sure if folks have already found this resource, but I found the blog post Two-Factor Authentication with SMS in Keycloak and the associated dasniko/keycloak-2fa-sms-authenticator repo to be quite helpful.

bickelj commented 1 year ago

@jasonaowen My first pass is using dasniko's work, yes!

kfogel commented 1 year ago

W00t public resources FTW!

bickelj commented 1 year ago

To test (step 4 above) I am using a static single page that uses keycloak.js from https://www.keycloak.org/docs/latest/securing_apps/index.html#_javascript_adapter. I am serving it via the nginx container at a /ui path, which requires putting the static content in a directory that nginx owns and mounting it via a docker volume for that container in the compose.yml. Again, this is for local development and testing. We don't have to use the keycloak.js library or serve static content from our reverse proxy container when we go to production. This seemed like the most straightforward way to (locally) run the full login workflow with keycloak.

Test html code:

```html
Keycloak Browser Flow

Keycloak testing
```

The next step is to get the jar to log a message successfully during a browser login (using that test flow).

bickelj commented 1 year ago

Test succeeded (finally). To get it to succeed, there are several prerequisites:

  1. The library (jar) implementing the Authenticator SPI in keycloak's /providers directory (I used env var plus a docker volume to do this).
  2. Dasniko's (or our own) library implementing the Required Action SPI in keycloak's /providers directory (env var plus docker volume again).
  3. Create a copy of the browser flow that uses the library from (1) in the realm authentication flows configuration of keycloak.
  4. Enable that new browser flow instead of the existing browser flow.
  5. Enable the Required Action from (2) in the realm authentication required actions configuration of keycloak. This last step was the non-obvious one to me.

I see the expected SMS template and required OTP entry after setting up a mobile number on a first authentication flow. On that first login, where the user enters a mobile number, it does not appear to require the OTP via SMS. See https://github.com/dasniko/keycloak-2fa-sms-authenticator/discussions/29 for an explanation.

Regardless, the next step is to use code that calls the SMS API in place of a logger message.

Code is in place but the real next step is to get Twilio and its dependencies on keycloak's classpath too.

bickelj commented 1 year ago

I tried a shaded jar using the gradle shadow plugin with some excludes for the jars I already see on keycloak's classpath and that seemed to work. Making it a little less brittle might require relocations as well. Next step is to figure out relocations with the shadow plugin.

bickelj commented 1 year ago

Progress commits can be seen in #3.

bickelj commented 1 year ago

A first attempt at relocation of packages in the fat/shaded jar reveals a problem:

XMLInputFactory: Provider com.ctc.wstx.stax.WstxInputFactory not found

```
ERROR: Failed to run 'build' command.
ERROR: io.quarkus.builder.BuildException: Build failure: Build failed due to errors
    [error]: Build step io.quarkus.narayana.jta.deployment.NarayanaJtaProcessor#build threw an exception: javax.xml.stream.FactoryConfigurationError: Provider for class javax.xml.stream.XMLInputFactory cannot be created
    at java.xml/javax.xml.stream.FactoryFinder.findServiceProvider(FactoryFinder.java:366)
    at java.xml/javax.xml.stream.FactoryFinder.find(FactoryFinder.java:309)
    at java.xml/javax.xml.stream.FactoryFinder.find(FactoryFinder.java:222)
    at java.xml/javax.xml.stream.XMLInputFactory.newInstance(XMLInputFactory.java:161)
    at com.arjuna.common.util.propertyservice.PropertiesFactoryStax.loadFromXML(PropertiesFactoryStax.java:46)
    at com.arjuna.common.util.propertyservice.AbstractPropertiesFactory.loadFromFile(AbstractPropertiesFactory.java:150)
    at com.arjuna.common.util.propertyservice.AbstractPropertiesFactory.getPropertiesFromFile(AbstractPropertiesFactory.java:102)
    at com.arjuna.common.util.propertyservice.AbstractPropertiesFactory.initDefaultProperties(AbstractPropertiesFactory.java:196)
    at com.arjuna.common.util.propertyservice.AbstractPropertiesFactory.getDefaultProperties(AbstractPropertiesFactory.java:62)
    at com.arjuna.common.util.propertyservice.PropertiesFactory.getDefaultProperties(PropertiesFactory.java:48)
    at io.quarkus.narayana.jta.deployment.NarayanaJtaProcessor.build(NarayanaJtaProcessor.java:126)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at io.quarkus.deployment.ExtensionLoader$3.execute(ExtensionLoader.java:909)
    at io.quarkus.builder.BuildContext.run(BuildContext.java:281)
    at org.jboss.threads.ContextHandler$1.runWith(ContextHandler.java:18)
    at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2449)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1478)
    at java.base/java.lang.Thread.run(Thread.java:829)
    at org.jboss.threads.JBossThread.run(JBossThread.java:501)
Caused by: java.lang.RuntimeException: Provider for class javax.xml.stream.XMLInputFactory cannot be created
    at java.xml/javax.xml.stream.FactoryFinder.findServiceProvider(FactoryFinder.java:363)
    ... 21 more
Caused by: java.util.ServiceConfigurationError: javax.xml.stream.XMLInputFactory: Provider com.ctc.wstx.stax.WstxInputFactory not found
    at java.base/java.util.ServiceLoader.fail(ServiceLoader.java:589)
    at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.nextProviderClass(ServiceLoader.java:1212)
    at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNextService(ServiceLoader.java:1221)
    at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNext(ServiceLoader.java:1265)
    at java.base/java.util.ServiceLoader$2.hasNext(ServiceLoader.java:1300)
    at java.base/java.util.ServiceLoader$3.hasNext(ServiceLoader.java:1385)
    at java.xml/javax.xml.stream.FactoryFinder$1.run(FactoryFinder.java:348)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.xml/javax.xml.stream.FactoryFinder.findServiceProvider(FactoryFinder.java:337)
    ... 21 more
ERROR: Build failure: Build failed due to errors
    [error]: Build step io.quarkus.narayana.jta.deployment.NarayanaJtaProcessor#build threw an exception: javax.xml.stream.FactoryConfigurationError: Provider for class javax.xml.stream.XMLInputFactory cannot be created
    at java.xml/javax.xml.stream.FactoryFinder.findServiceProvider(FactoryFinder.java:366)
    at java.xml/javax.xml.stream.FactoryFinder.find(FactoryFinder.java:309)
    at java.xml/javax.xml.stream.FactoryFinder.find(FactoryFinder.java:222)
    at java.xml/javax.xml.stream.XMLInputFactory.newInstance(XMLInputFactory.java:161)
    at com.arjuna.common.util.propertyservice.PropertiesFactoryStax.loadFromXML(PropertiesFactoryStax.java:46)
    at com.arjuna.common.util.propertyservice.AbstractPropertiesFactory.loadFromFile(AbstractPropertiesFactory.java:150)
    at com.arjuna.common.util.propertyservice.AbstractPropertiesFactory.getPropertiesFromFile(AbstractPropertiesFactory.java:102)
    at com.arjuna.common.util.propertyservice.AbstractPropertiesFactory.initDefaultProperties(AbstractPropertiesFactory.java:196)
    at com.arjuna.common.util.propertyservice.AbstractPropertiesFactory.getDefaultProperties(AbstractPropertiesFactory.java:62)
    at com.arjuna.common.util.propertyservice.PropertiesFactory.getDefaultProperties(PropertiesFactory.java:48)
    at io.quarkus.narayana.jta.deployment.NarayanaJtaProcessor.build(NarayanaJtaProcessor.java:126)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at io.quarkus.deployment.ExtensionLoader$3.execute(ExtensionLoader.java:909)
    at io.quarkus.builder.BuildContext.run(BuildContext.java:281)
    at org.jboss.threads.ContextHandler$1.runWith(ContextHandler.java:18)
    at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2449)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1478)
    at java.base/java.lang.Thread.run(Thread.java:829)
    at org.jboss.threads.JBossThread.run(JBossThread.java:501)
Caused by: java.lang.RuntimeException: Provider for class javax.xml.stream.XMLInputFactory cannot be created
    at java.xml/javax.xml.stream.FactoryFinder.findServiceProvider(FactoryFinder.java:363)
    ... 21 more
Caused by: java.util.ServiceConfigurationError: javax.xml.stream.XMLInputFactory: Provider com.ctc.wstx.stax.WstxInputFactory not found
    at java.base/java.util.ServiceLoader.fail(ServiceLoader.java:589)
    at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.nextProviderClass(ServiceLoader.java:1212)
    at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNextService(ServiceLoader.java:1221)
    at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNext(ServiceLoader.java:1265)
    at java.base/java.util.ServiceLoader$2.hasNext(ServiceLoader.java:1300)
    at java.base/java.util.ServiceLoader$3.hasNext(ServiceLoader.java:1385)
    at java.xml/javax.xml.stream.FactoryFinder$1.run(FactoryFinder.java:348)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.xml/javax.xml.stream.FactoryFinder.findServiceProvider(FactoryFinder.java:337)
    ... 21 more
ERROR: Provider for class javax.xml.stream.XMLInputFactory cannot be created
ERROR: Provider for class javax.xml.stream.XMLInputFactory cannot be created
ERROR: javax.xml.stream.XMLInputFactory: Provider com.ctc.wstx.stax.WstxInputFactory not found
```

I can't tell immediately if it means that the classes I am including in this jar, i.e. woodstox, are being used not only by other classes included in the jar but also (unintentionally) being used by the whole of keycloak, or if it means I need to further change the configuration to relocate the SPI declarations as well. I'm assuming the latter.
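The innermost error in the trace is the one `java.util.ServiceLoader` raises when a `META-INF/services` descriptor names a class that cannot be loaded. The sketch below reproduces that mechanism in isolation: it writes a descriptor for `javax.xml.stream.XMLInputFactory` that points at `com.ctc.wstx.stax.WstxInputFactory` into a temporary directory that contains no such class, then asks `ServiceLoader` to iterate. It assumes woodstox is not on the classpath when run, and it says nothing about which jar carried the stale descriptor in the keycloak build. The class name `DanglingServiceDescriptorSketch` is made up for the example.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ServiceConfigurationError;
import java.util.ServiceLoader;
import javax.xml.stream.XMLInputFactory;

/** Hypothetical sketch: a service descriptor that names a class nobody can load. */
final class DanglingServiceDescriptorSketch {

    /** Returns the ServiceLoader error message, or null if iteration succeeded. */
    static String reproduce() throws IOException {
        // A classpath root holding only the descriptor, not the provider class.
        Path root = Files.createTempDirectory("dangling-spi");
        Path services = Files.createDirectories(root.resolve("META-INF/services"));
        Files.write(services.resolve("javax.xml.stream.XMLInputFactory"),
                "com.ctc.wstx.stax.WstxInputFactory\n".getBytes(StandardCharsets.UTF_8));

        try (URLClassLoader loader = new URLClassLoader(new URL[] {root.toUri().toURL()})) {
            for (XMLInputFactory factory : ServiceLoader.load(XMLInputFactory.class, loader)) {
                System.out.println("found provider " + factory.getClass().getName());
            }
            return null;
        } catch (ServiceConfigurationError e) {
            // Raised while iterating, when the named provider class is not found.
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(reproduce());
    }
}
```

This is why relocating classes without also rewriting the descriptors, or shipping descriptors for classes that were excluded, can break lookups for code far away from the jar that introduced them.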

Edit/Update: three things helped. First, woodstox implements several standard SPIs that already have implementations on keycloak's classpath, so I excluded woodstox. Second, I found the commons-logging jboss bridge/adapter in keycloak, so I excluded commons-logging. Third, the shadow plugin's mergeServiceFiles() function makes sure that the SPI descriptors get relocated too. Looking inside the jar now (jars are zip files), there is only one root package, org.philanthropydatacommons, containing two packages: auth (the code here) and shadow (all the dependencies). The contents of META-INF/services refer only to what we want: org.keycloak.authentication.AuthenticatorFactory (the keycloak SPI implemented by this software), while the rest are shaded as org.philanthropydatacommons.shadow.[yadayadayada] (avoiding conflict with other SPIs on the keycloak classpath).
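The manual check described here (opening the jar and looking at its packages and `META-INF/services` entries) can also be scripted. The following is a small sketch using `java.util.jar.JarFile`; the class and method names are invented for the example, and the jar path is whatever the build produces.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical sketch: list what a shaded jar exposes. */
final class ShadedJarInspector {
    private static final String SERVICES = "META-INF/services/";

    /** Names of the service descriptor files, e.g. the SPI interfaces being implemented. */
    static SortedSet<String> serviceDescriptors(Path jar) throws IOException {
        SortedSet<String> names = new TreeSet<>();
        try (JarFile file = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = file.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (!entry.isDirectory() && entry.getName().startsWith(SERVICES)) {
                    names.add(entry.getName().substring(SERVICES.length()));
                }
            }
        }
        return names;
    }

    /** Distinct package prefixes of the class files, cut off at the given depth. */
    static SortedSet<String> packagePrefixes(Path jar, int depth) throws IOException {
        SortedSet<String> prefixes = new TreeSet<>();
        try (JarFile file = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = file.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                int lastSlash = name.lastIndexOf('/');
                if (!name.endsWith(".class") || name.startsWith("META-INF/") || lastSlash < 0) {
                    continue; // skip resources, metadata, and default-package classes
                }
                String[] segments = name.substring(0, lastSlash).split("/");
                int kept = Math.min(depth, segments.length);
                prefixes.add(String.join(".", Arrays.copyOf(segments, kept)));
            }
        }
        return prefixes;
    }

    public static void main(String[] args) throws IOException {
        Path jar = Paths.get(args[0]);
        System.out.println("packages: " + packagePrefixes(jar, 3));
        System.out.println("services: " + serviceDescriptors(jar));
    }
}
```

Run against the shaded jar, a depth of 2 would be expected to show a single prefix if relocation worked as described, and any descriptor name outside the relocated namespace other than the keycloak SPI would be worth a second look.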

kfogel commented 1 year ago

(Appreciating the detailed notes here, @bickelj, FWIW!)

bickelj commented 1 year ago

Step 5 is (more) done with the merge of https://github.com/PhilanthropyDataCommons/auth/pull/3. The next step is "Clean up, expand use of, and merge the code in https://github.com/PhilanthropyDataCommons/service/pull/124 so that we can enable use by the service."

kfogel commented 1 year ago

Cool! Thanks, @bickelj.

The quoting/linking is a bit borngled above? The reference is to service#124, if I understand correctly.

bickelj commented 1 year ago

@kfogel Fixed! That was a rushed comment. The real next step was to copy a working jar to our test and production machines to make sure we don't forget where it is and document said copying. That is in progress. Then the next step is "Clean up, expand ... service#124".

bickelj commented 1 year ago

Task number 6 (Update domain names as needed) is still ongoing. The first part, changing to api.[yada], is complete in both test and prod, but the second part, for auth.[yada], still needs to be changed in prod. That is the next step.

bickelj commented 1 year ago

Task number 8 (Clean up, expand use of, and merge the code in https://github.com/PhilanthropyDataCommons/service/pull/124 so that we can enable use by the service) is also ongoing, paused on an eslint rule violation in pending test code.

bickelj commented 1 year ago

Task number 6 (Update domain names as needed) is completed. Next is task number 8 (Clean up, expand use of, and merge the code in https://github.com/PhilanthropyDataCommons/service/pull/124 so that we can enable use by the service).

bickelj commented 1 year ago

While documenting and doing task 8 (Clean up, expand use of, and merge the code in https://github.com/PhilanthropyDataCommons/service/pull/124 so that we can enable use by the service), I found what I just shared in https://github.com/PhilanthropyDataCommons/deploy/issues/49#issuecomment-1457231323, and the next step is in the same. See update in the post.

I think back in November when developing the capability on the service side, I was running keycloak in a container, but I was accessing the auth services from outside the container, i.e. running via npm on localhost:3000. When actually integrating within docker, the service needs to be able to validate a JWT and issuer using the same URLs as the caller from outside the docker overlay.
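To make the issuer point concrete: a token carries an `iss` claim set from the URL the caller used to reach keycloak, and a validator that reaches keycloak under a different URL will expect a different issuer. The sketch below only reads the `iss` claim out of a JWT payload and compares it with an expected value. It does not verify signatures, it pulls the claim out with a regular expression rather than a JSON parser, and the two URLs in `main` are hypothetical stand-ins for an external hostname and a docker-internal one.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: compare a token's issuer with the one a service expects. */
final class IssuerCheckSketch {
    // Simplification: assumes the issuer string contains no escaped characters.
    private static final Pattern ISS = Pattern.compile("\"iss\"\\s*:\\s*\"([^\"]*)\"");

    /** Extracts the "iss" claim from the payload; performs no signature verification. */
    static Optional<String> issuerOf(String jwt) {
        String[] parts = jwt.split("\\.");
        if (parts.length != 3) {
            return Optional.empty();
        }
        String payload = new String(Base64.getUrlDecoder().decode(parts[1]), StandardCharsets.UTF_8);
        Matcher matcher = ISS.matcher(payload);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    /** True only when the token's issuer is exactly the expected URL. */
    static boolean issuerMatches(String jwt, String expectedIssuer) {
        return issuerOf(jwt).map(expectedIssuer::equals).orElse(false);
    }

    private static String b64(String json) {
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(json.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Token as issued to a browser that reached keycloak via the external URL.
        String token = b64("{\"alg\":\"none\"}") + "."
                + b64("{\"iss\":\"https://auth.example.org/realms/pdc\"}") + ".sig";
        System.out.println(issuerMatches(token, "https://auth.example.org/realms/pdc"));
        System.out.println(issuerMatches(token, "http://keycloak:8080/realms/pdc"));
    }
}
```

The second comparison fails even though both URLs may reach the same keycloak instance, which is the kind of mismatch that only appears once the service runs inside the docker network.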

bickelj commented 1 year ago

Task 8 is stuck on a jest/supertest/jwks-mock issue, but task 9 is now in place in the deploy repo and in the test environment.

bickelj commented 1 year ago

I checked task (9) as done (even though it is still a manual delivery step) because the jars are in production and configured. I created a test user and did a login flow that correctly sent me an SMS OTP.

bickelj commented 1 year ago

2FA is live in the pilot as of this week.

jasonaowen commented 1 year ago

@bickelj is there anything more you want to do as part of this issue? Or should we close it as completed?

bickelj commented 1 year ago

I think steps (10) and (12) are still useful, but we can consider those out of scope for the purposes of this issue.

    • [ ] Document exact steps to create the pdc realm, configuration of SMS OTP extension, clients, etc., or figure out how to export and import (automatically or from source) the configuration after doing it locally.
    • [ ] Allow use of the same keycloak extension from (5) in the password reset flow (optional for now).

Step (10) is now tracked in service#290. Step (12) should emerge naturally and can be in a separate ticket.