nerdclub-tfg / signal-bot

A bot for signal - NOT MAINTENANCE
GNU Affero General Public License v3.0

Could not derive key #21

Open tcurdt opened 6 years ago

tcurdt commented 6 years ago

I am trying to run the bot, but it looks like I am having problems connecting to the push servers:

$ java -jar build/libs/signal-bot-all.jar 
Url (or 'production' or 'staging' for whispersystems' server):
staging
Phone Number:
+491xxxxxxxxx
Device type, one of 'primary' (new registration) or 'secondary' (linking):
primary
Exception in thread "main" org.whispersystems.signalservice.api.push.exceptions.PushNetworkException: javax.net.ssl.SSLException: java.security.ProviderException: Could not derive key
  at org.whispersystems.signalservice.internal.push.PushServiceSocket.getConnection(PushServiceSocket.java:595)
  at org.whispersystems.signalservice.internal.push.PushServiceSocket.makeRequest(PushServiceSocket.java:491)
  at org.whispersystems.signalservice.internal.push.PushServiceSocket.createAccount(PushServiceSocket.java:114)
  at org.whispersystems.signalservice.api.SignalServiceAccountManager.requestSmsVerificationCode(SignalServiceAccountManager.java:113)
  at de.thoffbauer.signal4j.SignalService.startConnectAsPrimary(SignalService.java:147)
  at de.nerdclubtfg.signalbot.components.SignalConnection.<init>(SignalConnection.java:43)
  at de.nerdclubtfg.signalbot.SignalBot.start(SignalBot.java:24)
  at de.nerdclubtfg.signalbot.SignalBot.main(SignalBot.java:74)
Caused by: javax.net.ssl.SSLException: java.security.ProviderException: Could not derive key
  at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
  at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949)
  at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1906)
  at sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1889)
  at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1410)
  at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387)
  at com.squareup.okhttp.Connection.upgradeToTls(Connection.java:241)
  at com.squareup.okhttp.Connection.connect(Connection.java:158)
  at com.squareup.okhttp.Connection.connectAndSetOwner(Connection.java:174)
  at com.squareup.okhttp.OkHttpClient$1.connectAndSetOwner(OkHttpClient.java:120)
  at com.squareup.okhttp.internal.http.RouteSelector.next(RouteSelector.java:131)
  at com.squareup.okhttp.internal.http.HttpEngine.connect(HttpEngine.java:312)
  at com.squareup.okhttp.internal.http.HttpEngine.sendRequest(HttpEngine.java:235)
  at com.squareup.okhttp.Call.getResponse(Call.java:262)
  at com.squareup.okhttp.Call$ApplicationInterceptorChain.proceed(Call.java:219)
  at com.squareup.okhttp.Call.getResponseWithInterceptorChain(Call.java:192)
  at com.squareup.okhttp.Call.execute(Call.java:79)
  at org.whispersystems.signalservice.internal.push.PushServiceSocket.getConnection(PushServiceSocket.java:593)
  ... 7 more
Caused by: java.security.ProviderException: Could not derive key
  at sun.security.ec.ECDHKeyAgreement.engineGenerateSecret(ECDHKeyAgreement.java:133)
  at sun.security.ec.ECDHKeyAgreement.engineGenerateSecret(ECDHKeyAgreement.java:163)
  at javax.crypto.KeyAgreement.generateSecret(KeyAgreement.java:648)
  at sun.security.ssl.ECDHCrypt.getAgreedSecret(ECDHCrypt.java:101)
  at sun.security.ssl.ClientHandshaker.serverHelloDone(ClientHandshaker.java:1067)
  at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:348)
  at sun.security.ssl.Handshaker.processLoop(Handshaker.java:979)
  at sun.security.ssl.Handshaker.process_record(Handshaker.java:914)
  at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062)
  at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
  at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403)
  ... 20 more
Caused by: java.security.InvalidAlgorithmParameterException
  at sun.security.ec.ECDHKeyAgreement.deriveKey(Native Method)
  at sun.security.ec.ECDHKeyAgreement.engineGenerateSecret(ECDHKeyAgreement.java:130)
  ... 30 more
$ java -version
java version "1.8.0_66"
Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)

mruddy commented 6 years ago

While this reply may not be very satisfying, I get the same failure with the JRE that comes with the Oracle JDK. The JRE that comes with OpenJDK works.

The problem appears to occur within the TLS handshake. It is NOT due to a certificate error. It does NOT appear to be related to whether the unlimited strength JCE policy files are installed (installing them did not make the Oracle JRE start working). None of the following seemed to help: -Djdk.tls.client.protocols=TLSv1, -Dhttps.protocols=TLSv1, -Djdk.tls.ephemeralDHKeySize=legacy.

What I can see so far with Wireshark is that the failure happens during the TLS handshake, when the client sends a fatal alert just after sending its "Client Key Exchange" handshake protocol message.

Fails:

$ /opt/jdk1.8.0_172/bin/java -version
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)

Works:

$ /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -version
openjdk version "1.8.0_171"
OpenJDK Runtime Environment (build 1.8.0_171-8u171-b11-0ubuntu0.18.04.1-b11)
OpenJDK 64-Bit Server VM (build 25.171-b11, mixed mode)

Commands used for testing:

/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Djavax.net.debug=all -jar build/libs/signal-bot-all.jar

/opt/jdk1.8.0_172/bin/java -Djavax.net.debug=all -jar build/libs/signal-bot-all.jar

tcurdt commented 6 years ago

Thanks for digging into this. The findings are indeed not very satisfying. I remember from searching for this error that it isn't just a problem for this project.

I found one more suggestion:

In Java 8, some old SSL handshakes are missing. Add Bouncy Castle and register it with the security providers using the following code:

Security.addProvider(new BouncyCastleProvider());

Not sure if that really makes sense though.

mruddy commented 6 years ago

I hooked up a remote Java debugger to the process while running an Oracle JRE like this: /opt/jdk1.8.0_171/jre/bin/java -Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=y -jar build/libs/signal-bot-all.jar

The exception logged on the console:

Caused by: java.security.ProviderException: Could not derive key
    at sun.security.ec.ECDHKeyAgreement.engineGenerateSecret(ECDHKeyAgreement.java:133)
  ...

The type of the exception being caught at that line is java.security.InvalidAlgorithmParameterException. That might narrow down the problem in the Oracle native code to a line like this one from OpenJDK: http://hg.openjdk.java.net/jdk8u/jdk8u60/jdk/file/935758609767/src/share/native/sun/security/ec/ECC_JNI.cpp#l425

Here's what I think the relevant Java source looks like (again, from OpenJDK): http://hg.openjdk.java.net/jdk8u/jdk8u60/jdk/file/935758609767/src/share/classes/sun/security/ec/ECDHKeyAgreement.java#l130

http://hg.openjdk.java.net/jdk8u/jdk8u60/jdk/file/935758609767/src/share/classes/sun/security/ec/ECDHKeyAgreement.java#l133

Turakar commented 6 years ago

Unfortunately, I cannot help you, as I am an OpenJDK user. Using a different implementation than sun.security might fix the problem, as you would no longer depend on platform-specific implementation issues. If you still want to track the issue down, I would suggest writing a minimal code example that fails in the same way. As it seems to fail during the TLS handshake, you do not depend on Signal's protocol here, which makes things much easier. Maybe it is enough to just create a TLS connection to Whispersystems' servers in the same way the library does. Once you have this test case, you can reach a broader community, either on Stack Overflow or by filing an issue against Oracle's implementation.
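A minimal sketch of such a test case, using only the JDK's default TLS stack. The class name TlsHandshakeCheck and its helper methods are made up for illustration, and the host is taken from the command line because this thread does not name the server the library connects to. It only performs a plain handshake with default settings, so it may well not reproduce the failure if the cause lies in how the application configures the JVM.

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of signal-bot or signal4j.
class TlsHandshakeCheck {

    // Keeps only the cipher suites that use ephemeral ECDH key exchange,
    // which is the code path that appears in the stack trace.
    static List<String> ecdheSuites(String[] suites) {
        List<String> result = new ArrayList<>();
        for (String suite : suites) {
            if (suite.contains("_ECDHE_")) {
                result.add(suite);
            }
        }
        return result;
    }

    // Opens a TLS connection with the JVM's default settings and returns
    // the negotiated cipher suite. Throws if the handshake fails.
    static String handshake(String host, int port) throws Exception {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        try (SSLSocket socket = (SSLSocket) factory.createSocket(host, port)) {
            socket.startHandshake();
            return socket.getSession().getCipherSuite();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: java TlsHandshakeCheck <host> [port]");
            return;
        }
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 443;
        String[] enabled = SSLContext.getDefault().getDefaultSSLParameters().getCipherSuites();
        System.out.println("Enabled ECDHE suites: " + ecdheSuites(enabled));
        System.out.println("Negotiated suite: " + handshake(args[0], port));
    }
}
```

Running it once with the Oracle JRE and once with OpenJDK against the same host would show whether the bare handshake already differs between the two.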

mruddy commented 6 years ago

Yep, I agree. I think it needs to be reported as an Oracle-specific JRE bug, with a simplified test case to reproduce it. Here's a way to get it working while using the Oracle JRE.

This works: /opt/jdk1.8.0_172/jre/bin/java -cp ./bcprov-jdk15on-160.jar:./build/libs/signal-bot-all.jar de.nerdclubtfg.signalbot.SignalBot

This fails: /opt/jdk1.8.0_172/jre/bin/java -cp ./build/libs/signal-bot-all.jar:./bcprov-jdk15on-160.jar de.nerdclubtfg.signalbot.SignalBot

What I'm doing here is getting the bouncycastle provider classes earlier on the classpath so that they get used with preference over the sun.security.ec provider's classes. That's the only difference. I verified in the debugger which provider classes were being used. No changes to /opt/jdk1.8.0_172/jre/lib/security/java.security were necessary.

$ /opt/jdk1.8.0_172/jre/bin/java -version
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)

For reference, a good breakpoint to set to see which classes get used is at: javax.crypto.KeyAgreement.generateSecret(String). The algorithm argument's value is "TlsPremasterSecret".
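As a rough alternative to the debugger, the registered providers and the provider picked for EC key generation and ECDH key agreement can be printed from code. This is only a sketch: ProviderReport is a made-up class name, and it would have to run inside the same JVM after the application has registered its providers to say anything about the bot itself. Also note that KeyAgreement may still switch to a different provider once it is initialized with a concrete key, so the name printed here is the first choice, not necessarily the one used in the handshake.

```java
import java.security.KeyPairGenerator;
import java.security.Provider;
import java.security.Security;
import javax.crypto.KeyAgreement;

// Hypothetical diagnostic class, not part of signal-bot or signal4j.
class ProviderReport {

    // Name of the provider that serves KeyPairGenerator for the algorithm.
    static String keyPairGeneratorProvider(String algorithm) throws Exception {
        return KeyPairGenerator.getInstance(algorithm).getProvider().getName();
    }

    // Name of the provider that is the first choice for KeyAgreement.
    static String keyAgreementProvider(String algorithm) throws Exception {
        return KeyAgreement.getInstance(algorithm).getProvider().getName();
    }

    public static void main(String[] args) throws Exception {
        Provider[] providers = Security.getProviders();
        for (int i = 0; i < providers.length; i++) {
            // Preference positions are 1-based.
            System.out.println((i + 1) + ": " + providers[i].getName());
        }
        System.out.println("KeyPairGenerator EC -> " + keyPairGeneratorProvider("EC"));
        System.out.println("KeyAgreement ECDH   -> " + keyAgreementProvider("ECDH"));
    }
}
```

If the two lines at the end name different providers, keys from one provider are likely to be handed to the other.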

tcurdt commented 6 years ago

Holy smokes! Kudos for figuring that out @mruddy

mruddy commented 6 years ago

I was working on developing the simple test case to reproduce the bug and I noticed while debugging the buggy process that the classes from the bouncy castle provider are being used within the sun provider. I think that provides a better explanation for the root cause that still fits the research so far.

For example, in the buggy process a sun.security.ec.ECDHKeyAgreement object has a member privateKey that is of type org.bouncycastle.jcajce.provider.asymmetric.ec.BCECPrivateKey.

The BC implementation provides an encoding for the secp256r1 OID that is 227 bytes long (ASN.1 format, I believe) while the sun code says that it's expecting DER format that is either 7 or 10 bytes.

The process where I'm trying to reproduce the bug has the same sun.security.ec.ECDHKeyAgreement with a privateKey member of type sun.security.ec.ECPrivateKeyImpl which returns an encoding of only 10 bytes (I guess the 10 bytes is the OID in DER format).
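The 10-byte figure can be checked without a debugger. The sketch below asks the default "EC" AlgorithmParameters implementation to encode secp256r1 and prints the result; CurveEncodingDemo is a made-up class name, and the sketch assumes a stock JDK where no other provider has been registered ahead of the built-in ones. With Bouncy Castle registered first, the same call would presumably return BC's longer encoding instead, but that is not verified here.

```java
import java.security.AlgorithmParameters;
import java.security.spec.ECGenParameterSpec;

// Hypothetical demo class, not part of signal-bot or signal4j.
class CurveEncodingDemo {

    // Encodes the parameters of a named curve using whichever provider
    // currently serves AlgorithmParameters "EC".
    static byte[] encodedCurveParams(String curveName) throws Exception {
        AlgorithmParameters params = AlgorithmParameters.getInstance("EC");
        params.init(new ECGenParameterSpec(curveName));
        return params.getEncoded();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] encoded = encodedCurveParams("secp256r1");
        // With the JDK's own provider this is expected to be just the
        // DER-encoded OID of the curve (tag 0x06, length 0x08, 8 bytes).
        System.out.println(encoded.length + " bytes: " + hex(encoded));
    }
}
```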

The docs for java.security.AlgorithmParameters.getEncoded seem to indicate that ASN.1 is the correct encoding for that curve. Perhaps that's where the disconnect is. Maybe the sun provider is not following the contract? Eh, seems like it could be classloader hell. I may work more on it later.

Here are a few more links that are useful to have open when searching the native code:
http://hg.openjdk.java.net/jdk8u/jdk8u60/jdk/file/935758609767/src/share/native/sun/security/ec/impl/ecdecode.c
http://hg.openjdk.java.net/jdk8u/jdk8u60/jdk/file/935758609767/src/share/native/sun/security/ec/impl/ecc_impl.h
http://hg.openjdk.java.net/jdk8u/jdk8u60/jdk/file/935758609767/src/share/native/sun/security/ec/impl/ec.h

mruddy commented 6 years ago

I have a better understanding now and it's related to the mixing of provider classes.

If you use a debugger against the Oracle JRE like the following: /opt/jdk1.8.0_172/jre/bin/java -Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=y -jar build/libs/signal-bot-all.jar

Then set a breakpoint in de.thoffbauer.signal4j.SignalService.SignalService(). From there you can step through and change the position at which that code inserts the provider from 1 to the last position (10 in my case).

This is the line that messes everything up: Security.insertProviderAt(new org.bouncycastle.jce.provider.BouncyCastleProvider(), 1);

In the Oracle JRE, that results in sun.security.ec.ECDHKeyAgreement.engineGenerateSecret(java.lang.String) having trouble. Basically, the parameters of the curve that the private key is on get encoded. BC and the sun.security.ec provider encode those curve parameters differently, and the native code rejects the BC encoding (I think BC encodes the whole curve params and sun encodes only the OID). So, when the BC classes are referenced by the sun security ec classes, failure results.

When using OpenJDK, only the bouncy castle 1.55 provider classes get used. Also, when managing the classpath and sticking the BC jar first, only the BC provider classes are used. That explains everything.

Turakar commented 6 years ago

So inserting it at priority 10 fixes the problem for you?

mruddy commented 6 years ago

Yep. It's also easy to try if you want to verify.

Start a process with an Oracle JRE like: /opt/jdk1.8.0_172/jre/bin/java -Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=y -jar build/libs/signal-bot-all.jar

Then attach an Eclipse Remote Java Application debugger with a configuration like: (screenshot: debugger-config)

Set a breakpoint at the line in de.thoffbauer.signal4j.SignalService.SignalService(): Security.insertProviderAt(new org.bouncycastle.jce.provider.BouncyCastleProvider(), 1);

Then step into that method call and change the argument value from 1 to 10 (or whatever is one more than the number of providers that are already in the list -- I have 9, so I set it to 10).

mruddy commented 6 years ago

0 also works as that puts it at the end of the provider list as well. That is what Security.addProvider does.
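The position semantics can be tried out without Bouncy Castle on the classpath. In the sketch below, DemoProvider is an empty placeholder that stands in for BouncyCastleProvider, and ProviderPositionDemo is a made-up class name; the placeholders are removed again at the end so the JVM's provider list is left unchanged.

```java
import java.security.Provider;
import java.security.Security;

// Hypothetical demo class, not part of signal-bot or signal4j.
class ProviderPositionDemo {

    // Empty placeholder provider standing in for BouncyCastleProvider.
    static final class DemoProvider extends Provider {
        private static final long serialVersionUID = 1L;

        @SuppressWarnings("deprecation") // this constructor is deprecated on newer JDKs
        DemoProvider(String name) {
            super(name, 1.0, "placeholder provider for illustration");
        }
    }

    // Returns {providers before, position via addProvider,
    //          position via insertProviderAt(.., 0), position via insertProviderAt(.., 1)}.
    static int[] run() {
        int before = Security.getProviders().length;
        try {
            int viaAdd = Security.addProvider(new DemoProvider("DemoAdd"));
            int viaZero = Security.insertProviderAt(new DemoProvider("DemoZero"), 0);
            int viaOne = Security.insertProviderAt(new DemoProvider("DemoFirst"), 1);
            return new int[] {before, viaAdd, viaZero, viaOne};
        } finally {
            // Clean up so repeated runs start from the same provider list.
            Security.removeProvider("DemoAdd");
            Security.removeProvider("DemoZero");
            Security.removeProvider("DemoFirst");
        }
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("providers before:         " + r[0]);
        System.out.println("addProvider:              position " + r[1]);
        System.out.println("insertProviderAt(.., 0):  position " + r[2]);
        System.out.println("insertProviderAt(.., 1):  position " + r[3]);
    }
}
```

Both addProvider and insertProviderAt with position 0 append to the end of the list, while position 1 puts the provider ahead of all the built-in ones, which is the placement that caused the trouble described above.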

See also: http://www.bouncycastle.org/wiki/display/JA1/Provider+Installation

"It is possible to add the provider higher up in the list. If you do this we recommend you don't add it earlier than position 2 as there are occasionally internal dependencies on the provider at position 1 which may cause some operations by your JVM to result in errors."