Open tcurdt opened 6 years ago
While this reply may not be very satisfying, I get the same failure with the JRE that comes with the Oracle JDK. The JRE that comes with OpenJDK works.
It appears that the problem is occurring within the TLS handshake. It is NOT due to a certificate error. It does NOT appear to be related to the unlimited strength JCE policy files being installed (installing them did not make the Oracle JRE start working).
-Djdk.tls.client.protocols=TLSv1 did NOT seem to help.
-Dhttps.protocols=TLSv1 did NOT seem to help.
-Djdk.tls.ephemeralDHKeySize=legacy did NOT seem to help.
What I can see so far with Wireshark is that the failure happens during the TLS handshake when the client sends a fatal alert just after sending its "Client Key Exchange" handshake protocol message.
Fails:
$ /opt/jdk1.8.0_172/bin/java -version
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)
Works:
$ /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -version
openjdk version "1.8.0_171"
OpenJDK Runtime Environment (build 1.8.0_171-8u171-b11-0ubuntu0.18.04.1-b11)
OpenJDK 64-Bit Server VM (build 25.171-b11, mixed mode)
Commands used for testing:
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Djavax.net.debug=all -jar build/libs/signal-bot-all.jar
/opt/jdk1.8.0_172/bin/java -Djavax.net.debug=all -jar build/libs/signal-bot-all.jar
Thanks for digging into this. The findings are indeed not very satisfying. I remember from searching for this error that it isn't just a problem for this project.
I found one more suggestion:
In Java 8 some old SSL handshakes are missing. Add Bouncy Castle and register it with the Security providers using the following code:
Security.addProvider(new BouncyCastleProvider());
Not sure if that really makes sense though.
I hooked up a remote java debugger against the process when running an Oracle JRE like this:
/opt/jdk1.8.0_171/jre/bin/java -Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=y -jar build/libs/signal-bot-all.jar
The exception logged on the console:
Caused by: java.security.ProviderException: Could not derive key
at sun.security.ec.ECDHKeyAgreement.engineGenerateSecret(ECDHKeyAgreement.java:133)
...
The type of the exception being caught at that line is java.security.InvalidAlgorithmParameterException.
That might narrow down the problem in the Oracle native code to a line like this one from OpenJDK:
http://hg.openjdk.java.net/jdk8u/jdk8u60/jdk/file/935758609767/src/share/native/sun/security/ec/ECC_JNI.cpp#l425
Here's what I think the relevant java source looks like (again, from OpenJDK): http://hg.openjdk.java.net/jdk8u/jdk8u60/jdk/file/935758609767/src/share/classes/sun/security/ec/ECDHKeyAgreement.java#l130
Unfortunately, I cannot help you as an OpenJDK user. Using a different implementation than sun.security might fix the problem, as you would no longer depend on platform-specific implementation issues. If you still want to track the issue down, I would suggest writing a minimal code example that fails in the same way. As it seems to fail at the TLS handshake, you do not depend on Signal's protocol here, which makes things much easier. Maybe it is enough to just create a TLS connection to Whispersystems' servers in the same way the library does. Once you have this test case, you can reach a broader community either on Stack Overflow or by filing an issue against Oracle's implementation.
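A minimal reproduction along those lines might look like the sketch below. It is not taken from the library: the class name TlsHandshakeProbe and both helper methods are made up for illustration, Bouncy Castle is loaded reflectively so the file compiles without the jar, and the sketch uses the JRE's default trust store rather than whatever the library configures. For that reason the target host is left as a command-line argument.

```java
import java.security.Provider;
import java.security.Security;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

public class TlsHandshakeProbe {

    /**
     * Tries to register Bouncy Castle at position 1.
     * Returns false if the Bouncy Castle classes are not on the classpath.
     */
    public static boolean insertBouncyCastleFirst() {
        try {
            Class<?> bc = Class.forName("org.bouncycastle.jce.provider.BouncyCastleProvider");
            Security.insertProviderAt((Provider) bc.getDeclaredConstructor().newInstance(), 1);
            return true;
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }

    /** Opens a TLS connection, forces the handshake, and returns the negotiated cipher suite. */
    public static String handshake(String host, int port) throws Exception {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        try (SSLSocket socket = (SSLSocket) factory.createSocket(host, port)) {
            socket.startHandshake(); // a handshake failure surfaces here as an exception
            return socket.getSession().getCipherSuite();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: java TlsHandshakeProbe <host> [port]");
            return;
        }
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 443;
        System.out.println("BC inserted at position 1: " + insertBouncyCastleFirst());
        System.out.println("negotiated suite: " + handshake(args[0], port));
    }
}
```

Running it with the bot's jar on the classpath (for example -cp build/libs/signal-bot-all.jar:.) under both JREs, with -Djavax.net.debug=all, would show whether the handshake alone is enough to trigger the failure. Whether it actually reproduces the problem has not been verified here.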
Yep, I agree. I think it needs to be reported as an Oracle-specific JRE bug with a simplified test case to reproduce it. Here's a way to get it working while using the Oracle JRE.
This works:
/opt/jdk1.8.0_172/jre/bin/java -cp ./bcprov-jdk15on-160.jar:./build/libs/signal-bot-all.jar de.nerdclubtfg.signalbot.SignalBot
This fails:
/opt/jdk1.8.0_172/jre/bin/java -cp ./build/libs/signal-bot-all.jar:./bcprov-jdk15on-160.jar de.nerdclubtfg.signalbot.SignalBot
What I'm doing here is getting the bouncycastle provider classes earlier on the classpath so that they get used with preference over the sun.security.ec provider's classes. That's the only difference. I verified in the debugger which provider classes were being used. No changes to /opt/jdk1.8.0_172/jre/lib/security/java.security were necessary.
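For anyone without a debugger at hand, a rough way to see which jar a given class was loaded from is sketched below. ClassOrigin and whereLoaded are hypothetical names, not part of the bot or of Bouncy Castle. Note that this only reports the origin of the class, not which provider a particular crypto call ends up using.

```java
import java.security.CodeSource;

public class ClassOrigin {

    /** Returns the location the named class was loaded from, or a short note if unknown. */
    public static String whereLoaded(String className) {
        try {
            Class<?> c = Class.forName(className);
            CodeSource source = c.getProtectionDomain().getCodeSource();
            // Classes that ship with the JRE often report no code source at all.
            if (source == null || source.getLocation() == null) {
                return "(no code source, probably part of the JRE)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not on the classpath)";
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.bouncycastle.jce.provider.BouncyCastleProvider",
            "sun.security.ec.ECDHKeyAgreement"
        };
        for (String name : names) {
            System.out.println(name + " -> " + whereLoaded(name));
        }
    }
}
```

Run with the same -cp ordering as the two commands above, the first line of output should show whether the Bouncy Castle classes come from bcprov-jdk15on-160.jar or from the copy bundled in signal-bot-all.jar.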
$ /opt/jdk1.8.0_172/jre/bin/java -version
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)
For reference, a good breakpoint to set to see which classes get used is at: javax.crypto.KeyAgreement.generateSecret(String). The algorithm argument's value is "TlsPremasterSecret".
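As a lighter-weight complement to the breakpoint, the sketch below prints the installed providers in order and the provider that would be picked by default for an "ECDH" KeyAgreement and an "EC" KeyPairGenerator. Those two algorithm names are an assumption based on the ECDHKeyAgreement frame in the stack trace, and ProviderCheck is a made-up class name. The provider actually used during the handshake can still differ, because the JRE may defer the choice until a key is supplied.

```java
import java.security.KeyPairGenerator;
import java.security.NoSuchAlgorithmException;
import java.security.Provider;
import java.security.Security;
import javax.crypto.KeyAgreement;

public class ProviderCheck {

    /** Name of the provider chosen by default for the given KeyAgreement algorithm. */
    public static String keyAgreementProvider(String algorithm) {
        try {
            return KeyAgreement.getInstance(algorithm).getProvider().getName();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("no provider offers KeyAgreement " + algorithm, e);
        }
    }

    /** Name of the provider chosen by default for the given KeyPairGenerator algorithm. */
    public static String keyPairGeneratorProvider(String algorithm) {
        try {
            return KeyPairGenerator.getInstance(algorithm).getProvider().getName();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("no provider offers KeyPairGenerator " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        Provider[] providers = Security.getProviders();
        for (int i = 0; i < providers.length; i++) {
            System.out.println((i + 1) + ". " + providers[i].getName());
        }
        System.out.println("KeyAgreement ECDH   -> " + keyAgreementProvider("ECDH"));
        System.out.println("KeyPairGenerator EC -> " + keyPairGeneratorProvider("EC"));
    }
}
```

If the two lines name different providers, keys generated by one provider are being handed to the other, which is the kind of mixing worth looking for.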
Holy smokes! Kudos for figuring that out @mruddy
I was working on developing the simple test case to reproduce the bug and I noticed while debugging the buggy process that the classes from the bouncy castle provider are being used within the sun provider. I think that provides a better explanation for the root cause that still fits the research so far.
For example, in the buggy process a sun.security.ec.ECDHKeyAgreement object has a member privateKey that is of type org.bouncycastle.jcajce.provider.asymmetric.ec.BCECPrivateKey.
The BC implementation provides an encoding for the secp256r1 OID that is 227 bytes long (ASN.1 format, I believe) while the sun code says that it's expecting DER format that is either 7 or 10 bytes.
The process where I'm trying to reproduce the bug has the same sun.security.ec.ECDHKeyAgreement with a privateKey member of type sun.security.ec.ECPrivateKeyImpl which returns an encoding of only 10 bytes (I guess the 10 bytes is the OID in DER format).
The docs for java.security.AlgorithmParameters.getEncoded seem to indicate that ASN.1 is the correct encoding for that curve. Perhaps that's where the disconnect is. Maybe the sun provider is not following the contract? Eh, seems like it could be classloader hell. I may work more on it later.
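To look at the encoding difference directly, something like the following prints the encoded curve parameters produced by whichever provider is first in line for "EC" AlgorithmParameters. CurveEncoding and its helpers are hypothetical names. With the stock JRE provider this should show the short OID-only form (the 7 or 10 bytes mentioned above); running it again after registering Bouncy Castle at position 1 would be the way to compare, though that comparison has not been run here.

```java
import java.io.IOException;
import java.security.AlgorithmParameters;
import java.security.GeneralSecurityException;
import java.security.spec.ECGenParameterSpec;

public class CurveEncoding {

    /** Encodes the parameters of a named curve using the default "EC" AlgorithmParameters provider. */
    public static byte[] encodedParams(String curveName) {
        try {
            AlgorithmParameters params = AlgorithmParameters.getInstance("EC");
            params.init(new ECGenParameterSpec(curveName));
            return params.getEncoded();
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException("could not encode curve " + curveName, e);
        }
    }

    /** Formats bytes as lowercase hex for easy comparison between runs. */
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (String curve : new String[] {"secp256r1", "secp384r1"}) {
            byte[] der = encodedParams(curve);
            System.out.println(curve + ": " + der.length + " bytes, " + toHex(der));
        }
    }
}
```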
Here are a few more links that are useful to have open when searching the native code:
http://hg.openjdk.java.net/jdk8u/jdk8u60/jdk/file/935758609767/src/share/native/sun/security/ec/impl/ecdecode.c
http://hg.openjdk.java.net/jdk8u/jdk8u60/jdk/file/935758609767/src/share/native/sun/security/ec/impl/ecc_impl.h
http://hg.openjdk.java.net/jdk8u/jdk8u60/jdk/file/935758609767/src/share/native/sun/security/ec/impl/ec.h
I have a better understanding now and it's related to the mixing of provider classes.
If you use a debugger against the Oracle JRE like the following:
/opt/jdk1.8.0_172/jre/bin/java -Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=y -jar build/libs/signal-bot-all.jar
Then set a breakpoint in de.thoffbauer.signal4j.SignalService.SignalService(). You can step through and change the position at which that code inserts the provider from 1 to the last position (10 in my case).
This is the line that messes everything up:
Security.insertProviderAt(new org.bouncycastle.jce.provider.BouncyCastleProvider(), 1);
In the Oracle JRE, that results in sun.security.ec.ECDHKeyAgreement.engineGenerateSecret(java.lang.String) having trouble. Basically, the parameters of the curve that the private key is on get encoded. There is a difference between how the BC and the sun.security.ec providers encode those curve parameters. The native code rejects the BC encoding (I think BC encodes the whole curve params and sun encodes only the OID). So, when the BC classes are referenced by the sun security ec classes, failure results.
When using OpenJDK, only the bouncy castle 1.55 provider classes get used. Also, when managing the classpath and sticking the BC jar first, only the BC provider classes are used. That explains everything.
So inserting it at priority 10 fixes the problem for you?
Yep. It's easy to try also if you want to verify.
Start a process with an Oracle JRE like:
/opt/jdk1.8.0_172/jre/bin/java -Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=y -jar build/libs/signal-bot-all.jar
Then attach an Eclipse "Remote Java Application" debugger to port 8787 (the address given in the command above).
Set a breakpoint at the line in de.thoffbauer.signal4j.SignalService.SignalService():
Security.insertProviderAt(new org.bouncycastle.jce.provider.BouncyCastleProvider(), 1);
Then step into that method call and change the argument value from 1 to 10 (or whatever is one more than the number of providers that are already in the list -- I have 9, so I set it to 10).
0 also works as that puts it at the end of the provider list as well. That is what Security.addProvider does.
See also: http://www.bouncycastle.org/wiki/display/JA1/Provider+Installation
"It is possible to add the provider higher up in the list. If you do this we recommend you don't add it earlier than position 2 as there are occasionally internal dependencies on the provider at position 1 which may cause some operations by your JVM to result in errors."
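The effect of the position argument can be seen without Bouncy Castle at all. The sketch below uses a do-nothing DemoProvider (a hypothetical stand-in, as are the class and method names) and registers one instance at position 1 and one via Security.addProvider. In the real code the change would presumably amount to replacing the insertProviderAt(..., 1) call with Security.addProvider(new BouncyCastleProvider()), but that is a suggestion, not something the library is known to have adopted.

```java
import java.security.Provider;
import java.security.Security;

public class ProviderOrderDemo {

    /** Hypothetical empty provider, standing in for BouncyCastleProvider so no extra jar is needed. */
    public static final class DemoProvider extends Provider {
        private static final long serialVersionUID = 1L;

        public DemoProvider(String name) {
            super(name, 1.0, "demo provider with no services");
        }
    }

    /** Returns the 1-based position of the named provider, or -1 if it is not installed. */
    public static int positionOf(String name) {
        Provider[] providers = Security.getProviders();
        for (int i = 0; i < providers.length; i++) {
            if (providers[i].getName().equals(name)) {
                return i + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Position 1 puts the provider ahead of every built-in provider.
        Security.insertProviderAt(new DemoProvider("DemoFirst"), 1);
        // addProvider appends, leaving the built-in providers' preference order intact.
        Security.addProvider(new DemoProvider("DemoLast"));

        int total = Security.getProviders().length;
        System.out.println("DemoFirst is at position " + positionOf("DemoFirst") + " of " + total);
        System.out.println("DemoLast is at position " + positionOf("DemoLast") + " of " + total);
    }
}
```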
I am trying to run the bot, but it looks like I am having problems connecting to the push servers: