the-gabe opened 2 months ago
Additionally, we hacked the library to try to see what was going on in the fingerprint comparison. Logs have been attached here:
https://github.com/the-gabe/elastic-failure/blob/main/appservice-hackedlib.txt (Note: for clarity w.r.t line numbers, this log was run with our actual application, not the code in the git repo)
We modified node_modules/@elastic/transport/lib/connection/UndiciConnection.js
Here is a snippet of how it looked.
if (this[symbols_1.kCaFingerprint] !== null) {
    const caFingerprint = this[symbols_1.kCaFingerprint];
    const connector = (0, undici_1.buildConnector)(((_a = this.tls) !== null && _a !== void 0 ? _a : {}));
    undiciOptions.connect = function (opts, cb) {
        connector(opts, (err, socket) => {
            if (err != null) {
                return cb(err, null);
            }
            if (caFingerprint !== null && isTlsSocket(opts, socket)) {
                const issuerCertificate = (0, BaseConnection_1.getIssuerCertificate)(socket);
                /* istanbul ignore next */
                if (issuerCertificate == null) {
                    socket.destroy();
                    return cb(new Error('Invalid or malformed certificate'), null);
                }
                // Check if fingerprint matches
                /* istanbul ignore else */
                console.log("this is what we provided to the lib " + caFingerprint);
                console.log("This is what was pulled from socket " + issuerCertificate.fingerprint256);
                if (caFingerprint !== issuerCertificate.fingerprint256) {
                    socket.destroy();
                    return cb(new Error("Server certificate CA fingerprint does not match the value configured in caFingerprint"), null);
                }
            }
            return cb(null, socket);
        });
    };
}
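For reference, here is a minimal standalone sketch of how we understand the fingerprint is obtained from the socket (host and port are placeholders, and walking the chain to the root is our assumption of what getIssuerCertificate does); it connects over TLS and prints the fingerprint256 values directly, independent of the client library:

// Hypothetical standalone check: connect to the node over TLS and print the
// SHA-256 fingerprints seen on the socket, walking up the certificate chain the
// way we believe the transport's getIssuerCertificate does. Host, port and the
// chain-walk logic are assumptions for illustration only.
const tls = require('node:tls');

const socket = tls.connect({ host: 'localhost', port: 9200, rejectUnauthorized: false }, () => {
  // The detailed view (true) includes the issuerCertificate chain
  let cert = socket.getPeerCertificate(true);
  console.log('leaf fingerprint256:  ', cert.fingerprint256);
  // Walk up until the certificate is self-signed (its issuer is itself), i.e. the CA
  while (cert.issuerCertificate != null && cert.issuerCertificate !== cert) {
    cert = cert.issuerCertificate;
  }
  console.log('issuer fingerprint256:', cert.fingerprint256);
  socket.end();
});

socket.on('error', (err) => {
  console.error('TLS connection failed:', err);
});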
And for the sake of 100% clarity, we triple-checked that "4F:57:DA:6A:80:46:C5:9F:BD:9E:49:78:BA:26:A2:FC:39:1D:32:B7:63:6C:7D:96:82:6A:1E:C5:BE:24:26:48" is the correct CA fingerprint. We know it is, as we verified it several times with openssl x509 -fingerprint -sha256 -in /etc/elasticsearch/certs/http_ca.crt | grep Fingerprint
and we have other applications using this same fingerprint without any problems.
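For anyone who wants to cross-check the value without openssl, here is a minimal sketch using Node's built-in X509Certificate class; the certificate path is the one from the openssl command above, and nothing else is assumed:

// Hypothetical cross-check of the openssl output: read the same http_ca.crt and
// print its SHA-256 fingerprint in the colon-separated form that caFingerprint expects.
const { readFileSync } = require('node:fs');
const { X509Certificate } = require('node:crypto');

const ca = new X509Certificate(readFileSync('/etc/elasticsearch/certs/http_ca.crt'));
console.log(ca.fingerprint256); // e.g. 4F:57:DA:6A:...:24:26:48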
Just to rule it out: it wouldn't have anything to do with this change, would it?
Hi @JoshMock, I don't think so; the actual fingerprints taken from the socket are coming back undefined.
@JoshMock We confirmed that this is not related and have tested with 8.7.0 of @elastic/transport instead of 8.7.1
Got it, I didn't look at the logs closely enough to see that it was undefined. Definitely not related.
Hi,
I'm also having a similar issue. I'm getting the error Unhandled Rejection at: Promise [object Promise] reason ConnectionError: Invalid or malformed certificate
with a valid caFingerprint that works with the Python client but results in this error with the JS client. I'm using 8.15.0 and Node 22.
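For context, this is roughly how I am configuring the client; the node URL, credentials and fingerprint value below are placeholders rather than my real ones, and the tls option reflects my understanding that the fingerprint check stands in for normal chain verification against the self-signed CA:

// Rough sketch of the client setup (all values are placeholders).
const { Client } = require('@elastic/elasticsearch');

const client = new Client({
  node: 'https://localhost:9200',
  auth: { username: 'elastic', password: 'changeme' },
  // Colon-separated SHA-256 fingerprint of http_ca.crt
  caFingerprint: 'AA:BB:CC:DD:EE:FF:00:11:22:33:44:55:66:77:88:99:AA:BB:CC:DD:EE:FF:00:11:22:33:44:55:66:77:88:99',
  // Assumption: chain verification is disabled because the CA is self-signed and
  // the fingerprint comparison is what authenticates the server.
  tls: { rejectUnauthorized: false },
});

client.info().then(console.log).catch(console.error);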
Hi @JoshMock, have you managed to look into this? This is now impacting our production environments running this application, and it's not a situation we are comfortable with. This is quite literally a mission-critical function of the library (being able to connect to Elasticsearch in an encrypted and authenticated fashion). Is there any progress being made on this bug in a private capacity?
No action has been taken yet, @the-gabe. I'm Elastic's only active maintainer of this project, and I've been either on PTO or occupied with higher priorities for the last few weeks. I will take a look as soon as I have time.
If you need a fix more urgently, pull requests are always welcome. I am typically able to review and merge a PR within a couple of working days if it has tests and all CI checks are passing.
Bug report
We have an application under development that uses self-hosted Elasticsearch with self-signed certificates, with clients connecting over TLS using CA fingerprints. However, we are running into what appears to be some kind of bug in the library, or potentially even in Elasticsearch itself. The issue does not reproduce consistently across several hours of testing.
To reproduce
I have uploaded a repo here, which is a stripped-down proof of concept of the issue based on our application code.
https://github.com/the-gabe/elastic-failure/tree/main
Usage instructions:
curl -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.15.0-linux-x86_64.tar.gz
bsdtar xvf elasticsearch-8.15.0-linux-x86_64.tar.gz
cd elasticsearch-8.15.0
./bin/elasticsearch
Note down the fingerprint and password printed in the terminal.
git clone https://github.com/the-gabe/elastic-failure
cd elastic-failure
Edit packages/indexer/vars.bash so that ELASTIC_QUEUE_PASSWORD, ELASTIC_VECTOR_PASSWORD, ELASTIC_QUEUE_FINGERPRINT and ELASTIC_VECTOR_FINGERPRINT reflect the password and CA fingerprint you noted down.
cd packages/indexer
npm ci --no-scripts
npm run build
bash vars.bash
Observe the output in the terminal: both clients are able to obtain the Elasticsearch version just fine, but then you get a caFingerprint failure. The output has been included in the root of the repo, in this file: https://github.com/the-gabe/elastic-failure/blob/main/logoutput.txt That file was created in the Arch Linux environment described below, with Elasticsearch 8.15.0. On Azure App Service, we were using Elasticsearch 8.14.3-1 on RHEL 9.
I have found that this issue reproduces roughly 30-40% of the time, though that figure is a guess and is not backed by systematic testing. Starting with a fresh elasticsearch-8.15.0 folder seems to help, but that may be coincidence. My (admittedly speculative) suspicion is that it could be a race condition.
Expected behavior
This simply should not happen: a correct caFingerprint should never be rejected.
Node.js version
Node.js v22.6.0 on Arch Linux, v20.11.1 on Azure App Service, v20.16.0 on Debian 12
@elastic/elasticsearch version
8.15.0
Operating system
Arch Linux on WSL2, Debian 11 on Azure App Service, Debian 12
Any other relevant environment information
No response