opensearch-project / security

🔐 Secure your cluster with TLS, numerous authentication backends, data masking, audit logging as well as role-based access control on indices, documents, and fields
https://opensearch.org/docs/latest/security-plugin/index/
Apache License 2.0

[Bug] SSLHandshakeException: Insufficient buffer remaining for AEAD cipher fragment #3299

Closed peternied closed 7 months ago

peternied commented 1 year ago
Seeing error `javax.net.ssl.SSLHandshakeException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)` during OpenSearch startup
```
Error: 9-04T06:39:28,837][ERROR][o.o.s.s.t.SecuritySSLNettyTransport] [smoketestnode] Exception during establishing a SSL connection: javax.net.ssl.SSLHandshakeException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)
javax.net.ssl.SSLHandshakeException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)
	at sun.security.ssl.Alert.createSSLException(Alert.java:131) ~[?:?]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:360) ~[?:?]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:303) ~[?:?]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:298) ~[?:?]
	at sun.security.ssl.SSLTransport.decode(SSLTransport.java:134) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:681) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:636) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:454) ~[?:?]
	at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:433) ~[?:?]
	at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:637) ~[?:?]
	at io.netty.handler.ssl.JdkSslEngine.unwrap(JdkSslEngine.java:92) ~[netty-handler-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.handler.ssl.JdkAlpnSslEngine.unwrap(JdkAlpnSslEngine.java:163) ~[netty-handler-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:309) ~[netty-handler-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1436) ~[netty-handler-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1329) ~[netty-handler-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1378) ~[netty-handler-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529) ~[netty-codec-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468) ~[netty-codec-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290) ~[netty-codec-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562) [netty-transport-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.97.Final.jar:4.1.97.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.97.Final.jar:4.1.97.Final]
	at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: javax.crypto.BadPaddingException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)
	at sun.security.ssl.SSLCipher$T13GcmReadCipherGenerator$GcmReadCipher.decrypt(SSLCipher.java:1894) ~[?:?]
	at sun.security.ssl.SSLEngineInputRecord.decodeInputRecord(SSLEngineInputRecord.java:240) ~[?:?]
	at sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:197) ~[?:?]
	at sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:160) ~[?:?]
	at sun.security.ssl.SSLTransport.decode(SSLTransport.java:111) ~[?:?]
	... 29 more
```
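For anyone puzzling over the numbers in the message: the `(2)` appears to be the number of bytes left in the encrypted record fragment, and `(16)` the GCM authentication tag length, so the fragment is too short to even contain a tag. The sketch below is a hypothetical paraphrase of that length check, not the JDK's source. The class and method names (`AeadFragmentCheck`, `requireRoomForTag`) are made up for illustration. It only shows why a 2-byte fragment is rejected before any decryption happens.

```java
import java.nio.ByteBuffer;
import javax.crypto.BadPaddingException;

// Hypothetical illustration of the length check implied by the error message.
final class AeadFragmentCheck {
    // AES-GCM authentication tag length in bytes, as reported in the message.
    static final int TAG_SIZE = 16;

    // Rejects any fragment that cannot hold the tag plus at least one byte of payload.
    static void requireRoomForTag(ByteBuffer fragment) throws BadPaddingException {
        if (fragment.remaining() <= TAG_SIZE) {
            throw new BadPaddingException(
                "Insufficient buffer remaining for AEAD cipher fragment ("
                    + fragment.remaining()
                    + "). Needs to be more than tag size ("
                    + TAG_SIZE + ")");
        }
    }

    public static void main(String[] args) {
        try {
            // A 2-byte fragment, as in the log above, fails the check.
            requireRoomForTag(ByteBuffer.allocate(2));
        } catch (BadPaddingException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this reading is right, the exception says the receiving side was handed a truncated or misframed TLS 1.3 record. It does not say that the certificate itself was rejected.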

Expected result

Should not see errors from underlying system configuration


willyborankin commented 1 year ago

Known issue in the JDK: https://bugs.openjdk.org/browse/JDK-8221218. It may have been resolved in JDK 20.

waza-ari commented 1 year ago

I have the same issue using the latest Helm charts and Docker images. Interestingly, it worked for a while; after re-creating the CA and certs, it now fails consistently.

willyborankin commented 1 year ago

Got the same issue. During a cluster migration from 2.8 to 2.9, one of the nodes could not start. The root cause is not clear so far.

stephen-crawford commented 1 year ago

[Triage] Going to leave this untriaged since we don't really know how to move forward yet. We can keep the issue open, though, and add more info if we encounter this again.

stephen-crawford commented 1 year ago

[Triage] Per @willyborankin's suggestion, you can reproduce it by starting a migration and adding a new node during migration with the same certificate. Any fixes for the issues will be accepted. Likely a change around 1.7.6 or jdk20.

willyborankin commented 11 months ago

PR with BC 1.76 was merged in OpenSearch.

LHozzan commented 9 months ago

Hi guys. The problem still persists in v2.11.0. Could you please let us know which version will contain the fix?

Thrallix commented 9 months ago

Also having this issue using the `latest` tag. Note that hostname verification is off: `plugins.security.ssl.transport.enforce_hostname_verification: false`

And I am using proper `plugins.security.nodes_dn` settings.

VovkaSOL commented 8 months ago

The bug is not resolved (15.01.2024). Use TLS 1.2 instead of TLS 1.3, either with the VM argument `-Djdk.tls.client.protocols=TLSv1.2` or, if you configure the Netty SSL handler yourself, with: `SslHandler handler = sslContext.newHandler(socketChannel.alloc()); handler.engine().setEnabledProtocols(new String[] {"TLSv1.2"});`
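The second variant of this workaround can be tried without Netty. Below is a minimal sketch that restricts a plain JDK `SSLEngine` to TLS 1.2, assuming the JVM's default `SSLContext` is good enough for a quick check. A real node would build its context from its own key and trust material. The class and method names (`Tls12OnlyEngine`, `newTls12Engine`) are hypothetical.

```java
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

// Hypothetical helper: creates an SSLEngine that will only negotiate TLS 1.2.
final class Tls12OnlyEngine {
    static SSLEngine newTls12Engine() throws NoSuchAlgorithmException {
        // Assumption: the default context stands in for the node's real context.
        SSLContext context = SSLContext.getDefault();
        SSLEngine engine = context.createSSLEngine();
        // Same call the Netty snippet makes through handler.engine().
        engine.setEnabledProtocols(new String[] {"TLSv1.2"});
        return engine;
    }

    public static void main(String[] args) throws Exception {
        SSLEngine engine = newTls12Engine();
        // Lists the protocols this engine is still allowed to negotiate.
        System.out.println(Arrays.toString(engine.getEnabledProtocols()));
    }
}
```

Setting the protocols on the engine applies to that engine only, whether it acts as client or server. The `jdk.tls.client.protocols` system property only changes the client-side defaults.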

DarshitChanpura commented 7 months ago

Seems like a bug in JDK: https://bugs.openjdk.java.net/browse/JDK-8221218

See this forum post for more details: https://forum.opensearch.org/t/cluster-does-not-initialize-javax-net-ssl-sslhandshakeexception-insufficient-buffer-remaining-for-aead-cipher-fragment/2845/5

stephen-crawford commented 7 months ago

Like others have said this seems to be a known issue with how the JDK handles TLS:

https://bugs.openjdk.org/browse/JDK-8221218

If you look at the comments there, they seem to suggest that fixes have landed, but obviously this is not the case. It is also worth pointing out that neither of the fixes was actually intended to address this specific issue. I am not sure why the issue was closed as resolved when the linked changes were for separate bugs.

Further examples of the issue being known:

Oracle support page (https://support.oracle.com/knowledge/Middleware/2519569_1.html)

> Applies to: Oracle WebLogic Server - Version 12.1.3.0.0 and later

Another project running into this issue:

https://forum.portswigger.net/thread/complete-proxy-failure-due-to-java-tls-bug-1e334581

> Thanks for reporting this. It is a known unresolved bug in OpenJDK

One last attempt to fix this would be looking at increasing the Bouncycastle version:

https://github.com/tkohegyi/mitmJavaProxy/issues/12

> I use JDK15 and later + org.bouncycastle/bcpkix-jdk18on/1.71.1 and I cannot repro it anymore

I will try to do this and see if it is possible but I am not sure about reproducing the issue consistently so it may be challenging to test.

peternied commented 7 months ago

@LHozzan @Thrallix @VovkaSOL We've been having no luck with this issue. One thing I'm trying to understand is how impactful it is for you. From our evidence, it looks like this has only happened during cluster startup. If it's a startup issue, that is unfortunate but limited in overall impact. If, on the other hand, the issue happens intermittently on a running cluster and takes down a node, then we should invest more time. Can you help provide us with details of your reproduction?

reshippie commented 7 months ago

I am seeing this issue consistently after trying to change cert providers. I did a full cluster restart and I'm getting that error on all of my nodes. I don't know if it's relevant but the old certs we were using were RSA, while the new certs are id-ecPublicKey

peternied commented 7 months ago

@reshippie (and anyone else experiencing this issue) could you include the operating system version, JDK version, and OpenSearch distro version, the basic cluster topology (e.g. 3 data nodes, 2 cluster managers), and anything interesting about your security configuration?

If you don't feel comfortable posting that information publicly, feel free to reach out to me first on our Slack instance (I'm Peter Nied) or email pet ern @ am az on .co m (remove the spaces).

reshippie commented 7 months ago

We're running:

- Debian 10.13
- OpenSearch 2.9.0
- Bundled Java 17.0.7
- 6 data nodes, 3 managers, 1 coordinating node (for Dashboards)

I don't think there's anything interesting in our security config

plugins.security.ssl_cert_reload_enabled: true
plugins.security.ssl.transport.enforce_hostname_verification: false
plugins.security.advanced_modules_enabled: true
plugins.security.nodes_dn:
  - 'CN=dashboards-*-mgmt'
  - 'CN=esmaster-*-mgmt'
  - 'CN=elasticsearch-*-mgmt'
  - 'CN=osdata-*-mgmt'
# Transport layer TLS
plugins.security.ssl.transport.enabled: true
plugins.security.ssl.transport.pemkey_filepath: ssl/{{ ansible_hostname }}-mgmt.pk8
plugins.security.ssl.transport.pemcert_filepath: ssl/{{ ansible_hostname }}-mgmt.crt
plugins.security.ssl.transport.pemtrustedcas_filepath: ssl/{{ ansible_hostname }}-mgmt.issuer.crt
plugins.security.ssl.transport.truststore_filepath: cacerts
#
# REST layer TLS
plugins.security.ssl.http.enabled: true
plugins.security.ssl.http.pemkey_filepath: ssl/{{ ansible_hostname }}-mgmt.pk8
plugins.security.ssl.http.pemcert_filepath: ssl/{{ ansible_hostname }}-mgmt.crt
plugins.security.ssl.http.pemtrustedcas_filepath: ssl/{{ ansible_hostname }}-mgmt.issuer.crt
plugins.security.restapi.roles_enabled: ["admin_role", "security_rest_api_access"]
plugins.security.authcz.admin_dn: CN=DOMAIN.org

I tried the solution posted by @VovkaSOL. Adding -Djdk.tls.client.protocols=TLSv1.2 did not make the error go away.

stephen-crawford commented 7 months ago

I looked into updating the bouncycastle version as mentioned above. We would need to follow something similar to when it was moved to https://github.com/opensearch-project/OpenSearch/pull/8247

At the time, @willyborankin only bumped to 15to18 because of the multi-release jars. I don't know if it is feasible to move past that point or if OpenSearch can handle the later version. @willyborankin do you know?

willyborankin commented 7 months ago

> I looked into updating the bouncycastle version as mentioned above. We would need to follow something similar to when it was moved to opensearch-project/OpenSearch#8247
>
> At the time, @willyborankin only bumped to 15to18 because of the multi-release jars. I don't know if it is feasible to move past that point or if OpenSearch can handle the later version. @willyborankin do you know?

@scrawfor99 Not sure about it, we still support JDK 1.8 build AFAIK.

stephen-crawford commented 7 months ago

@willyborankin, I think 18on will still work with 1.8. I saw you made the swap to 15to18 though and not 18on in the linked PR so was not sure whether you knew what was or was not compatible.

stephen-crawford commented 7 months ago

With the updates to Bouncy Castle, I am going to close this issue, as this is the most we can currently do to resolve the exception. Based on some other discussions, the Bouncy Castle update should help resolve the failures.

LHozzan commented 7 months ago

Hi @peternied .

Sorry for the delayed response.

> We've been having no luck with this issue. One thing I'm trying to understand is how impactful it is for you. From our evidence, it looks like this has only happened during cluster startup. If it's a startup issue, that is unfortunate but limited in overall impact. If, on the other hand, the issue happens intermittently on a running cluster and takes down a node, then we should invest more time. Can you help provide us with details of your reproduction?

In our infrastructure this problem occurs randomly on nodes of all roles. If it hits only one coordinator node, the second replica keeps working, but if both replicas are hit, the whole cluster is basically useless, even though the managers and data nodes are working fine. The situation is the same if other roles are affected at the same time or with some delay. We monitor whether the components in front of the OpenSearch cluster can connect to it, but that is inconvenient.

We are currently using the default community Docker image `opensearchproject/opensearch:2.11.1`, but have only been on it for a short time. Our clusters currently run only in AWS and Microsoft, and I can observe the same problem on both providers.

> Basic cluster topology (3 data nodes, 2 cluster managers). Anything interesting about your security configuration.

The problem occurs in both of the setups we use. I mean:

Based on my observations, it seems to occur more often on multi-role nodes, but I do not have any exact data.

@scrawfor99 OK, let's wait for the next release (2.12.x); hopefully the problem will be fixed there. If it persists, I will let you know.

willyborankin commented 7 months ago

Hi @LHozzan, do you use WireGuard/IPSec as an additional encryption mechanism for the communication between nodes? If yes, the problem could be related to the WireGuard/IPSec configuration.

malayh commented 5 months ago

After installation (2 data nodes, 1 manager node) with the demo config, I updated opensearch.yml with the following:

plugins.security.ssl.transport.pemcert_filepath: tls.crt
plugins.security.ssl.transport.pemkey_filepath: tls.key
plugins.security.ssl.transport.pemtrustedcas_filepath: ca.crt
plugins.security.ssl.transport.enforce_hostname_verification: false
plugins.security.ssl.http.enabled: true
plugins.security.ssl.http.pemcert_filepath: tls.crt
plugins.security.ssl.http.pemkey_filepath: tls.key
plugins.security.ssl.http.pemtrustedcas_filepath: ca.crt
plugins.security.allow_unsafe_democertificates: false
plugins.security.allow_default_init_securityindex: true
plugins.security.authcz.admin_dn: ['CN=admin']
plugins.security.audit.type: internal_opensearch
plugins.security.enable_snapshot_restore_privilege: true
plugins.security.check_snapshot_restore_write_privileges: true
plugins.security.restapi.roles_enabled: [all_access, security_rest_api_access]
plugins.security.system_indices.enabled: true
plugins.security.system_indices.indices:
    - .plugins-ml-agent
    - .plugins-ml-config
    - .plugins-ml-connector
    - .plugins-ml-controller
    - .plugins-ml-model-group
    - .plugins-ml-model
    - .plugins-ml-task
    - .plugins-ml-conversation-meta
    - .plugins-ml-conversation-interactions
    - .plugins-ml-memory-meta
    - .plugins-ml-memory-message
    - .plugins-ml-stop-words
    - .opendistro-alerting-config
    - .opendistro-alerting-alert*
    - .opendistro-anomaly-results*
    - .opendistro-anomaly-detector*
    - .opendistro-anomaly-checkpoints
    - .opendistro-anomaly-detection-state
    - .opendistro-reports-*
    - .opensearch-notifications-*
    - .opensearch-notebooks
    - .opensearch-observability
    - .ql-datasources
    - .opendistro-asynchronous-search-response*
    - .replication-metadata-store
    - .opensearch-knn-models
    - .geospatial-ip2geo-data*
    - .plugins-flow-framework-config
    - .plugins-flow-framework-templates
    - .plugins-flow-framework-state
plugins.security.ssl.http.enabled_protocols:
  - "TLSv1.2"
plugins.security.nodes_dn:
  - 'CN=node'

Then I ran

/usr/share/opensearch/plugins/opensearch-security/tools/securityadmin.sh -icl -nhnv \
-cd "/usr/share/opensearch/config/opensearch-security" \
-key "/usr/share/opensearch/config/kirk-key.pem" \
-cert "/usr/share/opensearch/config/kirk.pem" \
-cacert "/usr/share/opensearch/config/root-ca.pem"

After that point, I keep getting errors.

The following makefile generates my keys

keys/root-ca.key:
    mkdir -p keys;
    openssl genrsa -out keys/root-ca.key 2048;
keys/ca.crt: keys/root-ca.key
    openssl req -new -x509 -sha256 -key keys/root-ca.key -out keys/ca.crt -days 730 -subj "/CN=ca.local";

keys/admin.key:
    mkdir -p keys;
    openssl genrsa -out keys/admin-temp.key 2048;
    openssl pkcs8 -inform PEM -outform PEM -in keys/admin-temp.key -topk8 -nocrypt -v1 PBE-SHA1-3DES -out keys/admin.key
    rm keys/admin-temp.key; 
keys/admin.crt: keys/admin.key keys/ca.crt keys/root-ca.key
    openssl req -new -key keys/admin.key -out keys/admin.csr -subj "/CN=admin";
    openssl x509 -req -in keys/admin.csr -CA keys/ca.crt -CAkey keys/root-ca.key -CAcreateserial -sha256 -out keys/admin.crt -days 730;
    rm keys/admin.csr;

keys/tls.key:
    openssl genrsa -out keys/tls-temp.key 2048;
    openssl pkcs8 -inform PEM -outform PEM -in keys/tls-temp.key -topk8 -nocrypt -v1 PBE-SHA1-3DES -out keys/tls.key
    rm keys/tls-temp.key;
keys/tls.crt: keys/tls.key keys/ca.crt keys/root-ca.key
    openssl req -new -key keys/tls.key -out keys/tls.csr -subj "/CN=node";
    openssl x509 -req -in keys/tls.csr -CA keys/ca.crt -CAkey keys/root-ca.key -CAcreateserial -sha256 -out keys/tls.crt -days 730;
    rm keys/tls.csr;
removeoldkeys:
    rm -rf keys;
makekeys: removeoldkeys keys/admin.key keys/admin.crt keys/tls.key keys/tls.crt keys/ca.crt
    @echo "Keys are generated.";

I am stuck here for a while, please help! 🙏

smlx commented 3 months ago

I'm seeing errors like this in master node logs:

[2024-06-05T01:05:39,152][INFO ][o.o.s.a.s.DebugSink      ] [opensearch-cluster-master-2] AUDIT_LOG: {
  "audit_node_id" : "lP5ZYpVDR1O9n8EDWhKe1g",
  "audit_request_layer" : "TRANSPORT",
  "audit_request_exception_stacktrace" : "javax.net.ssl.SSLHandshakeException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)\n\tat java.base/sun.security.ssl.Alert.createSSLException(Alert.java:130)\n\tat java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:378)\n\tat java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:321)\n\tat java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:316)\n\tat java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:134)\n\tat java.base/sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:736)\n\tat java.base/sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:691)\n\tat java.base/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:506)\n\tat java.base/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:482)\n\tat java.base/javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:679)\n\tat io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:310)\n\tat io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1445)\n\tat io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1338)\n\tat io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1387)\n\tat io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530)\n\tat io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469)\n\tat io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)\n\tat io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)\n\tat 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)\n\tat io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)\n\tat io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652)\n\tat io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)\n\tat io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)\n\tat io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)\n\tat java.base/java.lang.Thread.run(Thread.java:1583)\nCaused by: javax.crypto.BadPaddingException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)\n\tat java.base/sun.security.ssl.SSLCipher$T13GcmReadCipherGenerator$GcmReadCipher.decrypt(SSLCipher.java:1864)\n\tat java.base/sun.security.ssl.SSLEngineInputRecord.decodeInputRecord(SSLEngineInputRecord.java:239)\n\tat java.base/sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:196)\n\tat java.base/sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:159)\n\tat java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:111)\n\t... 27 more\n",
  "@timestamp" : "2024-06-05T01:00:55.484+00:00",
  "audit_request_effective_user_is_admin" : false,
  "audit_cluster_name" : "opensearch-cluster",
  "audit_format_version" : 4,
  "audit_node_host_address" : "10.200.2.124",
  "audit_node_name" : "opensearch-cluster-master-2",
  "audit_category" : "SSL_EXCEPTION",
  "audit_request_origin" : "TRANSPORT",
  "audit_node_host_name" : "10.200.2.124"
}

Here's the expanded stack trace:

javax.net.ssl.SSLHandshakeException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)
    at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:130)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:378)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:321)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:316)
    at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:134)
    at java.base/sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:736)
    at java.base/sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:691)
    at java.base/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:506)
    at java.base/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:482)
    at java.base/javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:679)
    at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:310)
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1445)
    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1338)
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1387)
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: javax.crypto.BadPaddingException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)
    at java.base/sun.security.ssl.SSLCipher$T13GcmReadCipherGenerator$GcmReadCipher.decrypt(SSLCipher.java:1864)
    at java.base/sun.security.ssl.SSLEngineInputRecord.decodeInputRecord(SSLEngineInputRecord.java:239)
    at java.base/sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:196)
    at java.base/sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:159)
    at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:111)
    ... 27 more

I'm using container image docker.io/opensearchproject/opensearch:2.14.0@sha256:96af4ace999e20f3f74b1675e501d7dba46f2e7c185cfcffd4626898b00e6743 on linux/arm64.

I don't think this is fixed. Could someone please re-open?

farhadson commented 2 months ago

The same error happened here. What triggered it in my case was using a single cert with SANs for all my cluster nodes. I've used this kind of cert for other services without any problems. I hope you can fix this issue!