opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0
9.82k stars 1.83k forks source link

[BUG] Painless - ASM/org.objectweb errors - 2.0.0 JDK17 #2664

Closed dbbaughe closed 2 years ago

dbbaughe commented 2 years ago

Describe the bug I'm getting some issues w/ running tests in Index Management for 2.0. At a certain point while running integration tests the entire cluster crashes, so I tried tracking it down to see if it's a specific test.. but it seems to be flaky and sometimes crashes and sometimes doesn't.

An example of an integ test suite run and failing and running again and passing: https://gist.github.com/dbbaughe/898d02beeb593fcf66ef97b2546a713b

It appears to be related to the org.objectweb.asm/org.ow2.asm dependencies, and I can see we recently committed some dependency upgrades for them that were also backported to the 2.0 branch.

lang-painless dependencies

From googling around.. https://github.com/apache/camel-quarkus/issues/1182 https://github.com/quarkusio/quarkus/issues/9832 https://github.com/CodeIntelligenceTesting/jazzer/issues/55 https://youtrack.jetbrains.com/issue/IDEA-245330

I was wondering if perhaps IM was pulling in the asm dependency somewhere too and it was causing issues.. or I can see that the current lang-painless module still has an asm dependency of 7.2 for the analysis one and 9.2 for everything else which might of been causing it.

But, I tried bumping that 7.2 to 9.2 and I still see the failures when running the Index Management tests. Also tried downgrading 9.2 to 7.2 and still see the failures when running the Index Management tests. And then I tried just running the lang-painless tests to see and I'm getting failures (so unrelated to Index Management) by using ./gradlew :modules:lang-painless:check https://gist.github.com/dbbaughe/a0eb025c7d9c950ea05161a58394b585

It's failing for me on JDK 17. Passing for me on JDK 14.

Had a coworker try it out (just testing OpenSearch 2.0 w/ the lang-painless check) and it's passing for them on JDK 14 and failing on JDK 17 too.

qreshi commented 2 years ago

Can confirm, I'm the coworker

nknize commented 2 years ago

I agree that we should have consistent API versions. Looks like dependabot is not that.. dependa...ble.

However, I'm unable to reproduce this in the 2.0 branch w/ JDK 17. Even after upgrading api 'org.ow2.asm:asm-analysis:9.2' locally.

> ./gradlew :modules:lang-painless:check

=======================================
OpenSearch Build Hamster says Hello!
  Gradle Version        : 7.3.3
  OS Info               : Linux 4.15.0-171-generic (amd64)
  JDK Version           : 17 (Oracle JDK)
  JAVA_HOME             : /usr/lib/jvm/java-17-oracle
  Random Testing Seed   : 904D2056BEC6B699
  In FIPS 140 mode      : false
=======================================
...
BUILD SUCCESSFUL in 3m 1s
204 actionable tasks: 147 executed, 57 up-to-date

It's odd our PR gradle checks wouldn't have caught this as well. Did you clear gradle cache? Kill any stale gradle daemons?

dbbaughe commented 2 years ago

I agree that we should have consistent API versions. Looks like dependabot is not that.. dependa...ble.

However, I'm unable to reproduce this in the 2.0 branch w/ JDK 17. Even after upgrading api 'org.ow2.asm:asm-analysis:9.2' locally.

> ./gradlew :modules:lang-painless:check

=======================================
OpenSearch Build Hamster says Hello!
  Gradle Version        : 7.3.3
  OS Info               : Linux 4.15.0-171-generic (amd64)
  JDK Version           : 17 (Oracle JDK)
  JAVA_HOME             : /usr/lib/jvm/java-17-oracle
  Random Testing Seed   : 904D2056BEC6B699
  In FIPS 140 mode      : false
=======================================
...
BUILD SUCCESSFUL in 3m 1s
204 actionable tasks: 147 executed, 57 up-to-date

It's odd our PR gradle checks wouldn't have caught this as well. Did you clear gradle cache? Kill any stale gradle daemons?

That's interesting, it passed for you on JDK 17 with asm-analysis on 7.2 and on 9.2?

I had Mohammad pull it down on to his dev desktop to test it as a way to get a second data point before nuking my caches as that always causes me other issues getting things set up again sometimes.

I guess I can now try and see what happens now since we have another data point of it passing for you..

nknize commented 2 years ago

with asm-analysis on 7.2 and on 9.2?

Yes, it passes w/ both. Although it feels uncomfortable running 7.2 when the rest are on 9.2. Let's find out if this is a 9.2 runtime bug w/ JDK 17 before making a decision to either rollback or upgrade

dbbaughe commented 2 years ago

@nknize Any chance there's a difference between the 17 JDK from Oracle and 17 JDK from OpenJDK? I can see I was using

JDK Version           : 17 (OpenJDK)
JAVA_HOME             : /local/home/dbbaughe/jdk-17.0.2

And you have

JDK Version           : 17 (Oracle JDK)
JAVA_HOME             : /usr/lib/jvm/java-17-oracle

Mohammad also tested with OpenJDK. In the meantime I am nuking caches etc. and rerunning the test on my end.

dbbaughe commented 2 years ago

Did this..

./gradlew clean && ./gradlew --stop
rm -rf ~/.gradle
rm -rf ~/.m2/repository/org/opensearch

Reran the test and still getting the failure on my end.

dblock commented 2 years ago

@dbbaughe Is the issue in org.objectweb.asm either way? Should we open a bug upstream?

»  java.lang.IllegalArgumentException: Unsupported api 589824
»       at org.objectweb.asm.MethodVisitor.<init>(MethodVisitor.java:89)
»       at org.objectweb.asm.commons.LocalVariablesSorter.<init>(LocalVariablesSorter.java:102)
»       at org.objectweb.asm.commons.GeneratorAdapter.<init>(GeneratorAdapter.java:227)
»       at org.objectweb.asm.commons.GeneratorAdapter.<init>(GeneratorAdapter.java:204)
»       at org.objectweb.asm.commons.GeneratorAdapter.<init>(GeneratorAdapter.java:245)
nknize commented 2 years ago

Any chance there's a difference between the 17 JDK from Oracle and 17 JDK from OpenJDK?

Sure it's a possibility. I'm not in a position to test that for a while (maybe @mch2 can investigate?). If that ends up being the case then it sounds like a JDK 17 issue since you're seeing this failure on both 7.2 and 9.2 but only when using 17?

dbbaughe commented 2 years ago

@dblock I'm not sure if the problem is in the dependency or not, it could just be in OpenSearch itself w/ some other dependency pulling in an older version of this dependency and some shading issue. Or it could be the dependency isn't working as expected on OpenJDK 17 so it is a problem upstream.

@mch2 Can you try out the steps using OpenJDK 17 and see if you can replicate?

@nknize Yes, seeing it for OpenJDK 17 as is, upgrading asm-analysis to 9.2, and downgrading all asm in painless to 7.2 (did not try downgrading all asm dependencies in OpenSearch to 7.2 though).

nknize commented 2 years ago

did not try downgrading all asm dependencies in OpenSearch to 7.2 though

I'd give this a shot. I did it and the tests pass but that's not saying anything since I can't reproduce w/ Oracle JDK.

Mixed dependency versions is always a problem and the only changes since the last release here are the dependency versions and jdk upgrade. If you can downgrade the dependency to 7.2 across the board and it passes for you in jdk 17 then I'd be fine reverting the upgrade for 2.0 and investigating the problem further for a future 2.x release.

There's no urgency for us to upgrade.

mch2 commented 2 years ago

@mch2 Can you try out the steps using OpenJDK 17 and see if you can replicate?

Trying this with 2.0 as is, first run after 13m failed with a timeout but no further output, might be flaky. Trying again with -i.

dbbaughe commented 2 years ago

Checked out the advisory in the thread @mch2 found.

https://alas.aws.amazon.com/announcements/2021-001.html

Tried disabling the hot patch service sudo systemctl stop log4j-cve-2021-44228-hotpatch Was able to get the tests to successfully pass.

I tried upgrading it instead

sudo yum update log4j-cve-2021-44228-hotpatch     
Loaded plugins: priorities
amzn2-amazon                                                                                                                                                                                                                                     
amzn2-core                                                                                                                                                                                                                                       
amzn2-kernel                                                                                                                                                                                                                                     
amzn2-mate                                                                                                                                                                                                                                       
cloud-dev-dsk                                                                                                                                                                                                                                    
dcv                                                                                                                                                                                                                                              
firefox                                                                                                                                                                                                                                          
863 packages excluded due to repository priority protections
No packages marked for update

But it still fails, worried this will be a ticking time bomb for anyone running on Amazon EC2 (or higher abstraction services) then with that fat jar for the Log4j patch being injected.

mch2 commented 2 years ago

But it still fails, worried this will be a ticking time bomb for anyone running on Amazon EC2 (or higher abstraction services) then with that fat jar for the Log4j patch being injected.

CI is still running jdk 14, that is why it hasn't failed there.

For context, issue - https://github.com/elastic/elasticsearch/issues/81915#issuecomment-1001763305.

mch2 commented 2 years ago

I'm able to repro the failure with AL2 + JDK17. I've tried downgrading ASM to 7.2 with no success. Looking at shadowing those dependencies.

dbbaughe commented 2 years ago

But it still fails, worried this will be a ticking time bomb for anyone running on Amazon EC2 (or higher abstraction services) then with that fat jar for the Log4j patch being injected.

CI is still running jdk 14, that is why it hasn't failed there.

For context, issue - elastic/elasticsearch#81915 (comment).

This seems like a bit of a concern and probably just a miss - did we open an issue to make sure we are running CI on the versions that we require our plugins to run on? I believe there has been work across all plugins to make sure their CI is running on the supported JDK versions.. and probably should have the expectation that upstream core is stable on those versions too.

dblock commented 2 years ago

CI for 2.0 should not/is not being run on 14, where do we see that? if we're talking gradle check, that's being fixed now in https://github.com/opensearch-project/OpenSearch/pull/2671

mch2 commented 2 years ago

CI for 2.0 should not/is not being run on 14, where do we see that? if we're talking gradle check, that's being fixed now in #2671

I checked directly on the jenkins job that runs gradle checks. It is not pulling in any script under version control.

I believe what has changed is the bundle build CI, which I would expect is now failing for this reason.

owaiskazi19 commented 2 years ago

CI for 2.0 should not/is not being run on 14, where do we see that? if we're talking gradle check, that's being fixed now in #2671

I checked directly on the jenkins job that runs gradle checks. It is not pulling in any script under version control.

I believe what has changed is the bundle build CI, which I would expect is now failing for this reason.

@mch2 the jenkinsfile will be used by public jenkins later. The current CI is still using JDK-14 and has the old setup as of now. More context here

mch2 commented 2 years ago

CI for 2.0 should not/is not being run on 14, where do we see that? if we're talking gradle check, that's being fixed now in #2671

I checked directly on the jenkins job that runs gradle checks. It is not pulling in any script under version control. I believe what has changed is the bundle build CI, which I would expect is now failing for this reason.

@mch2 the jenkinsfile will be used by public jenkins later. The current CI is still using JDK-14 and has the old setup as of now. More context here

I'm not sure its wise for us to have inconsistent versions between the build & check jobs. I'd suggest we either manually change the check job to also be 17 or revert to 11.

With that said, I am still working on resolving this issue.

owaiskazi19 commented 2 years ago

I'm not sure its wise for us to have inconsistent versions between the build & check jobs. I'd suggest we either manually change the check job to also be 17 or revert to 11.

Makes sense. @peterzhuamazon can we run gradle check on the current CI with JDK-17 once this issue gets fixed?

penghuo commented 2 years ago

+1. Same issue when query with any painless script. Env:

Solution sudo touch /etc/log4j-cve-2021-44228-hotpatch.kill

nknize commented 2 years ago

Solution sudo touch /etc/log4j-cve-2021-44228-hotpatch.kill

If that's the case, then this is the most straightforward solution I've ever seen :laughing:

mch2 commented 2 years ago

sudo touch /etc/log4j-cve-2021-44228-hotpatch.kill

This will disable the log4j patch on those hosts. As a workaround we could use shadow pr here. This works for me with jdk 17 on AL2.

aksingh-es commented 2 years ago

AD plugin integ test are also failing due to this issue as the nodes are not coming up healthy . sudo touch /etc/log4j-cve-2021-44228-hotpatch.kill did work.Adding OS node bootup logs below

[2022-03-31T19:30:45,793][INFO ][o.o.a.t.StopDetectorTransportAction] [integTest-0] models of detector 1Gl14X8BSIHcmFuBSdJK get deleted [2022-03-31T19:30:45,794][INFO ][o.o.a.r.h.IndexAnomalyDetectorJobActionHandler] [integTest-0] AD model deleted successfully for detector 1Gl14X8BSIHcmFuBSdJK [2022-03-31T19:30:45,822][INFO ][o.o.a.m.CheckpointDao ] [integTest-0] 1 checkpoints docs get deleted [2022-03-31T19:30:45,849][WARN ][stderr ] [integTest-0] ANTLR Tool version 4.5.3 used for code generation does not match the current runtime version 4.9.3 [2022-03-31T19:30:45,879][WARN ][stderr ] [integTest-0] ANTLR Tool version 4.5.3 used for code generation does not match the current runtime version 4.9.3 [2022-03-31T19:30:45,943][ERROR][o.o.b.OpenSearchUncaughtExceptionHandler] [integTest-0] fatal error in thread [opensearch[integTest-0][generic][T#3]], exiting java.lang.IllegalAccessError: failed to access class org.objectweb.asm.Edge from class org.objectweb.asm.MethodWriter (org.objectweb.asm.Edge is in unnamed module of loader 'app'; org.objectweb.asm.MethodWriter is in unnamed module of loader java.net.FactoryURLClassLoader @126be319) at org.objectweb.asm.MethodWriter.addSuccessorToCurrentBasicBlock(MethodWriter.java:1782) ~[Log4jHotPatchFat.jar:?]

mch2 commented 2 years ago

An update here.

The root problem is the same conflict of ASM classes with jdk17. I think this is related to jdk17 blocking illegal access to runtime internals with JEP 403. The hotpatch is bringing in its version of these classes that are getting loaded by the system classloader and conflicting with the versions included by the painless module.

I thought maybe there was a difference in the jar contents between what is distributed in the gem, so I compared it to a locally built jar from https://github.com/corretto/hotpatch-for-apache-log4j2. There does look like a difference in the namespace of these dependencies and I'd expect this is related.

This is the hotpatch contents. The namespace is still the same org/objectweb/...

[~]$ unzip /usr/share/log4j-cve-2021-44228-hotpatch/jdk17/Log4jHotPatchFat.jar -d hotpatchjdk17
Archive:  /usr/share/log4j-cve-2021-44228-hotpatch/jdk17/Log4jHotPatchFat.jar
  inflating: hotpatchjdk17/META-INF/MANIFEST.MF
  inflating: hotpatchjdk17/Log4jHotPatch17$1.class
  inflating: hotpatchjdk17/Log4jHotPatch17$MethodInstrumentorClassVisitor.class
  inflating: hotpatchjdk17/Log4jHotPatch17$MethodInstrumentorMethodVisitor.class
  inflating: hotpatchjdk17/Log4jHotPatch17.class
   creating: hotpatchjdk17/org/
   creating: hotpatchjdk17/org/objectweb/
   creating: hotpatchjdk17/org/objectweb/asm/
  inflating: hotpatchjdk17/org/objectweb/asm/AnnotationVisitor.class
  inflating: hotpatchjdk17/org/objectweb/asm/AnnotationWriter.class
  inflating: hotpatchjdk17/org/objectweb/asm/Attribute$Set.class
  inflating: hotpatchjdk17/org/objectweb/asm/Attribute.class
  inflating: hotpatchjdk17/org/objectweb/asm/ByteVector.class
  inflating: hotpatchjdk17/org/objectweb/asm/ClassReader.class
  inflating: hotpatchjdk17/org/objectweb/asm/ClassTooLargeException.class
  inflating: hotpatchjdk17/org/objectweb/asm/ClassVisitor.class
  inflating: hotpatchjdk17/org/objectweb/asm/ClassWriter.class
  inflating: hotpatchjdk17/org/objectweb/asm/ConstantDynamic.class
  inflating: hotpatchjdk17/org/objectweb/asm/Constants.class
  inflating: hotpatchjdk17/org/objectweb/asm/Context.class
  inflating: hotpatchjdk17/org/objectweb/asm/CurrentFrame.class
  inflating: hotpatchjdk17/org/objectweb/asm/Edge.class
  inflating: hotpatchjdk17/org/objectweb/asm/FieldVisitor.class
  inflating: hotpatchjdk17/org/objectweb/asm/FieldWriter.class
  inflating: hotpatchjdk17/org/objectweb/asm/Frame.class
  inflating: hotpatchjdk17/org/objectweb/asm/Handle.class
  inflating: hotpatchjdk17/org/objectweb/asm/Handler.class
  inflating: hotpatchjdk17/org/objectweb/asm/Label.class
  inflating: hotpatchjdk17/org/objectweb/asm/MethodTooLargeException.class
  inflating: hotpatchjdk17/org/objectweb/asm/MethodVisitor.class
  inflating: hotpatchjdk17/org/objectweb/asm/MethodWriter.class
  inflating: hotpatchjdk17/org/objectweb/asm/ModuleVisitor.class
  inflating: hotpatchjdk17/org/objectweb/asm/ModuleWriter.class
  inflating: hotpatchjdk17/org/objectweb/asm/Opcodes.class
  inflating: hotpatchjdk17/org/objectweb/asm/RecordComponentVisitor.class
  inflating: hotpatchjdk17/org/objectweb/asm/RecordComponentWriter.class
  inflating: hotpatchjdk17/org/objectweb/asm/Symbol.class
  inflating: hotpatchjdk17/org/objectweb/asm/SymbolTable$Entry.class
  inflating: hotpatchjdk17/org/objectweb/asm/SymbolTable.class
  inflating: hotpatchjdk17/org/objectweb/asm/Type.class
  inflating: hotpatchjdk17/org/objectweb/asm/TypePath.class
  inflating: hotpatchjdk17/org/objectweb/asm/TypeReference.class
   creating: hotpatchjdk17/org/objectweb/asm/signature/
  inflating: hotpatchjdk17/org/objectweb/asm/signature/SignatureReader.class
  inflating: hotpatchjdk17/org/objectweb/asm/signature/SignatureVisitor.class
  inflating: hotpatchjdk17/org/objectweb/asm/signature/SignatureWriter.class

Whereas cloning https://github.com/corretto/hotpatch-for-apache-log4j2 and unpacking...

(22-03-31 20:27:59) <0> [/local/home/handalm/workplace/hotpatch-for-apache-log4j2/build/libs]
[~]$ unzip Log4jHotPatch.jar -d localbuild-hotpatch
Archive:  Log4jHotPatch.jar
   creating: localbuild-hotpatch/META-INF/
  inflating: localbuild-hotpatch/META-INF/MANIFEST.MF
  inflating: localbuild-hotpatch/Log4jHotPatch$MethodInstrumentorClassVisitor.class
  inflating: localbuild-hotpatch/Log4jHotPatch$1.class
  inflating: localbuild-hotpatch/Log4jHotPatch$MethodInstrumentorMethodVisitor.class
  inflating: localbuild-hotpatch/Log4jHotPatch.class
  inflating: localbuild-hotpatch/Log4jHotPatch17.class
   creating: localbuild-hotpatch/com/
   creating: localbuild-hotpatch/com/amazon/
   creating: localbuild-hotpatch/com/amazon/corretto/
   creating: localbuild-hotpatch/com/amazon/corretto/org/
   creating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/
   creating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/AnnotationVisitor.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/AnnotationWriter.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Attribute$Set.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Attribute.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/ByteVector.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/ClassReader.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/ClassTooLargeException.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/ClassVisitor.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/ClassWriter.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/ConstantDynamic.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Constants.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Context.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/CurrentFrame.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Edge.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/FieldVisitor.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/FieldWriter.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Frame.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Handle.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Handler.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Label.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/MethodTooLargeException.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/MethodVisitor.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/MethodWriter.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/ModuleVisitor.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/ModuleWriter.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Opcodes.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/RecordComponentVisitor.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/RecordComponentWriter.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Symbol.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/SymbolTable$Entry.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/SymbolTable.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Type.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/TypePath.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/TypeReference.class
   creating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/signature/
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/signature/SignatureReader.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/signature/SignatureVisitor.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/signature/SignatureWriter.class

The fix I put in #2681 works because it creates a fat jar that rewrites the path of the objectweb dependencies but I don't think that should be required.

mch2 commented 2 years ago

Ok looks like the jar in the hotpatch is the issue. I'm able to reproduce using the static agent attached to the process with ./gradle run. Not sure why adding the agent to gradle.properties didn't load properly.

Repro steps:

  1. Disable the hotpatch sudo systemctl stop log4j-cve-2021-44228-hotpatch
  2. Start opensearch with ./gradlew run.
  3. Execute a few curl requests with painless, this will pass:
    
    curl -X PUT "localhost:9200/test-index/_doc/1?pretty" -H 'Content-Type: application/json' -d'
    {
    "foo": 1
    }
    '

curl -X GET "localhost:9200/test-index/_search?pretty" -H 'Content-Type: application/json' -d' { "script_fields": { "my_doubled_field": { "script": { "source": "doc[\u0027foo\u0027].value * params[\u0027multiplier\u0027]", "params": { "multiplier": 2 } } } } } '


4. restart opensearch and fetch its pid.
5. Start the hotpatch agent. `java -jar /usr/share/log4j-cve-2021-44228-hotpatch/jdk17/Log4jHotPatchFat.jar <pid>`
6. Execute step 3 again, the request should fail with the expected error log to stdout.
7. Stop the service.
8. Clone https://github.com/corretto/hotpatch-for-apache-log4j2 and execute `./gradlew build` to build the jar.
9. Restart the service, this time with the newly built jar. `java -jar ~/path/to/repo/build/libs/Log4jHotPatch.jar <pid>`
10. Execute step 3 again, the requests should succeed.
mch2 commented 2 years ago

So to move this fwd I see two options. We either 1 - disable the hotpatch on CI & warn users of this issue with jdk17 + the hotpatch or 2 - move fwd with shadowing lang-painless's version of these jars ourselves. While we can't control whats running on user's hosts, I see 2 as a safer option, thoughts?

dblock commented 2 years ago

So to move this fwd I see two options. We either 1 - disable the hotpatch on CI & warn users of this issue with jdk17 + the hotpatch or 2 - move fwd with shadowing lang-painless's version of these jars ourselves. While we can't control whats running on user's hosts, I see 2 as a safer option, thoughts?

First, thanks for the thorough debugging and repro, solid work.

I think we can move forward with (2) now as a workaround. Let's leave the bug open or open a new one to undo the shadowing, and treat it as a bug in the hot patch. The latter shouldn't have this kind of effect on any application IMHO.

mch2 commented 2 years ago

To be clear, this is the commit missing from the hotpatch. This previous fix is also missing.

Am working on an update to our fix. Its also worth calling out this will increase the size of the lang-painless jar from ~700kb to ~1.1MB.

I've also not been able to repro the bug with lang-expression that also includes these dependencies. I think until we can we shouldn't make any change there.

mch2 commented 2 years ago

Closing with fix merged. Please reopen if this reappears.

JoeJoe1989 commented 1 year ago

Hi, my projects use both asm (7.2) and asm-util (7.1). In order to upgrade my project to Open JDK 17 LTS, I need to update asm to 9.1 which supports JDK 17. However, the build failed due to:

"Unsupported api 589824" error for asm (9.1) and asm-util (9.1)

If I downgrade asm-util to 7.1, the build succeeded. asm (9.1) and asm-util (7.1)

Not sure if you have any idea. Thanks.