timja / jenkins-gh-issues-poc-06-18

0 stars 0 forks source link

[JENKINS-56374] JUnitResultArchiver parsing XML in sort()->compare(), killing Jenkins #4298

Open timja opened 5 years ago

timja commented 5 years ago

Jenkins 2.150.1

Jenkins dies with the JVM going into perpetual ergonomics GC. The JVM has a 4Gb heap. The biggest XML file is 180Mb. So that's not a fundamental limitation. Something dumb is going on.

I think it might be repeatedly parsing the same XML file again and again, because when I dump jstack, I've seen a lot of different XML parsers there. This MIGHT be triggered by a @Parameterized junit test with 50,000 cases. In any case, this has made Jenkins totally unusable.

It also dies when we have a single JUnit case with a 1Gb output XML, but we have worked around that. But we cannot reduce the size of our test suite just to keep Jenkins happy.

Reported critical as this has stopped us from using Jenkins. Tempted to report blocker.

Here is some stack.

```
at com.thoughtworks.xstream.mapper.AnnotationMapper.processAnnotations(AnnotationMapper.java:180)
at com.thoughtworks.xstream.mapper.AnnotationMapper.defaultImplementationOf(AnnotationMapper.java:141)
at hudson.util.xstream.MapperDelegate.defaultImplementationOf(MapperDelegate.java:59)
at com.thoughtworks.xstream.mapper.MapperWrapper.defaultImplementationOf(MapperWrapper.java:46)
at hudson.util.RobustReflectionConverter.determineType(RobustReflectionConverter.java:475)
at hudson.util.RobustReflectionConverter.doUnmarshal(RobustReflectionConverter.java:327)
at hudson.util.RobustReflectionConverter.unmarshal(RobustReflectionConverter.java:270)
at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72)
at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65)
at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66)
at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:50)
at com.thoughtworks.xstream.converters.collections.AbstractCollectionConverter.readItem(AbstractCollectionConverter.java:71)
at hudson.util.RobustCollectionConverter.populateCollection(RobustCollectionConverter.java:85)
at com.thoughtworks.xstream.converters.collections.CollectionConverter.unmarshal(CollectionConverter.java:80)
at hudson.util.RobustCollectionConverter.unmarshal(RobustCollectionConverter.java:76)
at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72)
at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65)
at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66)
at hudson.util.RobustReflectionConverter.unmarshalField(RobustReflectionConverter.java:393)
at hudson.util.RobustReflectionConverter.doUnmarshal(RobustReflectionConverter.java:331)
at hudson.util.RobustReflectionConverter.unmarshal(RobustReflectionConverter.java:270)
at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72)
at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65)
at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66)
at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:50)
at com.thoughtworks.xstream.converters.collections.AbstractCollectionConverter.readItem(AbstractCollectionConverter.java:71)
at hudson.util.RobustCollectionConverter.populateCollection(RobustCollectionConverter.java:85)
at com.thoughtworks.xstream.converters.collections.CollectionConverter.unmarshal(CollectionConverter.java:80)
at hudson.util.RobustCollectionConverter.unmarshal(RobustCollectionConverter.java:76)
at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72)
at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65)
at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66)
at hudson.util.RobustReflectionConverter.unmarshalField(RobustReflectionConverter.java:393)
at hudson.util.RobustReflectionConverter.doUnmarshal(RobustReflectionConverter.java:331)
at hudson.util.RobustReflectionConverter.unmarshal(RobustReflectionConverter.java:270)
at com.thoughtworks.xstream.core.TreeUnmarshaller.convert(TreeUnmarshaller.java:72)
at com.thoughtworks.xstream.core.AbstractReferenceUnmarshaller.convert(AbstractReferenceUnmarshaller.java:65)
at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:66)
at com.thoughtworks.xstream.core.TreeUnmarshaller.convertAnother(TreeUnmarshaller.java:50)
at com.thoughtworks.xstream.core.TreeUnmarshaller.start(TreeUnmarshaller.java:134)
at com.thoughtworks.xstream.core.AbstractTreeMarshallingStrategy.unmarshal(AbstractTreeMarshallingStrategy.java:32)
at com.thoughtworks.xstream.XStream.unmarshal(XStream.java:1189)
at hudson.util.XStream2.unmarshal(XStream2.java:161)
at hudson.util.XStream2.unmarshal(XStream2.java:132)
at com.thoughtworks.xstream.XStream.unmarshal(XStream.java:1173)
at com.thoughtworks.xstream.XStream.fromXML(XStream.java:1053)
at hudson.XmlFile.read(XmlFile.java:147)
at hudson.tasks.junit.TestResultAction.load(TestResultAction.java:208)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:145)


Originally reported by shevek, imported from: JUnitResultArchiver parsing XML in sort()->compare(), killing Jenkins
  • assignee: menonvarun
  • status: Open
  • priority: Critical
  • resolution: Unresolved
  • imported: 2022/01/10
timja commented 5 years ago

shevek:

Related stack:

```

Mar 03, 2019 10:00:07 PM hudson.tasks.junit.TestResultAction load
WARNING: Failed to load /home/jenkins/jobs/XXX-pr/builds/6739/junitResult.xml
java.io.IOException: Unable to read /home/jenkins/jobs/XXX-pr/builds/6739/junitResult.xml
at hudson.XmlFile.read(XmlFile.java:149)
at hudson.tasks.junit.TestResultAction.load(TestResultAction.java:208)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:138)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:63)
at hudson.tasks.test.AbstractTestResultAction.findCorrespondingResult(AbstractTestResultAction.java:252)
at hudson.tasks.test.TestResult.getPreviousResult(TestResult.java:146)
at hudson.tasks.junit.SuiteResult.getPreviousResult(SuiteResult.java:416)
at hudson.tasks.junit.CaseResult.getPreviousResult(CaseResult.java:479)
at hudson.tasks.junit.CaseResult.freeze(CaseResult.java:641)
at hudson.tasks.junit.SuiteResult.freeze(SuiteResult.java:462)
at hudson.tasks.junit.TestResult.freeze(TestResult.java:748)
at hudson.tasks.junit.TestResultAction.load(TestResultAction.java:213)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:145)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:63)
at hudson.tasks.test.AbstractTestResultAction.findCorrespondingResult(AbstractTestResultAction.java:252)
at hudson.tasks.test.TestResult.getPreviousResult(TestResult.java:146)
at hudson.tasks.junit.SuiteResult.getPreviousResult(SuiteResult.java:416)
at hudson.tasks.junit.CaseResult.getPreviousResult(CaseResult.java:479)
at hudson.tasks.junit.CaseResult.freeze(CaseResult.java:641)
at hudson.tasks.junit.SuiteResult.freeze(SuiteResult.java:462)
at hudson.tasks.junit.TestResult.freeze(TestResult.java:748)
at hudson.tasks.junit.TestResultAction.load(TestResultAction.java:213)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:138)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:63)
at hudson.tasks.test.AbstractTestResultAction.findCorrespondingResult(AbstractTestResultAction.java:252)
at hudson.tasks.test.TestResult.getPreviousResult(TestResult.java:146)
at hudson.tasks.junit.SuiteResult.getPreviousResult(SuiteResult.java:416)
at hudson.tasks.junit.CaseResult.getPreviousResult(CaseResult.java:479)
at hudson.tasks.junit.CaseResult.freeze(CaseResult.java:641)
at hudson.tasks.junit.SuiteResult.freeze(SuiteResult.java:462)
at hudson.tasks.junit.TestResult.freeze(TestResult.java:748)
at hudson.tasks.junit.TestResultAction.load(TestResultAction.java:213)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:138)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:63)
at hudson.tasks.test.AbstractTestResultAction.findCorrespondingResult(AbstractTestResultAction.java:252)
at hudson.tasks.test.TestResult.getPreviousResult(TestResult.java:146)
at hudson.tasks.junit.SuiteResult.getPreviousResult(SuiteResult.java:416)
at hudson.tasks.junit.CaseResult.getPreviousResult(CaseResult.java:479)
at hudson.tasks.junit.CaseResult.freeze(CaseResult.java:641)
at hudson.tasks.junit.SuiteResult.freeze(SuiteResult.java:462)
at hudson.tasks.junit.TestResult.freeze(TestResult.java:748)
at hudson.tasks.junit.TestResultAction.load(TestResultAction.java:213)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:138)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:63)
at hudson.tasks.test.AbstractTestResultAction.findCorrespondingResult(AbstractTestResultAction.java:252)
at hudson.tasks.test.TestResult.getPreviousResult(TestResult.java:146)
at hudson.tasks.junit.SuiteResult.getPreviousResult(SuiteResult.java:416)
at hudson.tasks.junit.CaseResult.getPreviousResult(CaseResult.java:479)
at hudson.tasks.junit.CaseResult.freeze(CaseResult.java:641)
at hudson.tasks.junit.SuiteResult.freeze(SuiteResult.java:462)
at hudson.tasks.junit.TestResult.freeze(TestResult.java:748)
at hudson.tasks.junit.TestResultAction.load(TestResultAction.java:213)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:138)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:63)
at hudson.tasks.test.AbstractTestResultAction.findCorrespondingResult(AbstractTestResultAction.java:252)
at hudson.tasks.test.TestResult.getPreviousResult(TestResult.java:146)
at hudson.tasks.junit.SuiteResult.getPreviousResult(SuiteResult.java:416)
at hudson.tasks.junit.CaseResult.getPreviousResult(CaseResult.java:479)
at hudson.tasks.junit.CaseResult.freeze(CaseResult.java:641)
at hudson.tasks.junit.SuiteResult.freeze(SuiteResult.java:462)
at hudson.tasks.junit.TestResult.freeze(TestResult.java:748)
at hudson.tasks.junit.TestResultAction.load(TestResultAction.java:213)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:138)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:63)
at hudson.tasks.test.AbstractTestResultAction.findCorrespondingResult(AbstractTestResultAction.java:252)

...

...

...

at hudson.tasks.test.AbstractTestResultAction.findCorrespondingResult(AbstractTestResultAction.java:252)
at hudson.tasks.test.TestResult.getPreviousResult(TestResult.java:146)
at hudson.tasks.junit.SuiteResult.getPreviousResult(SuiteResult.java:416)
at hudson.tasks.junit.CaseResult.getPreviousResult(CaseResult.java:479)
at hudson.tasks.junit.CaseResult.freeze(CaseResult.java:641)
at hudson.tasks.junit.SuiteResult.freeze(SuiteResult.java:462)
at hudson.tasks.junit.TestResult.freeze(TestResult.java:748)
at hudson.tasks.junit.TestResultAction.load(TestResultAction.java:213)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:138)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:63)
at hudson.tasks.test.AbstractTestResultAction.findCorrespondingResult(AbstractTestResultAction.java:252)
at hudson.tasks.test.TestResult.getPreviousResult(TestResult.java:146)
at hudson.tasks.junit.SuiteResult.getPreviousResult(SuiteResult.java:416)
at hudson.tasks.junit.CaseResult.getPreviousResult(CaseResult.java:479)
at hudson.tasks.junit.CaseResult.freeze(CaseResult.java:641)
at hudson.tasks.junit.SuiteResult.freeze(SuiteResult.java:462)
at hudson.tasks.junit.TestResult.freeze(TestResult.java:748)
at hudson.tasks.junit.TestResultAction.load(TestResultAction.java:213)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:145)
at hudson.tasks.junit.TestResultAction.getResult(TestResultAction.java:63)
at hudson.tasks.test.AbstractTestResultAction.findCorrespondingResult(AbstractTestResultAction.java:252)
at hudson.tasks.test.TestResult.getPreviousResult(TestResult.java:146)
at hudson.tasks.junit.SuiteResult.getPreviousResult(SuiteResult.java:416)
at hudson.tasks.junit.CaseResult.getPreviousResult(CaseResult.java:479)
at hudson.tasks.junit.CaseResult.getFailedSince(CaseResult.java:406)
at hudson.tasks.junit.CaseResult.getAge(CaseResult.java:434)
at hudson.tasks.junit.CaseResult$1.compare(CaseResult.java:754)
at hudson.tasks.junit.CaseResult$1.compare(CaseResult.java:752)
at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
at java.util.TimSort.sort(TimSort.java:234)
at java.util.Arrays.sort(Arrays.java:1512)
at java.util.ArrayList.sort(ArrayList.java:1462)
at java.util.Collections.sort(Collections.java:175)
at hudson.tasks.junit.TestResult.freeze(TestResult.java:780)
at hudson.tasks.junit.TestResultAction.setResult(TestResultAction.java:108)
at hudson.tasks.junit.TestResultAction.(TestResultAction.java:85)
at hudson.tasks.junit.JUnitResultArchiver.parseAndAttach(JUnitResultArchiver.java:175)
at hudson.tasks.junit.JUnitResultArchiver.perform(JUnitResultArchiver.java:154)
at hudson.tasks.BuildStepCompatibilityLayer.perform(BuildStepCompatibilityLayer.java:81)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744)
at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:690)
at hudson.model.Build$BuildExecution.post2(Build.java:186)
at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:635)
at hudson.model.Run.execute(Run.java:1835)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
Caused by: java.lang.OutOfMemoryError: Java heap space

```

timja commented 5 years ago

shevek:

Note that this is the ONLY executor running, we don't run anything else in-process, and the XML file it's failing to load is only 100Mb.

 

Wait... are we parsing XML in the compare() routine of a sort-object? What the $BADWORD...? WAT?

timja commented 5 years ago

shevek:

jmap -histo:live output:

 

num #instances #bytes class name
----------------------------------------------
1: 2678703 2691823496 [C
2: 2678264 64278336 java.lang.String
3: 811096 58398912 hudson.tasks.junit.CaseResult
4: 319187 10213984 java.util.HashMap$Node
5: 192771 8994440 [Ljava.lang.String;
6: 42153 7186960 [Ljava.lang.Object;
7: 187092 4490208 com.thoughtworks.xstream.io.path.Path
8: 12333 2778840 [Ljava.util.HashMap$Node;
9: 18705 2075528 java.lang.Class
10: 61023 1952736 java.util.concurrent.ConcurrentHashMap$Node

 

Where did we get 811,096 CaseResults in the heap from? They're all live objects, too. This looks like a serious memory leak.

timja commented 5 years ago

shevek:

After the GC-storm, we get this:

num #instances #bytes class name
----------------------------------------------
1: 208769 27421360 [C
2: 208315 4999560 java.lang.String
3: 81032 2593024 java.util.HashMap$Node
4: 37398 2486496 [Ljava.lang.Object;
5: 9331 2316560 [B
6: 18857 2092720 java.lang.Class
7: 62775 2008800 java.util.concurrent.ConcurrentHashMap$Node
8: 20596 1812448 java.lang.reflect.Method
9: 12253 1242400 [Ljava.util.HashMap$Node;
10: 11491 947256 [Ljava.util.WeakHashMap$Entry;

 

which is a lot saner, and shows that all the data held in memory was related to the memory inefficiency/leak in the reporter, and is not part of normal operation.

timja commented 5 years ago

shevek:

Attached visualvm dump of retained-object sizes showing the suites in memory.

timja commented 4 years ago

nightman68:

I have the same issue with the latest LTE release. My Jetty is forced to restart frequently because of out of heap memory.