Open asfimport opened 10 years ago
Bob Atkey (migrated from Bugzilla): I wrote:
We are also planning to use this finding as an example of Android-related concurrency mistakes in an article about ThreadSafe.
Obviously, "Android-related" was a mistake here. I meant to write "Java-related". Sorry about that.
@pmouawad (migrated from Bugzilla): Thanks for report, I was wondering why your tool does not also report line 456: return new PropertyIteratorImpl(propMap.values());
Did you report all findings of your tool or only some examples ?
Bob Atkey (migrated from Bugzilla):
Thanks for report, I was wondering why your tool does not also report line 456: return new PropertyIteratorImpl(propMap.values());
Actually, our tool doesn't report the instance on line 456 because it doesn't look into the implementation of PropertyIteratorImpl while analysing the AbstractTestElement class. Therefore, it doesn't realise that everything that is passed in will have .iterator() called on it. This is certainly something that we will look at adding to ThreadSafe in the future. Thanks!
Did you report all findings of your tool or only some examples ?
I only reported two examples of findings that I thought looked interesting. If I get more time soon I will report the rest of the interesting looking reports that ThreadSafe gives.
@ham1 (migrated from Bugzilla): Created PR for the example identified in this issue: https://github.com/apache/jmeter/pull/260
Are there any more that the tool found (in the latest code) which should be addressed?
@vlsi (migrated from Bugzilla): As far as I can see, the syncrhonizedMap was added in attempt to fix https://github.com/apache/jmeter/issues/1023 The latest reply suggests that the synchronization did not help.
So I do not see much reason to complicate code. I don't think it would hurt, yet I see no reason how it would help.
@pmouawad (migrated from Bugzilla): (In reply to Vladimir Sitnikov from comment 5)
As far as I can see, the syncrhonizedMap was added in attempt to fix https://github.com/apache/jmeter/issues/1023 The latest reply suggests that the synchronization did not help.
So I do not see much reason to complicate code. I don't think it would hurt, yet I see no reason how it would help.
Thanks Felix for digging, interesting indeed.
A question for you, I have a doubt. In my understanding, calling synchronized in monothreaded call has a cost: http://stackoverflow.com/questions/973518/are-synchronized-methods-slower-in-single-threaded-applications
Do you confirm this from benchmarks you did ?
@ham1 (migrated from Bugzilla): (In reply to Vladimir Sitnikov from comment 5)
As far as I can see, the syncrhonizedMap was added in attempt to fix https://github.com/apache/jmeter/issues/1023 The latest reply suggests that the synchronization did not help.
So I do not see much reason to complicate code. I don't think it would hurt, yet I see no reason how it would help.
It does say the problem still exists in the comment yet the issue was set to resolved.
Maybe we should remove the synchronizedMap as we are accessing it in an unsafe way anyway - or just comment the code and close this bz.
TaoLu (migrated from Bugzilla): public void recoverRunningVersion() { Iterator iter = this.propMap.entrySet().iterator();
while(iter.hasNext()) {
Entry<String, JMeterProperty> entry = (Entry)iter.next();
JMeterProperty prop = (JMeterProperty)entry.getValue();
if (this.isTemporary(prop)) {
iter.remove();
this.clearTemporary(prop);
} else {
prop.recoverRunningVersion(this);
}
}
this.emptyTemporary();
}
Created attachment nextNull1.jpg: Jmeter Test Error When Itr next and remove
TaoLu (migrated from Bugzilla): For the 10 slave Linux machines used in the test environment, each 10 thread is 100 concurrent, and most of them report an error.ConcurrentModificationException, The sampler is a downloaded soap sampler. Most of the machines report this error. The script includes two CSV config and some listeners. I don't know whether the component is used incorrectly or where there is a problem. I haven't found the problem for a day, Some people say that problem is 'recycle on EOF' set by CSV config is false. Is it the problem caused by soap sampler, but there is no error when running locally in windows os
@vlsi (migrated from Bugzilla): @TaoLu, could you please share the script that reproduces the issue?
@FSchumacher (migrated from Bugzilla): I think we have to use a lock around the clean-up of the temporary properties like Graham did.
With the attached patch I can reproduce an overlap in the recoverRunningVersion method between more than one thread.
The exceptions (with are for example: 2022-03-08 21:23:14,089 WARN o.a.j.t.AbstractTestElement: Recover in different threads: [Thread[Thread-33,6,main], Thread[Thread-19,6,main]]; Element: org.apache.jmeter.sampler.DebugSampler@7ea357f0 java.lang.RuntimeException: Thread mismatch
-------at org.apache.jmeter.testelement.AbstractTestElement.recoverRunningVersion(AbstractTestElement.java:511) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.SamplePackage.recoverRunningVersion(SamplePackage.java:126) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.TestCompiler.done(TestCompiler.java:123) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.JMeterThread.executeSamplePackage(JMeterThread.java:593) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.JMeterThread.processSampler(JMeterThread.java:501) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.JMeterThread.run(JMeterThread.java:268) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_201] 2022-03-08 21:23:14,579 WARN o.a.j.t.AbstractTestElement: Recover in different threads: [Thread[Thread-34,6,main], Thread[Thread-20,6,main]]; Element: org.apache.jmeter.sampler.DebugSampler@521c345a java.lang.RuntimeException: Thread mismatch -------at org.apache.jmeter.testelement.AbstractTestElement.recoverRunningVersion(AbstractTestElement.java:511) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.SamplePackage.recoverRunningVersion(SamplePackage.java:126) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.TestCompiler.done(TestCompiler.java:123) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.JMeterThread.executeSamplePackage(JMeterThread.java:593) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.JMeterThread.processSampler(JMeterThread.java:501) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.JMeterThread.run(JMeterThread.java:268) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_201] 2022-03-08 21:23:14,590 WARN o.a.j.t.AbstractTestElement: Recover in different threads: [Thread[Thread-35,6,main], Thread[Thread-30,6,main], Thread[Thread-21,6,main], Thread[Thread-16,6,main]]; Element: org.apache.jmeter.control.LoopController@4b4d66fb java.lang.RuntimeException: Thread mismatch -------at org.apache.jmeter.testelement.AbstractTestElement.recoverRunningVersion(AbstractTestElement.java:511) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.control.LoopController.reInitialize(LoopController.java:203) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.control.LoopController.nextIsNull(LoopController.java:157) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.control.GenericController.next(GenericController.java:170) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.control.LoopController.next(LoopController.java:134) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.AbstractThreadGroup.next(AbstractThreadGroup.java:99) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.JMeterThread.run(JMeterThread.java:303) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_201]
Created attachment slow-recoverRunningVersion.diff: Slow down recoverRunningVersion to show problems
@FSchumacher (migrated from Bugzilla): Forget my last comment for now, as I forgot to remove the thread from the stack.
@FSchumacher (migrated from Bugzilla): Now a working version of the "slow" recoverRunningVersion method.
When the threads are removed at the end (in my simple test plan) one Test Element is getting flagged, only.
2022-03-08 21:55:40,771 WARN o.a.j.t.AbstractTestElement: Recover in different threads: [Thread[Thread-53,6,main], Thread[Thread-54,6,main]]; Element: org.apache.jmeter.reporters.ResultCollector@42f6fa63 java.lang.RuntimeException: Thread mismatch
-------at org.apache.jmeter.testelement.AbstractTestElement.recoverRunningVersion(AbstractTestElement.java:513) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.SamplePackage.recoverRunningVersion(SamplePackage.java:110) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.SamplePackage.recoverRunningVersion(SamplePackage.java:120) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.TestCompiler.done(TestCompiler.java:123) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.JMeterThread.executeSamplePackage(JMeterThread.java:593) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.JMeterThread.processSampler(JMeterThread.java:501) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at org.apache.jmeter.threads.JMeterThread.run(JMeterThread.java:268) ~[ApacheJMeter_core.jar:5.5-SNAPSHOT] -------at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_201]
The code for ResultCollector explicitly mentions, that it has to be thread-safe (which recoverRunningVersion is not).
So maybe, we can make it for ResultCollector thread-safe instead of all? But on the other hand, if we obtain a lock for a single thread, it is not that expensive any more. So I think we could try to add it.
Created attachment slow-recoverRunningVersion.diff: Slow down recoverRunningVersion to show problems
@FSchumacher (migrated from Bugzilla): Created attachment slow-recoverRunningVersion.diff: sl
@FSchumacher (migrated from Bugzilla): Adding more debug output (size of the temporary properties), it does look like ResultCollector has no temporary properties and should not be the culprit.
So, the question is, which Element is it and where does the concurrent access comes from?
@TaoLu, are you using a third party plugin? What elements do you use apart from CSV Data Source? How big are your CSV sources, how many elements do they have?
Created attachment slow-recoverRunningVersion.diff: Slow down recoverRunningVersion to show problems
@FSchumacher (migrated from Bugzilla): For comparison I did a simple test plan (dummy sampler + csv data source) with 2000 threads. First time with additional locking and second time without. The numbers are not that different:
(with locking) $ ./bin/jmeter -t LAST -n Creating summariser <summary> Created the tree successfully using /home/felix/jmeter-tests/csv-thread-mismatch.jmx Starting standalone test @ Tue Mar 08 22:29:44 CET 2022 (1646774984565) Waiting for possible Shutdown/StopTestNow/HeapDump/ThreadDump message on port 4445 summary + 889238 in 00:00:15 = 58664.6/s Avg: 0 Min: 0 Max: 26 Err: 0 (0.00%) Active: 2000 Started: 2000 Finished: 0 summary + 2046430 in 00:00:30 = 68214.3/s Avg: 0 Min: 0 Max: 21 Err: 0 (0.00%) Active: 2000 Started: 2000 Finished: 0 summary = 2935668 in 00:00:45 = 65008.8/s Avg: 0 Min: 0 Max: 26 Err: 0 (0.00%) summary + 2015928 in 00:00:30 = 67197.6/s Avg: 0 Min: 0 Max: 20 Err: 0 (0.00%) Active: 2000 Started: 2000 Finished: 0 summary = 4951596 in 00:01:15 = 65882.5/s Avg: 0 Min: 0 Max: 26 Err: 0 (0.00%) summary + 2011866 in 00:00:30 = 67062.2/s Avg: 0 Min: 0 Max: 18 Err: 0 (0.00%) Active: 2000 Started: 2000 Finished: 0 summary = 6963462 in 00:01:45 = 66219.0/s Avg: 0 Min: 0 Max: 26 Err: 0 (0.00%) summary + 1128224 in 00:00:17 = 65874.0/s Avg: 0 Min: 0 Max: 27 Err: 0 (0.00%) Active: 0 Started: 2000 Finished: 2000 summary = 8091686 in 00:02:02 = 66170.7/s Avg: 0 Min: 0 Max: 27 Err: 0 (0.00%) Tidying up ... @ Tue Mar 08 22:31:47 CET 2022 (1646775107127) ... end of run
(without locking): $ ./bin/jmeter -t LAST -n Creating summariser <summary> Created the tree successfully using /home/felix/jmeter-tests/csv-thread-mismatch.jmx Starting standalone test @ Tue Mar 08 22:32:12 CET 2022 (1646775132163) Waiting for possible Shutdown/StopTestNow/HeapDump/ThreadDump message on port 4445 summary + 1197546 in 00:00:18 = 68232.4/s Avg: 0 Min: 0 Max: 27 Err: 0 (0.00%) Active: 2000 Started: 2000 Finished: 0 summary + 2029381 in 00:00:30 = 67670.8/s Avg: 0 Min: 0 Max: 26 Err: 0 (0.00%) Active: 2000 Started: 2000 Finished: 0 summary = 3226927 in 00:00:48 = 67878.1/s Avg: 0 Min: 0 Max: 27 Err: 0 (0.00%) summary + 1888713 in 00:00:30 = 62957.1/s Avg: 0 Min: 0 Max: 23 Err: 0 (0.00%) Active: 2000 Started: 2000 Finished: 0 summary = 5115640 in 00:01:18 = 65974.2/s Avg: 0 Min: 0 Max: 27 Err: 0 (0.00%) summary + 1769647 in 00:00:30 = 58988.2/s Avg: 0 Min: 0 Max: 28 Err: 0 (0.00%) Active: 2000 Started: 2000 Finished: 0 summary = 6885287 in 00:01:48 = 64025.4/s Avg: 0 Min: 0 Max: 28 Err: 0 (0.00%) summary + 977769 in 00:00:14 = 67736.0/s Avg: 0 Min: 0 Max: 16 Err: 0 (0.00%) Active: 0 Started: 2000 Finished: 2000 summary = 7863056 in 00:02:02 = 64464.5/s Avg: 0 Min: 0 Max: 28 Err: 0 (0.00%) Tidying up ... @ Tue Mar 08 22:34:14 CET 2022 (1646775254435) ... end of run
Bob Atkey (Bug 55827): We ran our static analysis tool ThreadSafe [1] on version 2.10 of JMeter, which appeared to uncover a couple of concurrency issues. One of the most interesting was the possibility of an unsynchronised iteration over a synchronised collection in the class 'AbstractTestElement'.
The relevant line is 499: Iterator<Map.Entry<String, JMeterProperty>> iter = propMap.entrySet().iterator();
where the propMap field always contains a synchronised collection created at line 57 in the same file.
The JDK documentation for Collections.synchronizedMap states that:
It appears that the code on line 499 does not correctly synchronize on propMap, leading to the possibility of the non-deterministic behaviour the JDK documentation warns about.
We're not sure that this can actually result in a user-visible bug, but we thought you'd like to know.
We are also planning to use this finding as an example of Android-related concurrency mistakes in an article about ThreadSafe. Obviously, if you, as the developers of JMeter, have any objections to our using this as an example, then we won't.
[1] ThreadSafe is a static analysis tool for Java concurrency, developed by Contemplate Ltd.: http://www.contemplateltd.com/
Severity: major OS: Linux