crossminer / scava

https://eclipse.org/scava/
Eclipse Public License 2.0

metrics produce out of memory errors #446

Closed blueoly closed 4 years ago

blueoly commented 4 years ago

I ran some tests on the memory issues that prevented the completion of the tasks for the Eclipse/Castalia use case. I reached the following conclusions:

The following metrics did not present any problems:

"org.eclipse.scava.metricprovider.historic.configuration.docker.smells",
"org.eclipse.scava.metricprovider.historic.configuration.docker.dependencies",
"org.eclipse.scava.metricprovider.historic.configuration.puppet.dependencies",
"org.eclipse.scava.metricprovider.historic.configuration.puppet.implementationsmells",
"org.eclipse.scava.metricprovider.historic.bugs.bugs.BugsHistoricMetricProvider",
"org.eclipse.scava.metricprovider.historic.bugs.comments.CommentsHistoricMetricProvider",
"org.eclipse.scava.metricprovider.historic.bugs.emotions",
"org.eclipse.scava.metricprovider.historic.bugs.emotions.EmotionsHistoricMetricProvider",
"org.eclipse.scava.metricprovider.historic.bugs.newbugs.NewBugsHistoricMetricProvider",
"org.eclipse.scava.metricprovider.historic.bugs.patches.PatchesHistoricMetricProvider",
"org.eclipse.scava.metricprovider.historic.bugs.sentiment.SentimentHistoricMetricProvider",
"org.eclipse.scava.metricprovider.historic.bugs.severity.SeverityHistoricMetricProvider",
"org.eclipse.scava.metricprovider.historic.bugs.severitybugstatus.SeverityBugStatusHistoricMetricProvider",
"org.eclipse.scava.metricprovider.historic.bugs.severityresponsetime.SeverityResponseTimeHistoricMetricProvider",
"org.eclipse.scava.metricprovider.historic.bugs.severitysentiment.SeveritySentimentHistoricMetricProvider",
"org.eclipse.scava.metricprovider.historic.bugs.status.StatusHistoricMetricProvider",
"org.eclipse.scava.metricprovider.historic.bugs.topics.TopicsHistoricMetricProvider",
"org.eclipse.scava.metricprovider.historic.bugs.users.UsersHistoricMetricProvider",
"org.eclipse.scava.metricprovider.historic.bugs.newusers.NewUsersHistoricMetricProvider",
"org.eclipse.scava.factoid.bugs.channelusage",
"org.eclipse.scava.factoid.bugs.responsetime",
"org.eclipse.scava.factoid.bugs.sentiment",
"org.eclipse.scava.factoid.bugs.emotion",
"org.eclipse.scava.factoid.bugs.severity",
"org.eclipse.scava.metricprovider.historic.documentation.sentiment.DocumentationSentimentHistoricMetricProvider",
"org.eclipse.scava.metricprovider.historic.newsgroups.emotions.EmotionsHistoricMetricProvider",
"org.eclipse.scava.metricprovider.historic.newsgroups.severitysentiment.SeveritySentimentHistoricMetricProvider"

But the following produced out of memory errors and crashed the metric platform container:

"trans.rascal.api.numberOfChanges",
"trans.rascal.api.numberOfBreakingChanges",
"trans.rascal.api.numberOfBreakingChanges.historic",

"rascal.generic.churn.commitsToday.historic",
"rascal.generic.churn.churnToday.historic",
"trans.rascal.OO.java.MHF-Java.historic",
"trans.rascal.OO.java.AIF-Java-Quartiles.historic",
"trans.rascal.OO.java.DIT-Java-Quartiles.historic",
"trans.rascal.OO.java.LCC-Java-Quartiles.historic",
"trans.rascal.OO.java.PF-Java.historic"

"trans.rascal.dependency.osgi.unusedOSGiImportedPackages",
"trans.rascal.dependency.osgi.allOSGiBundleDependencies",
"trans.rascal.dependency.osgi.allOSGiDynamicImportedPackages",
"trans.rascal.dependency.osgi.allOSGiPackageDependencies",
"trans.rascal.dependency.osgi.numberOSGiBundleDependencies",
"trans.rascal.dependency.osgi.numberOSGiBundleDependencies.historic",
"trans.rascal.dependency.osgi.numberOSGiPackageDependencies",
"trans.rascal.dependency.maven.numberMavenDependencies",
"trans.rascal.dependency.maven.numberMavenDependencies.historic",
"trans.rascal.dependency.maven.numberUniqueMavenDependencies",
"trans.rascal.dependency.maven.allMavenDependencies",
"trans.rascal.dependency.maven.allOptionalMavenDependencies",
"trans.rascal.dependency.maven.ratioOptionalMavenDependencies",

Apart from the first three metrics, which I tested separately, I tested the rest in two groups, so I do not know whether each individual metric has a problem.

Also, I am not claiming that there is certainly a memory leak in them. These metrics may simply have higher memory needs, and if we increase the RAM allocated to the metric platform container they may work fine. In the respective Dockerfile, the ENTRYPOINT statement calls the eclipse executable, and I do not know how to increase its allocated memory. I tried a couple of things but nothing worked.

I am assigning @tdegueul, since all of the above are Rascal metrics, and @MarcioMateus, because he created the Dockerfile and may have an idea about how to increase the available memory.

mhow2 commented 4 years ago

@blueoly: the memory parameter seems to be in /ossmeter/eclipse.ini. 8GB already looks like a lot of memory to me, but let's see what the others have to say.
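
For reference, the memory setting in an eclipse.ini usually looks like the following (a sketch assuming the standard Eclipse launcher layout; the exact values in /ossmeter/eclipse.ini may differ):

    -vmargs
    -Xms512m
    -Xmx8g

Everything after -vmargs is passed straight to the JVM, so raising the -Xmx value raises the maximum heap.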

creat89 commented 4 years ago

Yesterday I tested XWIKI with the Maracas number-of-changes and migration-issues metrics. I gave the platform 16g, but after 20 days of analysis the platform had used almost all of it. I will post some extra information I collected once I am in the office.

creat89 commented 4 years ago

I don't know if this will help, but I managed to get this from JavaMC:

"Thread-3" - Thread t@29
   java.lang.Thread.State: TIMED_WAITING
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <794b8f37> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    at java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1475)
    at org.eclipse.scava.platform.osgi.analysis.ProjectAnalyser.executeAnalyse(ProjectAnalyser.java:136)
    - locked <1b89f771> (a org.eclipse.scava.platform.osgi.analysis.ProjectAnalyser)
    at org.eclipse.scava.platform.osgi.services.TaskExecutor.run(TaskExecutor.java:26)
    at java.lang.Thread.run(Thread.java:748)
   Locked ownable synchronizers:
    - None
"pool-25-thread-1" - Thread t@205
   java.lang.Thread.State: RUNNABLE
    at io.usethesource.vallang.util.WeakWriteLockingHashConsingMap$LookupWrapper.equals(WeakWriteLockingHashConsingMap.java:96)
    at java.util.HashMap$TreeNode.find(HashMap.java:1864)
    at java.util.HashMap$TreeNode.find(HashMap.java:1874)
    at java.util.HashMap$TreeNode.getTreeNode(HashMap.java:1886)
    at java.util.HashMap.getNode(HashMap.java:576)
    at java.util.HashMap.get(HashMap.java:557)
    at io.usethesource.vallang.util.WeakWriteLockingHashConsingMap.get(WeakWriteLockingHashConsingMap.java:123)
    at io.usethesource.vallang.type.TypeFactory.getFromCache(TypeFactory.java:119)
    at io.usethesource.vallang.type.TypeFactory.getOrCreateTuple(TypeFactory.java:207)
    at io.usethesource.vallang.type.TypeFactory.tupleType(TypeFactory.java:317)
    at io.usethesource.vallang.impl.persistent.Tuple.getType(Tuple.java:49)
    at io.usethesource.vallang.impl.persistent.SetWriter.put(SetWriter.java:168)
    at io.usethesource.vallang.impl.persistent.SetWriter$$Lambda$38/150040284.accept(Unknown Source)
    at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
    at io.usethesource.vallang.impl.persistent.SetWriter.insert(SetWriter.java:198)
    at io.usethesource.vallang.io.binary.message.IValueReader.readSet(IValueReader.java:545)
    at io.usethesource.vallang.io.binary.message.IValueReader.readValue(IValueReader.java:461)
    at io.usethesource.vallang.io.binary.message.IValueReader.readNamedValues(IValueReader.java:746)
    at io.usethesource.vallang.io.binary.message.IValueReader.readConstructor(IValueReader.java:699)
    at io.usethesource.vallang.io.binary.message.IValueReader.readValue(IValueReader.java:452)
    at io.usethesource.vallang.io.binary.message.IValueReader.readValue(IValueReader.java:72)
    at io.usethesource.vallang.io.binary.stream.IValueInputStream.read(IValueInputStream.java:90)
    at org.rascalmpl.library.Prelude.readBinaryValueFile(Prelude.java:3507)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.rascalmpl.interpreter.result.JavaMethod.invoke(JavaMethod.java:230)
    at org.rascalmpl.interpreter.result.JavaMethod.call(JavaMethod.java:162)
    at org.rascalmpl.interpreter.result.OverloadedFunction.callWith(OverloadedFunction.java:416)
    at org.rascalmpl.interpreter.result.OverloadedFunction.call(OverloadedFunction.java:394)
    at org.rascalmpl.semantics.dynamic.Expression$CallOrTree.interpret(Expression.java:531)
    at org.rascalmpl.semantics.dynamic.Statement$Expression.interpret(Statement.java:365)
    at org.rascalmpl.semantics.dynamic.Statement$Return.interpret(Statement.java:783)
    at org.rascalmpl.semantics.dynamic.Statement$IfThenElse.interpret(Statement.java:679)
    at org.rascalmpl.semantics.dynamic.Statement$NonEmptyBlock.interpret(Statement.java:759)
    at org.rascalmpl.semantics.dynamic.Statement$IfThenElse.interpret(Statement.java:679)
    at org.rascalmpl.interpreter.result.RascalFunction.runBody(RascalFunction.java:383)
    at org.rascalmpl.interpreter.result.RascalFunction.call(RascalFunction.java:322)
    at org.rascalmpl.semantics.dynamic.Expression$CallOrTree.interpret(Expression.java:531)
    at org.rascalmpl.semantics.dynamic.Declarator$Default.interpret(Declarator.java:53)
    at org.rascalmpl.semantics.dynamic.LocalVariableDeclaration$Default.interpret(LocalVariableDeclaration.java:36)
    at org.rascalmpl.semantics.dynamic.Statement$VariableDeclaration.interpret(Statement.java:1005)
    at org.rascalmpl.interpreter.result.RascalFunction.runBody(RascalFunction.java:383)
    at org.rascalmpl.interpreter.result.RascalFunction.call(RascalFunction.java:322)
    at org.rascalmpl.semantics.dynamic.Expression$CallOrTree.interpret(Expression.java:531)
    at org.rascalmpl.semantics.dynamic.Declarator$Default.interpret(Declarator.java:53)
    at org.rascalmpl.semantics.dynamic.LocalVariableDeclaration$Default.interpret(LocalVariableDeclaration.java:36)
    at org.rascalmpl.semantics.dynamic.Statement$VariableDeclaration.interpret(Statement.java:1005)
    at org.rascalmpl.interpreter.result.RascalFunction.runBody(RascalFunction.java:383)
    at org.rascalmpl.interpreter.result.RascalFunction.call(RascalFunction.java:292)
    at org.eclipse.scava.metricprovider.rascal.RascalMetricProvider.compute(RascalMetricProvider.java:311)
    - locked <64387459> (a java.lang.Class)
    at org.eclipse.scava.metricprovider.rascal.RascalMetricProvider.measure(RascalMetricProvider.java:213)
    at org.eclipse.scava.metricprovider.rascal.RascalMetricProvider.measure(RascalMetricProvider.java:1)
    at org.eclipse.scava.platform.osgi.analysis.MetricListExecutor.run(MetricListExecutor.java:103)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
   Locked ownable synchronizers:
    - locked <51e7c888> (a java.util.concurrent.ThreadPoolExecutor$Worker)

MarcioMateus commented 4 years ago

@blueoly, do you know how long it takes to get the out of memory errors? Does it happen after a few minutes, or after a few hours or days of execution?

If it happens after many hours/days, then I think that we should not increase the memory. We would just be delaying the problem.

One Docker feature that we may use is to set up a restart policy to automatically restart the oss-app container (and slaves) whenever it crashes or stops responding. I didn't define that option, so that we could better test and identify problems, but if you think it is needed we can implement it.
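
A minimal sketch of such a restart policy in docker-compose (the service name and value below are illustrative, not taken from the actual deployment files):

    oss-app:
      restart: on-failure   # restart the container automatically whenever it exits with a non-zero status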

However, if the platform stops whenever it starts executing the metrics, even on a fresh run, then we probably need to increase the memory to evaluate those projects.

creat89 commented 4 years ago

@MarcioMateus, in my experiments I ran out of memory (16g) in less than 2 hours. Moreover, in the last 30 minutes of analysis, the platform used 5g.

blueoly commented 4 years ago

Hi @MarcioMateus

We are talking about 2 hours of execution at most, and that is when I select only 2 or 3 metrics. If we combine more of the problematic metrics, we are talking about minutes.

MarcioMateus commented 4 years ago

That's very worrisome.

Right now we are not imposing any memory limit on the containers. So, as said by @mhow2, to increase the maximum memory of the oss-app we should edit the /ossmeter/eclipse.ini file.

We can probably add a variable to the docker-compose file to define the max memory for eclipse, and then edit the eclipse.ini file before launching the app.
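
A rough sketch of how that could look, assuming a hypothetical OSS_APP_MAX_HEAP variable and a small entrypoint step that rewrites eclipse.ini before starting eclipse (illustrative only; nothing below exists in the current deployment):

    # docker-compose.yml (excerpt)
    oss-app:
      environment:
        - OSS_APP_MAX_HEAP=16g

    # entrypoint step: rewrite the -Xmx line in eclipse.ini before launching eclipse
    sed -i "s/^-Xmx.*/-Xmx${OSS_APP_MAX_HEAP}/" /ossmeter/eclipse.ini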

blueoly commented 4 years ago

@MarcioMateus, do you know the appropriate variables/parameters we should add to docker-compose for eclipse? Will -vmargs -Xmx16G do the job?

tdegueul commented 4 years ago

This is very worrisome indeed, and I apologize for the inconvenience caused by our metrics.

@blueoly Could you give me the URL of the projects you are attempting to analyze? Thank you very much for your efforts.

blueoly commented 4 years ago

Hi @tdegueul, there is no need to apologize. We have all caused problems for the platform from time to time. And we are not sure whether there is a leak or whether your metrics really need more RAM.

The details of the project are the following:

http://www.eclipse.org/sirius

VCS Repositories:
https://github.com/eclipse-sirius/sirius-components
https://github.com/eclipse-sirius/sirius-specs
git://git.eclipse.org/gitroot/sirius/org.eclipse.sirius.git
git://git.eclipse.org/gitroot/sirius/org.eclipse.sirius.legacy.git

MarcioMateus commented 4 years ago

@blueoly

@MarcioMateus, do you know the appropriate variables/parameters we should add to docker-compose for eclipse? Will -vmargs -Xmx16G do the job?

Right now there is none. I would need to implement support for those variables first.

MarcioMateus commented 4 years ago

@blueoly, actually I may be unnecessarily complicating things.

I think that if you add ,"-vmargs","-Xmx16g" to the end of the entrypoint in the oss-app service, you will override the settings in the eclipse.ini file.

You should have something like the following command to run the platform with a max heap size of 16GB:

entrypoint: ["./wait-for-it.sh", "oss-db:27017", "-t", "0", "--", "./eclipse", "-master", "-apiServer", "-worker", "w1", "-config", "prop.properties","-vmargs","-Xmx16g"]

davidediruscio commented 4 years ago

Can we avoid analysing Sirius (at least in order to complete the Eclipse use case)? I remember we also had problems with it in the past.

blueoly commented 4 years ago

@davidediruscio I saw the same behaviour when I analysed XWIKI and Eclipse Sphinx, even after I increased the memory of the metric platform to 12 GB. I think the out of memory error occurred when the total memory of the system was used up and there was no more to allocate. But the platform crashed even for Sphinx, which is a relatively small project, and the memory used by the metric platform kept increasing continuously, which is not a good sign.

I used only 3 metrics, namely:

trans.rascal.api.numberOfChanges
trans.rascal.api.numberOfBreakingChanges
trans.rascal.api.numberOfBreakingChanges.historic

borisbaldassari commented 4 years ago

Just for the record, the projects analysed on the aueb instance (and failing) are iot.hono and technology.jgit. They are far smaller than sirius (60K and 223K SLOC respectively, vs. 773K SLOC for sirius).

borisbaldassari commented 4 years ago

Adding memory solved the issue - the analysis process is indeed huge.