MassBank / MassBank-web

The web server application and directly connected components for a MassBank web server

Validator / install.sh refresh: out of memory #298

Closed meowcat closed 3 years ago

meowcat commented 3 years ago

Hi,

when processing a large number (1.1e6) of records with the validator (ca. 0.2% invalid), it eventually crashes with

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

Now this is just a dev box with 16 GB RAM, not a production server, and I can get more RAM. But does that mean that all records are first composed in memory and then the entire block is written to the DB? Is there already a way for the user to split this up into chunks, and/or to refresh a DB without removing existing records (so that one could do blockwise addition)?

There is no specific place where the error happens (see the two examples below), so I guess it's just a general OOM.

Note: a test with 600k records, of which 100% were valid, worked fine. For the 1.1e6 batch, I had 6k invalid records in the first attempt. I removed all those files (I will fix them later) and had 1300 failures in the second attempt. The third attempt is still running, currently at 23 failures. But I guess the failures aren't a specific problem.

Exception in thread "main" java.lang.OutOfMemoryError
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
    at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:603)
    at java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:678)
    at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:737)
    at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:919)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
    at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
    at massbank.cli.RefreshDatabase.main(RefreshDatabase.java:91)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.petitparser.context.Context.failure(Context.java:78)
    at org.petitparser.context.Context.failure(Context.java:68)
    at org.petitparser.parser.primitive.StringParser.parseOn(StringParser.java:68)
    at org.petitparser.parser.combinators.ChoiceParser.parseOn(ChoiceParser.java:22)
    at org.petitparser.parser.combinators.SequenceParser.parseOn(SequenceParser.java:25)
    at org.petitparser.parser.actions.ActionParser.parseOn(ActionParser.java:37)
    at org.petitparser.parser.combinators.SequenceParser.parseOn(SequenceParser.java:25)
    at org.petitparser.parser.actions.ActionParser.parseOn(ActionParser.java:37)
    at org.petitparser.parser.combinators.DelegateParser.parseOn(DelegateParser.java:25)
    at org.petitparser.parser.actions.ContinuationParser.lambda$parseOn$0(ContinuationParser.java:31)
    at org.petitparser.parser.actions.ContinuationParser$$Lambda$234/0x000000084020c840.apply(Unknown Source)
    at massbank.RecordParserDefinition.lambda$new$24(RecordParserDefinition.java:887)
    at massbank.RecordParserDefinition$$Lambda$174/0x00000008401fd840.apply(Unknown Source)
    at org.petitparser.parser.actions.ContinuationParser.parseOn(ContinuationParser.java:31)
    at org.petitparser.parser.combinators.SequenceParser.parseOn(SequenceParser.java:25)
    at org.petitparser.parser.actions.ActionParser.parseOn(ActionParser.java:37)
    at org.petitparser.parser.repeating.PossessiveRepeatingParser.parseOn(PossessiveRepeatingParser.java:25)
    at org.petitparser.parser.actions.ActionParser.parseOn(ActionParser.java:37)
    at org.petitparser.parser.combinators.OptionalParser.parseOn(OptionalParser.java:23)
    at org.petitparser.parser.combinators.SequenceParser.parseOn(SequenceParser.java:25)
    at org.petitparser.parser.combinators.DelegateParser.parseOn(DelegateParser.java:25)
    at org.petitparser.parser.actions.ContinuationParser.lambda$parseOn$0(ContinuationParser.java:31)
    at org.petitparser.parser.actions.ContinuationParser$$Lambda$234/0x000000084020c840.apply(Unknown Source)
    at massbank.RecordParserDefinition.checkSemantic(RecordParserDefinition.java:1704)
    at massbank.RecordParserDefinition.lambda$new$0(RecordParserDefinition.java:100)
    at massbank.RecordParserDefinition$$Lambda$117/0x00000008401c7840.apply(Unknown Source)
    at org.petitparser.parser.actions.ContinuationParser.parseOn(ContinuationParser.java:31)
    at org.petitparser.parser.combinators.ChoiceParser.parseOn(ChoiceParser.java:22)
    at org.petitparser.parser.combinators.SequenceParser.parseOn(SequenceParser.java:25)
    at org.petitparser.parser.combinators.SequenceParser.parseOn(SequenceParser.java:25)
    at org.petitparser.parser.actions.ActionParser.parseOn(ActionParser.java:37)
    at org.petitparser.parser.combinators.DelegateParser.parseOn(DelegateParser.java:25)

or

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at net.sf.jniinchi.JniInchiWrapper.GetINCHI(Native Method)
    at net.sf.jniinchi.JniInchiWrapper.getInchi(JniInchiWrapper.java:241)
    at org.openscience.cdk.inchi.InChIGenerator.generateInchiFromCDKAtomContainer(InChIGenerator.java:483)
    at org.openscience.cdk.inchi.InChIGenerator.<init>(InChIGenerator.java:174)
    at org.openscience.cdk.inchi.InChIGenerator.<init>(InChIGenerator.java:132)
    at org.openscience.cdk.inchi.InChIGeneratorFactory.getInChIGenerator(InChIGeneratorFactory.java:147)
    at massbank.RecordParserDefinition.lambda$new$21(RecordParserDefinition.java:807)
    at massbank.RecordParserDefinition$$Lambda$177/0x00000008401fe040.apply(Unknown Source)
    at org.petitparser.parser.actions.ContinuationParser.parseOn(ContinuationParser.java:31)
    at org.petitparser.parser.actions.ActionParser.parseOn(ActionParser.java:37)
    at org.petitparser.parser.combinators.SequenceParser.parseOn(SequenceParser.java:25)
    at org.petitparser.parser.combinators.SequenceParser.parseOn(SequenceParser.java:25)
    at org.petitparser.parser.combinators.DelegateParser.parseOn(DelegateParser.java:25)
    at org.petitparser.parser.actions.ContinuationParser.lambda$parseOn$0(ContinuationParser.java:31)
    at org.petitparser.parser.actions.ContinuationParser$$Lambda$239/0x000000084020d840.apply(Unknown Source)
    at massbank.RecordParserDefinition.checkSemantic(RecordParserDefinition.java:1704)
    at massbank.RecordParserDefinition.lambda$new$0(RecordParserDefinition.java:100)
    at massbank.RecordParserDefinition$$Lambda$117/0x00000008401c7840.apply(Unknown Source)
    at org.petitparser.parser.actions.ContinuationParser.parseOn(ContinuationParser.java:31)
    at org.petitparser.parser.combinators.ChoiceParser.parseOn(ChoiceParser.java:22)
    at org.petitparser.parser.combinators.SequenceParser.parseOn(SequenceParser.java:25)
    at org.petitparser.parser.combinators.SequenceParser.parseOn(SequenceParser.java:25)
    at org.petitparser.parser.actions.ActionParser.parseOn(ActionParser.java:37)
    at org.petitparser.parser.combinators.DelegateParser.parseOn(DelegateParser.java:25)
    at org.petitparser.parser.Parser.parse(Parser.java:75)
    at massbank.cli.Validator.validate(Validator.java:89)
    at massbank.cli.Validator.validate(Validator.java:78)
    at massbank.cli.RefreshDatabase.lambda$main$0(RefreshDatabase.java:80)
    at massbank.cli.RefreshDatabase$$Lambda$115/0x00000008401c2840.apply(Unknown Source)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
meier-rene commented 3 years ago

Hi, I would like to track down this issue. 16 GB is decent; we don't want our program to fail on such a machine. I only test with the official MassBank data, which has fewer than 100k record files, so 1100k is untested. I can imagine that there is a limit at the moment, because we parse all content into memory (just because it was easier and worked for me). Do you think it's possible to give me your dataset? Otherwise I would have to construct an artificial dataset of that size.

meowcat commented 3 years ago

Hi, that specific dataset I can't give you right now, but I could try to finish the second part of the LipidBlast dataset, and the two LipidBlast sets together should be >1e6 records. (But the LipidBlast records have fewer peaks per spectrum, so I'm not sure this would trigger the problem.)

The third attempt has now also failed; however, I quickly checked top and saw that Java was only using 4 GB of RAM, so maybe this can be solved simply by passing -Xmx8092m or similar?
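A quick way to verify whether a larger -Xmx actually takes effect is to ask the running JVM for its effective heap limit. A minimal sketch (the class name HeapCheck is just for illustration):

```java
// Prints the JVM's effective maximum heap size; run e.g. as
//   java -Xmx8g HeapCheck
// to confirm that the flag is being picked up. Note that maxMemory()
// may report slightly less than the -Xmx value.
public class HeapCheck {
    public static void main(String[] args) {
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.printf("Max heap: %.1f GiB%n",
                maxBytes / (1024.0 * 1024.0 * 1024.0));
    }
}
```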

schymane commented 3 years ago

I should let @meier-rene comment but that trick has worked for MetFrag in the past ... ;-)

tsufz commented 3 years ago

Yes, increasing the heap size with the -Xmx switch is the way to go. There is not much one can change in Java itself, but it is important to limit the RAM consumption, because Java takes all it can get. A colleague of mine once grabbed 756 GB of RAM with MZmine and got a call from the server administrator...

meowcat commented 3 years ago

Setting the environment variable JAVA_OPTS=-Xmx8g and adjusting MassBank-Project/MassBank-lib/target/MassBank-lib/MassBank-lib/bin/RefreshDatabase to pass $JAVA_OPTS to java seems to work; it is still running, but java is now using >6 GB of RAM. The limiting factor was the JVM's default maximum heap of 25% of physical RAM, i.e. 4 GB on a 16 GB machine. Note that there is also the option -XX:MaxRAMPercentage, which changes that percentage directly instead of setting a fixed value.

https://stackoverflow.com/a/57895677/1259675
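The arithmetic behind the 4 GB cap can be checked directly; a small sketch assuming a 16 GiB machine and the JVM default of -XX:MaxRAMPercentage=25:

```java
// Why the process capped at ~4 GB: by default the JVM limits the heap
// to 25% of physical RAM (-XX:MaxRAMPercentage=25).
public class DefaultHeap {
    // default max heap = physical RAM * MaxRAMPercentage / 100
    static double defaultHeapGiB(double ramGiB, double maxRamPercentage) {
        return ramGiB * maxRamPercentage / 100.0;
    }

    public static void main(String[] args) {
        // 16 GiB box with the 25% default -> 4 GiB max heap
        System.out.println(defaultHeapGiB(16.0, 25.0)); // prints 4.0
    }
}
```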

meowcat commented 3 years ago

It works with 8 GB for 1.1e6 records. Should I make a PR? Do you want to follow up further?

meier-rene commented 3 years ago

No need for a PR. I'm testing a changed algorithm, which works with 256 MB even for a large number of records. I will check this code in as soon as I have finished testing.
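For reference, a constant-memory refresh generally means validating and storing each record as it is encountered instead of collecting all parsed records first. A rough sketch of that idea, where validateAndStore() is a hypothetical placeholder and not the actual MassBank code or the committed fix:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Stream;

// Sketch of a streaming refresh: each record file is handled and then
// released, so peak memory is bounded by one record, not the whole batch.
public class StreamingRefresh {
    // Placeholder: parse, validate, and insert a single record file,
    // returning false on validation failure.
    static boolean validateAndStore(Path recordFile) {
        return true;
    }

    public static void main(String[] args) throws IOException {
        AtomicLong failures = new AtomicLong();
        try (Stream<Path> records = Files.walk(Path.of(args[0]))) {
            records.filter(Files::isRegularFile)
                   .forEach(p -> {
                       if (!validateAndStore(p)) failures.incrementAndGet();
                   });
        }
        System.out.println("failures: " + failures.get());
    }
}
```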

meier-rene commented 3 years ago

Fixed with d3beaef. Thank you for reporting.