petermr / tigr2ess

Materials for TIGR2ESS workshop in Delhi Feb 2019 - joint UK(Cambridge) - India project on Food Security.

"Error due to lack of JAVA heap memory while running ocimum2000." #83

Open ambarishK opened 5 years ago

ambarishK commented 5 years ago
Run command -
ami-search-new -p /media/ambarish123/AMBARISH/Ocimumproject18feb --dictionary /media/ambarish123/AMBARISH/dictionary/monoterpenes.xml /media/ambarish123/AMBARISH/dictionary/diterpene.xml /media/ambarish123/AMBARISH/dictionary/triterpene.xml /media/ambarish123/AMBARISH/dictionary/insecticides.xml /media/ambarish123/AMBARISH/dictionary/invasivespecies.xml /media/ambarish123/AMBARISH/dictionary/Ocimumspecies.xml /media/ambarish123/AMBARISH/dictionary/phytochemicals.xml country gene drugs plantparts
Error snippet

specific and generic values.

Generic values (AMISearchTool)
================================
basename            null
cproject            /media/ambarish123/AMBARISH/Ocimumproject18feb
ctree               
cTreeList           2117 trees [/media/ambarish123/AMBARISH/Ocimumproject18feb/30
dryrun              false
excludeBase         null
excludeTrees        null
file types          []
forceMake           false
includeBase         null
includeTrees        null
log4j               
logfile             null
verbose             0

Specific values (AMISearchTool)
================================
dictionaryList       [/media/ambarish123/AMBARISH/dictionary/monoterpenes.xml, /media/ambarish123/AMBARISH/dictionary/diterpene.xml, /media/ambarish123/AMBARISH/dictionary/triterpene.xml, /media/ambarish123/AMBARISH/dictionary/insecticides.xml, /media/ambarish123/AMBARISH/dictionary/invasivespecies.xml, /media/ambarish123/AMBARISH/dictionary/Ocimumspecies.xml, /media/ambarish123/AMBARISH/dictionary/phytochemicals.xml, country, gene, drugs, plantparts]
dictionaryTop        null
dictionarySuffix     [xml]
ignorePlugins        []

Runtime error related to Java heap memory:

................................................................................Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at sun.nio.cs.UTF_8.newEncoder(UTF_8.java:72)
    at java.lang.StringCoding$StringEncoder.<init>(StringCoding.java:282)
    at java.lang.StringCoding$StringEncoder.<init>(StringCoding.java:273)
    at java.lang.StringCoding.encode(StringCoding.java:338)
    at java.lang.String.getBytes(String.java:918)
    at nu.xom.Text.build(Unknown Source)
    at nu.xom.NonVerifyingHandler.flushText(Unknown Source)
    at nu.xom.NonVerifyingHandler.startElement(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source)
    at org.apache.xerces.impl.XMLNamespaceBinder.handleStartElement(Unknown Source)
    at org.apache.xerces.impl.XMLNamespaceBinder.startElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
    at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at nu.xom.Builder.build(Unknown Source)
    at nu.xom.Builder.build(Unknown Source)
    at nu.xom.Builder.build(Unknown Source)
    at org.contentmine.eucl.xml.XMLUtil.parseQuietlyToDocument(XMLUtil.java:1197)
    at org.contentmine.cproject.args.DefaultArgProcessor.getScholarlyHtmlElement(DefaultArgProcessor.java:1381)
    at org.contentmine.cproject.files.CTree.ensureScholarlyHtmlElement(CTree.java:1239)
    at org.contentmine.cproject.args.DefaultArgProcessor.extractPSectionElements(DefaultArgProcessor.java:1365)
    at org.contentmine.ami.plugins.AMIArgProcessor.ensureSectionElements(AMIArgProcessor.java:257)
    at org.contentmine.ami.plugins.AMIArgProcessor.runRunMethodsOnChosenArgOptions(AMIArgProcessor.java:228)
    at org.contentmine.cproject.args.DefaultArgProcessor.runAndOutput(DefaultArgProcessor.java:1296)
    at org.contentmine.ami.plugins.word.WordPluginOption.run(WordPluginOption.java:36)
    at org.contentmine.ami.plugins.CommandProcessor.runLegacyPluginOptions(CommandProcessor.java:301)
    at org.contentmine.ami.tools.AMISearchTool.runLegacyCommandProcessor(AMISearchTool.java:128)
    at org.contentmine.ami.tools.AMISearchTool.runSearch(AMISearchTool.java:112)
Issue:

I ran the project from the hard drive, so there should not have been a lack of memory space, but it still raises this error. Please resolve.

petermr commented 5 years ago

It fails at:

    /** convenience method to extract list of HtmlP in element
     *
     * @param htmlElement
     * @return
     */
    public static List<HtmlP> extractSelfAndDescendantIs(HtmlElement htmlElement) {
        return HtmlP.extractPs(HtmlUtil.getQueryHtmlElements(htmlElement, ALL_P_XPATH));
    }

called from DefaultArgProcessor:

    public List<? extends Element> extractPSectionElements(CTree cTree) {
        List<? extends Element> elements = null;
        if (cTree != null) {
            cTree.ensureScholarlyHtmlElement();
            elements = HtmlP.extractSelfAndDescendantIs(cTree.getHtmlElement());
        }
        return elements;
    }

My guess is it is a huge document and we need to extract the count and then limit the number of extracted elements. Will look in the morning.
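
As a rough illustration of the "count first, then limit" idea, here is a minimal sketch that counts `p` elements by streaming through the document with the JDK's StAX reader, so no full tree has to be held in memory just to find out how large the document is. This is not the AMI code: the class name `ParagraphCounter`, the method `countParagraphs` and the `MAX_PARAGRAPHS` value are hypothetical and chosen only for illustration, and it assumes the input is well-formed XHTML.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class ParagraphCounter {
    /** Hypothetical cut-off; not a value taken from the AMI code base. */
    public static final int MAX_PARAGRAPHS = 10000;

    /**
     * Counts <p> start tags by streaming through the XML.
     * Stops early once the count exceeds the limit, so the result
     * is at most limit + 1.
     */
    public static int countParagraphs(Reader xml, int limit) {
        int count = 0;
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(xml);
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "p".equals(reader.getLocalName())) {
                        count++;
                        if (count > limit) {
                            break; // document is already too large; stop reading
                        }
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Cannot read XML", e);
        }
        return count;
    }

    public static void main(String[] args) {
        String html = "<html><body><p>a</p><div><p>b</p></div><p>c</p></body></html>";
        int n = countParagraphs(new StringReader(html), MAX_PARAGRAPHS);
        System.out.println(n > MAX_PARAGRAPHS ? "too large, skip or truncate" : "paragraphs: " + n);
    }
}
```

A caller could use the count to skip or truncate an oversized CTree before the full parse is attempted. Whether that fits the existing search pipeline is an open question.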

petermr commented 5 years ago

I can reproduce the problem:

java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.Arrays.copyOf(Arrays.java:3236)
    at java.lang.StringCoding.safeTrim(StringCoding.java:79)
    at java.lang.StringCoding.access$300(StringCoding.java:50)
    at java.lang.StringCoding$StringEncoder.encode(StringCoding.java:305)
    at java.lang.StringCoding.encode(StringCoding.java:344)
    at java.lang.String.getBytes(String.java:918)
    at nu.xom.Text.build(Unknown Source)
    at nu.xom.NonVerifyingHandler.flushText(Unknown Source)
    at nu.xom.NonVerifyingHandler.startElement(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source)
    at org.apache.xerces.impl.XMLNamespaceBinder.handleStartElement(Unknown Source)
    at org.apache.xerces.impl.XMLNamespaceBinder.startElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanStartElement(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
    at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at nu.xom.Builder.build(Unknown Source)
    at nu.xom.Builder.build(Unknown Source)
    at org.contentmine.eucl.xml.XMLUtil.parseXML(XMLUtil.java:392)
    at org.contentmine.graphics.html.HtmlFactory.parseToXHTML(HtmlFactory.java:702)
    at org.contentmine.graphics.html.HtmlFactory.parse(HtmlFactory.java:647)
    at org.contentmine.graphics.html.HtmlFactory.parse(HtmlFactory.java:622)
    at org.contentmine.cproject.args.DefaultArgProcessor.getScholarlyHtmlElement(DefaultArgProcessor.java:1384)
    at org.contentmine.cproject.files.CTree.ensureScholarlyHtmlElement(CTree.java:1239)
    at org.contentmine.cproject.args.DefaultArgProcessor.extractPSectionElements(DefaultArgProcessor.java:1367)
    at org.contentmine.ami.plugins.AMIArgProcessor.ensureSectionElements(AMIArgProcessor.java:257)
    at org.contentmine.ami.plugins.AMIArgProcessor.runRunMethodsOnChosenArgOptions(AMIArgProcessor.java:228)
    at org.contentmine.cproject.args.DefaultArgProcessor.runAndOutput(DefaultArgProcessor.java:1298)
    at org.contentmine.ami.plugins.word.WordPluginOption.run(WordPluginOption.java:39)

This needs a rewrite of the search options.