angelozerr / eclipse-wtp-xml-search

Eclipse WTP/XML Search

xml searcher can use too much memory when indexing files to search #38

Closed gamerson closed 9 years ago

gamerson commented 9 years ago

See the related Liferay IDE issue: https://issues.liferay.com/browse/IDE-1817

The problem appears when someone has the Liferay portal source open: the source tree contains several extremely large XML files, notably javadocs-all.xml, which is 23 MB.

While indexing XML files, XML Search tries to determine whether each file should be included in the search. On files this large, that check runs the VM out of memory and causes a GC panic.

We profiled it and found the method that exhausts the memory. Here is the trace from the YourKit profiler:


Worker-4 <--- Frozen for at least 36 sec
 BasicStructuredDocumentRegion.java:50002
org.eclipse.wst.sse.core.internal.text.BasicStructuredDocumentRegion.getStart() BasicStructuredDocumentRegion.java:65532
org.eclipse.wst.xml.core.internal.text.XMLStructuredDocumentRegion.getStart() XMLStructuredDocumentRegion.java:65531
org.eclipse.wst.xml.core.internal.parser.XMLSourceParser.parseNodes() XMLSourceParser.java:394
org.eclipse.wst.xml.core.internal.parser.XMLSourceParser.getDocumentRegions() XMLSourceParser.java:173
org.eclipse.wst.sse.core.internal.text.StructuredDocumentReParser._core_reparse_text(int, int) StructuredDocumentReParser.java:355
org.eclipse.wst.sse.core.internal.text.StructuredDocumentReParser.core_reparse(int, int, CoreNodeList, boolean) StructuredDocumentReParser.java:752
org.eclipse.wst.xml.core.internal.parser.XMLStructuredDocumentReParser.core_reparse(int, int, CoreNodeList, boolean) XMLStructuredDocumentReParser.java:65531
org.eclipse.wst.sse.core.internal.text.StructuredDocumentReParser.reparse(IStructuredDocumentRegion, IStructuredDocumentRegion) StructuredDocumentReParser.java:1401
org.eclipse.wst.xml.core.internal.parser.XMLStructuredDocumentReParser.reparse(IStructuredDocumentRegion, IStructuredDocumentRegion) XMLStructuredDocumentReParser.java:65531
org.eclipse.wst.sse.core.internal.text.StructuredDocumentReParser.reparse() StructuredDocumentReParser.java:1333
org.eclipse.wst.xml.core.internal.parser.XMLStructuredDocumentReParser.reparse() XMLStructuredDocumentReParser.java:65531
org.eclipse.wst.sse.core.internal.text.BasicStructuredDocument.updateModel(Object, int, int, String) BasicStructuredDocument.java:2713
org.eclipse.wst.sse.core.internal.text.BasicStructuredDocument.internalReplaceText(Object, int, int, String, long, boolean) BasicStructuredDocument.java:1923
org.eclipse.wst.sse.core.internal.text.BasicStructuredDocument.replaceText(Object, int, int, String, long, boolean) BasicStructuredDocument.java:2423
org.eclipse.wst.sse.core.internal.text.BasicStructuredDocument.set(String, long) BasicStructuredDocument.java:2935
org.eclipse.wst.sse.core.internal.text.JobSafeStructuredDocument.set(String, long) JobSafeStructuredDocument.java:65531
org.eclipse.core.internal.filebuffers.ResourceTextFileBuffer.setDocumentContent(IDocument, IFile, String) ResourceTextFileBuffer.java:580
org.eclipse.core.internal.filebuffers.ResourceTextFileBuffer.initializeFileBufferContent(IProgressMonitor) ResourceTextFileBuffer.java:288
org.eclipse.core.internal.filebuffers.ResourceFileBuffer.create(IPath, IProgressMonitor) ResourceFileBuffer.java:247
org.eclipse.core.internal.filebuffers.TextFileBufferManager.connect(IPath, LocationKind, IProgressMonitor) TextFileBufferManager.java:112
org.eclipse.wst.sse.core.internal.FileBufferModelManager.getModel(IFile) FileBufferModelManager.java:786
org.eclipse.wst.sse.core.internal.model.ModelManagerImpl._doCommonGetModel(IFile, String, ModelManagerImpl$SharedObject, ModelManagerImpl$ReadEditType) ModelManagerImpl.java:545
org.eclipse.wst.sse.core.internal.model.ModelManagerImpl._commonGetModel(IFile, String, ModelManagerImpl$ReadEditType, String, String) ModelManagerImpl.java:509
org.eclipse.wst.sse.core.internal.model.ModelManagerImpl._commonGetModel(IFile, ModelManagerImpl$ReadEditType, String, String) ModelManagerImpl.java:482
org.eclipse.wst.sse.core.internal.model.ModelManagerImpl.getModelForRead(IFile) ModelManagerImpl.java:1428
org.eclipse.wst.xml.search.core.util.DOMUtils.getStructuredModelContentTypeId(IFile) DOMUtils.java:493
org.eclipse.wst.xml.search.editor.internal.indexing.XMLReferencesFileVisitor.isXMLReferenceResource(IResource) XMLReferencesFileVisitor.java:94
org.eclipse.wst.xml.search.editor.internal.indexing.XMLReferencesFileVisitor.visit(IResourceProxy) XMLReferencesFileVisitor.java:56
org.eclipse.core.internal.resources.Resource$1.visitElement(ElementTree, IPathRequestor, Object) Resource.java:85
org.eclipse.core.internal.watson.ElementTreeIterator.doIteration(DataTreeNode, IElementContentVisitor) ElementTreeIterator.java:86 <5 recursive calls>
org.eclipse.core.internal.watson.ElementTreeIterator.iterate(IElementContentVisitor) ElementTreeIterator.java:127
org.eclipse.core.internal.resources.Resource.accept(IResourceProxyVisitor, int, int) Resource.java:95
org.eclipse.core.internal.resources.Resource.accept(IResourceProxyVisitor, int) Resource.java:52
org.eclipse.wst.xml.search.editor.internal.indexing.XMLReferencesIndexManager.run(IProject, Map, IProgressMonitor) XMLReferencesIndexManager.java:118
org.eclipse.wst.xml.search.editor.internal.indexing.XMLReferencesIndexManager.indexFilesIfNeeded(IProject, IProgressMonitor) XMLReferencesIndexManager.java:69
org.eclipse.wst.xml.search.editor.internal.indexing.XMLReferencesIndexManager.getIndexedFiles(IProject, String, IProgressMonitor) XMLReferencesIndexManager.java:53
org.eclipse.wst.xml.search.editor.internal.jdt.search.XMLReferenceJavaSearchParticipant.search(ISearchRequestor, String, IProgressMonitor, IProject, IXMLSearchDOMNodeCollector, String, IXMLReference) XMLReferenceJavaSearchParticipant.java:249
org.eclipse.wst.xml.search.editor.internal.jdt.search.XMLReferenceJavaSearchParticipant.search(ISearchRequestor, String, String, IProject, Collection, IXMLReferenceTo$ToType, IProgressMonitor) XMLReferenceJavaSearchParticipant.java:210
org.eclipse.wst.xml.search.editor.internal.jdt.search.XMLReferenceJavaSearchParticipant.searchXMLReferences(IJavaSearchScope, ISearchRequestor, String, String, IJavaProject, Collection, IXMLReferenceTo$ToType, SubProgressMonitor) XMLReferenceJavaSearchParticipant.java:185
org.eclipse.wst.xml.search.editor.internal.jdt.search.XMLReferenceJavaSearchParticipant.search(ISearchRequestor, QuerySpecification, IProgressMonitor) XMLReferenceJavaSearchParticipant.java:172
org.eclipse.jdt.internal.ui.search.JavaSearchQuery$2.run() JavaSearchQuery.java:164
org.eclipse.core.runtime.SafeRunner.run(ISafeRunnable) SafeRunner.java:42
org.eclipse.jdt.internal.ui.search.JavaSearchQuery.run(IProgressMonitor) JavaSearchQuery.java:170
org.eclipse.search2.internal.ui.InternalSearchUI$InternalSearchJob.run(IProgressMonitor) InternalSearchUI.java:91
org.eclipse.core.internal.jobs.Worker.run() Worker.java:54

gamerson commented 9 years ago

It seems that DOMUtils.getStructuredModelContentTypeId(IFile) does far too much work just to determine the content type: as the trace shows, it calls ModelManagerImpl.getModelForRead(IFile), which loads and parses the entire document. A more lightweight API should be able to return the same information without that cost. Will send a pull request soon.
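
To illustrate the idea of a lightweight check, here is a minimal sketch in plain Java. It is not the actual fix and not the WTP API. It decides whether a file looks like XML by reading only a bounded prefix, so the cost no longer depends on file size. The class name XmlSniffer, the method looksLikeXml and the 4096-byte cap are all hypothetical, chosen for this example. Inside Eclipse, one existing candidate for this job is IFile.getContentDescription(), which reports a content type without building a structured model. Whether the pull request uses it is an assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: classifies a file from a bounded prefix instead of
// loading and parsing the whole document.
final class XmlSniffer {

    // Assumed cap for this sketch: never read more than this many bytes.
    static final int MAX_PREFIX_BYTES = 4096;

    // Reads at most MAX_PREFIX_BYTES from the file, then inspects them.
    static boolean looksLikeXml(Path file) throws IOException {
        byte[] buf = new byte[MAX_PREFIX_BYTES];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while (total < buf.length
                    && (n = in.read(buf, total, buf.length - total)) > 0) {
                total += n;
            }
        }
        return looksLikeXml(buf, total);
    }

    // Heuristic only: assumes a UTF-8 compatible encoding, so a UTF-16
    // document would not be recognised by this sketch.
    static boolean looksLikeXml(byte[] prefix, int length) {
        String head = new String(prefix, 0, length, StandardCharsets.UTF_8);
        if (head.startsWith("\uFEFF")) {
            head = head.substring(1); // drop a byte order mark if present
        }
        head = head.trim();
        if (head.startsWith("<?xml")) {
            return true; // XML declaration
        }
        if (head.length() < 2 || head.charAt(0) != '<') {
            return false;
        }
        char c = head.charAt(1);
        // Start of an element name, or a comment / DOCTYPE.
        return Character.isLetter(c) || c == '_' || c == '!';
    }
}
```

With a check like this, a 23 MB file such as javadocs-all.xml costs the same 4 KB read as a tiny one. The trade-off is precision: a prefix heuristic can misclassify unusual files, which is why a platform content-type API is probably the better choice in the real plugin.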