kermitt2 / grobid

A machine learning software for extracting information from scholarly documents
https://grobid.readthedocs.io
Apache License 2.0

Batch processing doesn't work #590

Closed soostmeijer closed 4 years ago

soostmeijer commented 4 years ago

Hi,

When I try to batch process my PDF files using the Python client, only one PDF actually gets converted, and then I get the error message:

ERROR [2020-06-10 15:54:52,514] org.grobid.service.process.GrobidRestProcessFiles: An unexpected exception occurs. 
! org.grobid.core.exceptions.GrobidException: [BAD_INPUT_DATA] PDF to XML conversion failed with error code: 1

and

ERROR [2020-06-10 15:55:39,885] org.grobid.service.process.GrobidRestProcessFiles: An unexpected exception occurs. 
! org.grobid.core.exceptions.GrobidException: [TIMEOUT] PDF to XML conversion timed out

After receiving these errors the server gets stuck on

0:0:0:0:0:0:0:1 - - [10/Jun/2020:16:17:18 +0000] "POST /api/processFulltextDocument HTTP/1.1" 500 41 "-" "python-requests/2.23.0" 230310
INFO  [2020-06-10 16:17:37,765] org.grobid.core.factory.GrobidPoolingFactory: Number of Engines in pool active/max: 10/10

And nothing happens...

Any idea what is going wrong?

kermitt2 commented 4 years ago

Hi @soostmeijer, the first two ERROR messages relate to 2 PDF input files that cannot be processed. This could be due to a corrupted PDF or various PDF weirdness. A timeout is often because the PDF is too large (but not always).
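As a rough illustration of screening out such inputs before sending a batch, here is a minimal sketch. It is not part of GROBID or its Python client: the function names, the `%PDF-`/`%%EOF` heuristic and the 50 MB size threshold are all assumptions chosen for the example, and a file that passes the check can still fail or time out during conversion.

```python
from pathlib import Path


def looks_like_pdf(data: bytes) -> bool:
    """Cheap heuristic, not a real validation: a PDF header near the
    start of the file and an end-of-file marker near the end."""
    has_header = b"%PDF-" in data[:1024]
    has_eof = b"%%EOF" in data[-2048:]
    return has_header and has_eof


def split_batch(folder: str, max_bytes: int = 50 * 1024 * 1024):
    """Split the *.pdf files in `folder` into (ok, suspicious).

    `max_bytes` is an arbitrary illustrative threshold for "too large";
    it is not a GROBID setting.
    """
    ok, suspicious = [], []
    for path in sorted(Path(folder).glob("*.pdf")):
        data = path.read_bytes()
        if looks_like_pdf(data) and len(data) <= max_bytes:
            ok.append(path)
        else:
            suspicious.append(path)
    return ok, suspicious
```

Files that end up in `suspicious` can be set aside and inspected by hand, so that only the remaining ones are submitted for processing.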

Having the service get stuck, however, is very unusual; I've never seen this so far. Could you have a look at the service logs (grobid/logs/grobid-service.log) to see if there is some error trace? More details about your environment (OS, JDK version) and the nature of the PDFs could help.
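Since the service log can be long, a small script can help spot the relevant error traces. The following is only a sketch: it assumes the log line layout seen in this thread (`LEVEL [timestamp] logger: message`, with stack-trace lines prefixed by `!`), and the function name and regular expressions are made up for the example rather than taken from GROBID.

```python
import re
from collections import Counter

# Assumed layout: "ERROR [2020-06-10 15:54:52,514] some.logger.Name: message"
LINE_RE = re.compile(r"^(ERROR|WARN)\s+\[[^\]]+\]\s+(\S+?):")
# Assumed layout of exception tags: "GrobidException: [TIMEOUT] ..."
TAG_RE = re.compile(r"GrobidException: \[([A-Z_]+)\]")


def summarize(lines):
    """Count ERROR/WARN entries per logger, GrobidException tags,
    and top-level OutOfMemoryError traces."""
    counts = Counter()
    for line in lines:
        entry = LINE_RE.match(line)
        if entry:
            counts[f"{entry.group(1)} {entry.group(2)}"] += 1
        tag = TAG_RE.search(line)
        if tag:
            counts[f"tag {tag.group(1)}"] += 1
        # Only count the root trace line, not the "Causing:" wrappers.
        if line.startswith("! java.lang.OutOfMemoryError"):
            counts["OutOfMemoryError"] += 1
    return counts


if __name__ == "__main__":
    # Path as given in the comment above, relative to the GROBID install.
    with open("grobid/logs/grobid-service.log", encoding="utf-8", errors="replace") as fh:
        for key, n in summarize(fh).most_common():
            print(f"{n:6d}  {key}")
```

The output is a frequency table, which makes it easier to see whether a run is dominated by bad inputs, timeouts, or memory problems.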

soostmeijer commented 4 years ago

Hi @kermitt2,

Thank you for your response.

I work on Linux and use Java 1.8.0. The PDFs I am trying to process are research articles.

This is my log message.

INFO  [2020-06-10 15:50:35,568] org.eclipse.jetty.server.handler.ContextHandler: Started i.d.j.MutableServletContextHandler@682618e5{/,null,AVAILABLE}
INFO  [2020-06-10 15:50:35,579] org.eclipse.jetty.server.AbstractConnector: Started application@68b734a8{HTTP/1.1,[http/1.1]}{0.0.0.0:8070}
INFO  [2020-06-10 15:50:35,581] org.eclipse.jetty.server.AbstractConnector: Started admin@1a464fa3{HTTP/1.1,[http/1.1]}{0.0.0.0:8071}
INFO  [2020-06-10 15:50:35,581] org.eclipse.jetty.server.Server: Started @92712ms
INFO  [2020-06-10 15:53:43,604] org.grobid.core.jni.WapitiModel: Loading model: /home/sander/grobid-0.6.0/grobid-home/models/affiliation-address/model.wapiti (size: 2699936)
INFO  [2020-06-10 15:53:48,195] org.grobid.core.jni.WapitiModel: Loading model: /home/sander/grobid-0.6.0/grobid-home/models/name/header/model.wapiti (size: 2225578)
INFO  [2020-06-10 15:53:49,648] org.grobid.core.jni.WapitiModel: Loading model: /home/sander/grobid-0.6.0/grobid-home/models/name/citation/model.wapiti (size: 440148)
INFO  [2020-06-10 15:53:50,118] org.grobid.core.jni.WapitiModel: Loading model: /home/sander/grobid-0.6.0/grobid-home/models/header/model.wapiti (size: 36094028)
INFO  [2020-06-10 15:54:34,405] org.grobid.core.jni.WapitiModel: Loading model: /home/sander/grobid-0.6.0/grobid-home/models/date/model.wapiti (size: 102435)
INFO  [2020-06-10 15:54:34,489] org.grobid.core.jni.WapitiModel: Loading model: /home/sander/grobid-0.6.0/grobid-home/models/citation/model.wapiti (size: 16412787)
INFO  [2020-06-10 15:54:44,965] org.grobid.core.jni.WapitiModel: Loading model: /home/sander/grobid-0.6.0/grobid-home/models/fulltext/model.wapiti (size: 22836546)
INFO  [2020-06-10 15:54:48,189] org.grobid.core.jni.WapitiModel: Loading model: /home/sander/grobid-0.6.0/grobid-home/models/segmentation/model.wapiti (size: 17807323)
INFO  [2020-06-10 15:54:50,784] org.grobid.core.jni.WapitiModel: Loading model: /home/sander/grobid-0.6.0/grobid-home/models/reference-segmenter/model.wapiti (size: 4921245)
INFO  [2020-06-10 15:54:51,465] org.grobid.core.jni.WapitiModel: Loading model: /home/sander/grobid-0.6.0/grobid-home/models/figure/model.wapiti (size: 422671)
INFO  [2020-06-10 15:54:51,597] org.grobid.core.jni.WapitiModel: Loading model: /home/sander/grobid-0.6.0/grobid-home/models/table/model.wapiti (size: 1202011)
INFO  [2020-06-10 15:54:51,764] org.grobid.core.factory.GrobidPoolingFactory: Number of Engines in pool active/max: 1/10
INFO  [2020-06-10 15:54:51,765] org.grobid.core.factory.GrobidPoolingFactory: Number of Engines in pool active/max: 2/10
INFO  [2020-06-10 15:54:51,765] org.grobid.core.factory.GrobidPoolingFactory: Number of Engines in pool active/max: 3/10
INFO  [2020-06-10 15:54:51,765] org.grobid.core.factory.GrobidPoolingFactory: Number of Engines in pool active/max: 4/10
INFO  [2020-06-10 15:54:51,765] org.grobid.core.factory.GrobidPoolingFactory: Number of Engines in pool active/max: 5/10
INFO  [2020-06-10 15:54:51,765] org.grobid.core.factory.GrobidPoolingFactory: Number of Engines in pool active/max: 6/10
INFO  [2020-06-10 15:54:51,765] org.grobid.core.factory.GrobidPoolingFactory: Number of Engines in pool active/max: 7/10
INFO  [2020-06-10 15:54:51,766] org.grobid.core.factory.GrobidPoolingFactory: Number of Engines in pool active/max: 8/10
INFO  [2020-06-10 15:54:51,766] org.grobid.core.factory.GrobidPoolingFactory: Number of Engines in pool active/max: 9/10
INFO  [2020-06-10 15:54:51,766] org.grobid.core.factory.GrobidPoolingFactory: Number of Engines in pool active/max: 10/10
ERROR [2020-06-10 15:54:52,510] org.grobid.core.process.ProcessPdfToXml: pdftoxml process finished with error code: 1. [/home/sander/grobid-0.6.0/grobid-home/pdf2xml/lin-64/pdfalto_server, -blocks, -noImageInline, -fullFontName, -noImage, -annotation, -filesLimit, 2000, /home/sander/grobid-0.6.0/grobid-home/tmp/origin3357730575515763116.pdf, /home/sander/grobid-0.6.0/grobid-home/tmp/PkVu8WZpRJ.lxml]
ERROR [2020-06-10 15:54:52,510] org.grobid.core.process.ProcessPdfToXml: pdftoxml return message: 
Syntax Warning: May not be a PDF file (continuing anyway)
Syntax Error: Couldn't read xref table
Syntax Warning: PDF file is damaged - attempting to reconstruct xref table...
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't read xref table

ERROR [2020-06-10 15:54:52,514] org.grobid.service.process.GrobidRestProcessFiles: An unexpected exception occurs. 
! org.grobid.core.exceptions.GrobidException: [BAD_INPUT_DATA] PDF to XML conversion failed with error code: 1
! at org.grobid.core.document.DocumentSource.processPdfToXmlServerMode(DocumentSource.java:243)
! at org.grobid.core.document.DocumentSource.pdf2xml(DocumentSource.java:146)
! at org.grobid.core.document.DocumentSource.fromPdf(DocumentSource.java:63)
! at org.grobid.core.engines.FullTextParser.processing(FullTextParser.java:111)
! at org.grobid.core.engines.Engine.fullTextToTEIDoc(Engine.java:489)
! at org.grobid.core.engines.Engine.fullTextToTEI(Engine.java:480)
! at org.grobid.service.process.GrobidRestProcessFiles.processFulltextDocument(GrobidRestProcessFiles.java:179)
! at org.grobid.service.GrobidRestService.processFulltext(GrobidRestService.java:234)
! at org.grobid.service.GrobidRestService.processFulltextDocument_post(GrobidRestService.java:189)
! at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
! at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
! at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
! at java.lang.reflect.Method.invoke(Method.java:498)
! at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144)
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161)
! at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:160)
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99)
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389)
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347)
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102)
! at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326)
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
! at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
! at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
! at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
! at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
! at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
! at io.dropwizard.jetty.NonblockingServletHolder.handle(NonblockingServletHolder.java:49)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1623)
! at io.dropwizard.servlets.ThreadNameFilter.doFilter(ThreadNameFilter.java:35)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at io.dropwizard.jersey.filter.AllowedMethodsFilter.handle(AllowedMethodsFilter.java:45)
! at io.dropwizard.jersey.filter.AllowedMethodsFilter.doFilter(AllowedMethodsFilter.java:39)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at org.eclipse.jetty.servlets.CrossOriginFilter.handle(CrossOriginFilter.java:311)
! at org.eclipse.jetty.servlets.CrossOriginFilter.doFilter(CrossOriginFilter.java:265)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:89)
! at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120)
! at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:135)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
! at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
! at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
! at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
! at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
! at com.codahale.metrics.jetty9.InstrumentedHandler.handle(InstrumentedHandler.java:239)
! at io.dropwizard.jetty.RoutingHandler.handle(RoutingHandler.java:52)
! at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:703)
! at io.dropwizard.jetty.BiDiGzipHandler.handle(BiDiGzipHandler.java:67)
! at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:56)
! at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174)
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
! at org.eclipse.jetty.server.Server.handle(Server.java:505)
! at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)
! at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
! at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
! at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
! at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:132)
! at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
! at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
! at java.lang.Thread.run(Thread.java:748)
WARN  [2020-06-10 15:54:54,616] org.grobid.core.utilities.LanguageUtilities: Cannot detect language because of: java.lang.IllegalStateException: Cannot read profiles for cybozu language detection from: /home/sander/grobid-0.6.0/grobid-home/language-detection/cybozu/profiles
WARN  [2020-06-10 15:54:54,623] org.grobid.core.utilities.LanguageUtilities: Cannot detect language because of: java.lang.IllegalStateException: Cannot read profiles for cybozu language detection from: /home/sander/grobid-0.6.0/grobid-home/language-detection/cybozu/profiles
WARN  [2020-06-10 15:54:54,630] org.grobid.core.utilities.LanguageUtilities: Cannot detect language because of: java.lang.IllegalStateException: Cannot read profiles for cybozu language detection from: /home/sander/grobid-0.6.0/grobid-home/language-detection/cybozu/profiles
INFO  [2020-06-10 15:55:13,009] org.grobid.core.factory.GrobidPoolingFactory: Number of Engines in pool active/max: 10/10
ERROR [2020-06-10 15:55:39,884] org.grobid.core.process.ProcessPdfToXml: pdftoxml process finished with error code: 143. [/home/sander/grobid-0.6.0/grobid-home/pdf2xml/lin-64/pdfalto_server, -blocks, -noImageInline, -fullFontName, -noImage, -annotation, -filesLimit, 2000, /home/sander/grobid-0.6.0/grobid-home/tmp/origin5218301327111835505.pdf, /home/sander/grobid-0.6.0/grobid-home/tmp/QhsDuJeCbK.lxml]
ERROR [2020-06-10 15:55:39,885] org.grobid.core.process.ProcessPdfToXml: pdftoxml return message: 

ERROR [2020-06-10 15:55:39,885] org.grobid.service.process.GrobidRestProcessFiles: An unexpected exception occurs. 
! org.grobid.core.exceptions.GrobidException: [TIMEOUT] PDF to XML conversion timed out
! at org.grobid.core.document.DocumentSource.processPdfToXmlServerMode(DocumentSource.java:237)
! at org.grobid.core.document.DocumentSource.pdf2xml(DocumentSource.java:146)
! at org.grobid.core.document.DocumentSource.fromPdf(DocumentSource.java:63)
! at org.grobid.core.engines.FullTextParser.processing(FullTextParser.java:111)
! at org.grobid.core.engines.Engine.fullTextToTEIDoc(Engine.java:489)
! at org.grobid.core.engines.Engine.fullTextToTEI(Engine.java:480)
! at org.grobid.service.process.GrobidRestProcessFiles.processFulltextDocument(GrobidRestProcessFiles.java:179)
! at org.grobid.service.GrobidRestService.processFulltext(GrobidRestService.java:234)
! at org.grobid.service.GrobidRestService.processFulltextDocument_post(GrobidRestService.java:189)
! at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
! at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
! at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
! at java.lang.reflect.Method.invoke(Method.java:498)
! at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144)
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161)
! at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:160)
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99)
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389)
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347)
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102)
! at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326)
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
! at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
! at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
! at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
! at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
! at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
! at io.dropwizard.jetty.NonblockingServletHolder.handle(NonblockingServletHolder.java:49)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1623)
! at io.dropwizard.servlets.ThreadNameFilter.doFilter(ThreadNameFilter.java:35)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at io.dropwizard.jersey.filter.AllowedMethodsFilter.handle(AllowedMethodsFilter.java:45)
! at io.dropwizard.jersey.filter.AllowedMethodsFilter.doFilter(AllowedMethodsFilter.java:39)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at org.eclipse.jetty.servlets.CrossOriginFilter.handle(CrossOriginFilter.java:311)
! at org.eclipse.jetty.servlets.CrossOriginFilter.doFilter(CrossOriginFilter.java:265)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:89)
! at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120)
! at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:135)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
! at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
! at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
! at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
! at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
! at com.codahale.metrics.jetty9.InstrumentedHandler.handle(InstrumentedHandler.java:239)
! at io.dropwizard.jetty.RoutingHandler.handle(RoutingHandler.java:52)
! at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:703)
! at io.dropwizard.jetty.BiDiGzipHandler.handle(BiDiGzipHandler.java:67)
! at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:56)
! at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174)
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
! at org.eclipse.jetty.server.Server.handle(Server.java:505)
! at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)
! at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
! at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
! at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
! at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:132)
! at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
! at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
! at java.lang.Thread.run(Thread.java:748)
INFO  [2020-06-10 15:55:47,926] org.grobid.core.factory.GrobidPoolingFactory: Number of Engines in pool active/max: 10/10
WARN  [2020-06-10 16:10:04,848] org.eclipse.jetty.server.HttpChannel: /api/processFulltextDocument
! java.lang.OutOfMemoryError: GC overhead limit exceeded
! at java.util.regex.Pattern.compile(Pattern.java:1693)
! at java.util.regex.Pattern.<init>(Pattern.java:1352)
! at java.util.regex.Pattern.compile(Pattern.java:1028)
! at java.lang.String.replaceAll(String.java:2223)
! at org.grobid.core.utilities.UnicodeUtil.normaliseText(UnicodeUtil.java:161)
! at org.grobid.core.analyzers.GrobidDefaultAnalyzer.tokenize(GrobidDefaultAnalyzer.java:76)
! at org.grobid.core.analyzers.GrobidDefaultAnalyzer.tokenize(GrobidDefaultAnalyzer.java:71)
! at org.grobid.core.analyzers.GrobidAnalyzer.tokenize(GrobidAnalyzer.java:114)
! at org.grobid.core.lexicon.FastMatcher.loadTerm(FastMatcher.java:187)
! at org.grobid.core.lexicon.FastMatcher.loadTerm(FastMatcher.java:173)
! at org.grobid.core.lexicon.FastMatcher.loadTerms(FastMatcher.java:152)
! at org.grobid.core.lexicon.FastMatcher.loadTerms(FastMatcher.java:130)
! at org.grobid.core.lexicon.FastMatcher.<init>(FastMatcher.java:47)
! at org.grobid.core.lexicon.Lexicon.initLocations(Lexicon.java:484)
! at org.grobid.core.lexicon.Lexicon.tokenPositionsLocationNames(Lexicon.java:834)
! at org.grobid.core.engines.CitationParser.processing(CitationParser.java:98)
! at org.grobid.core.engines.CitationParser.processing(CitationParser.java:77)
! at org.grobid.core.engines.CitationParser.processingReferenceSection(CitationParser.java:210)
! at org.grobid.core.engines.FullTextParser.processing(FullTextParser.java:227)
! at org.grobid.core.engines.FullTextParser.processing(FullTextParser.java:113)
! at org.grobid.core.engines.Engine.fullTextToTEIDoc(Engine.java:489)
! at org.grobid.core.engines.Engine.fullTextToTEI(Engine.java:480)
! at org.grobid.service.process.GrobidRestProcessFiles.processFulltextDocument(GrobidRestProcessFiles.java:179)
! at org.grobid.service.GrobidRestService.processFulltext(GrobidRestService.java:234)
! at org.grobid.service.GrobidRestService.processFulltextDocument_post(GrobidRestService.java:189)
! at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
! at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
! at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
! at java.lang.reflect.Method.invoke(Method.java:498)
! at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144)
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161)
! Causing: org.glassfish.jersey.server.ContainerException: java.lang.OutOfMemoryError: GC overhead limit exceeded
! at org.glassfish.jersey.servlet.internal.ResponseWriter.rethrow(ResponseWriter.java:278)
! at org.glassfish.jersey.servlet.internal.ResponseWriter.failure(ResponseWriter.java:260)
! at org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:509)
! at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:334)
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
! at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
! at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
! at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
! at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
! ... 47 common frames omitted
! Causing: javax.servlet.ServletException: org.glassfish.jersey.server.ContainerException: java.lang.OutOfMemoryError: GC overhead limit exceeded
! at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:489)
! at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
! at io.dropwizard.jetty.NonblockingServletHolder.handle(NonblockingServletHolder.java:49)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1623)
! at io.dropwizard.servlets.ThreadNameFilter.doFilter(ThreadNameFilter.java:35)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at io.dropwizard.jersey.filter.AllowedMethodsFilter.handle(AllowedMethodsFilter.java:45)
! at io.dropwizard.jersey.filter.AllowedMethodsFilter.doFilter(AllowedMethodsFilter.java:39)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at org.eclipse.jetty.servlets.CrossOriginFilter.handle(CrossOriginFilter.java:311)
! at org.eclipse.jetty.servlets.CrossOriginFilter.doFilter(CrossOriginFilter.java:265)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120)
! at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:135)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
! at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
! at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
! at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
! at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
! at com.codahale.metrics.jetty9.InstrumentedHandler.handle(InstrumentedHandler.java:239)
! at io.dropwizard.jetty.RoutingHandler.handle(RoutingHandler.java:52)
! at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:703)
! at io.dropwizard.jetty.BiDiGzipHandler.handle(BiDiGzipHandler.java:67)
! at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:56)
! at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174)
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
! at org.eclipse.jetty.server.Server.handle(Server.java:505)
! at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)
! at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
! at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
! at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
! at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
! at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
! at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
! at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
! at java.lang.Thread.run(Thread.java:748)
INFO  [2020-06-10 16:10:44,750] org.grobid.core.factory.GrobidPoolingFactory: Number of Engines in pool active/max: 10/10
ERROR [2020-06-10 16:13:04,246] org.grobid.core.process.ProcessPdfToXml: pdftoxml process finished with error code: 143. [/home/sander/grobid-0.6.0/grobid-home/pdf2xml/lin-64/pdfalto_server, -blocks, -noImageInline, -fullFontName, -noImage, -annotation, -filesLimit, 2000, /home/sander/grobid-0.6.0/grobid-home/tmp/origin5031980413856160461.pdf, /home/sander/grobid-0.6.0/grobid-home/tmp/ZsTpQM2RMa.lxml]
ERROR [2020-06-10 16:13:04,246] org.grobid.core.process.ProcessPdfToXml: pdftoxml return message: 

ERROR [2020-06-10 16:13:06,668] org.grobid.service.process.GrobidRestProcessFiles: An unexpected exception occurs. 
! org.grobid.core.exceptions.GrobidException: [TIMEOUT] PDF to XML conversion timed out
! at org.grobid.core.document.DocumentSource.processPdfToXmlServerMode(DocumentSource.java:237)
! at org.grobid.core.document.DocumentSource.pdf2xml(DocumentSource.java:146)
! at org.grobid.core.document.DocumentSource.fromPdf(DocumentSource.java:63)
! at org.grobid.core.engines.FullTextParser.processing(FullTextParser.java:111)
! at org.grobid.core.engines.Engine.fullTextToTEIDoc(Engine.java:489)
! at org.grobid.core.engines.Engine.fullTextToTEI(Engine.java:480)
! at org.grobid.service.process.GrobidRestProcessFiles.processFulltextDocument(GrobidRestProcessFiles.java:179)
! at org.grobid.service.GrobidRestService.processFulltext(GrobidRestService.java:234)
! at org.grobid.service.GrobidRestService.processFulltextDocument_post(GrobidRestService.java:189)
! at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
! at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
! at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
! at java.lang.reflect.Method.invoke(Method.java:498)
! at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144)
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161)
! at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:160)
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99)
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389)
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347)
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102)
! at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326)
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
! at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
! at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
! at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
! at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
! at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
! at io.dropwizard.jetty.NonblockingServletHolder.handle(NonblockingServletHolder.java:49)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1623)
! at io.dropwizard.servlets.ThreadNameFilter.doFilter(ThreadNameFilter.java:35)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at io.dropwizard.jersey.filter.AllowedMethodsFilter.handle(AllowedMethodsFilter.java:45)
! at io.dropwizard.jersey.filter.AllowedMethodsFilter.doFilter(AllowedMethodsFilter.java:39)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at org.eclipse.jetty.servlets.CrossOriginFilter.handle(CrossOriginFilter.java:311)
! at org.eclipse.jetty.servlets.CrossOriginFilter.doFilter(CrossOriginFilter.java:265)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:89)
! at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120)
! at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:135)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
! at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
! at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
! at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
! at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
! at com.codahale.metrics.jetty9.InstrumentedHandler.handle(InstrumentedHandler.java:239)
! at io.dropwizard.jetty.RoutingHandler.handle(RoutingHandler.java:52)
! at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:703)
! at io.dropwizard.jetty.BiDiGzipHandler.handle(BiDiGzipHandler.java:67)
! at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:56)
! at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174)
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
! at org.eclipse.jetty.server.Server.handle(Server.java:505)
! at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)
! at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
! at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
! at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
! at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
! at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
! at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
! at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
! at java.lang.Thread.run(Thread.java:748)
INFO  [2020-06-10 16:13:47,026] org.grobid.core.factory.GrobidPoolingFactory: Number of Engines in pool active/max: 10/10
ERROR [2020-06-10 16:17:00,649] org.grobid.core.process.ProcessPdfToXml: pdftoxml process finished with error code: 143. [/home/sander/grobid-0.6.0/grobid-home/pdf2xml/lin-64/pdfalto_server, -blocks, -noImageInline, -fullFontName, -noImage, -annotation, -filesLimit, 2000, /home/sander/grobid-0.6.0/grobid-home/tmp/origin7161378569968805259.pdf, /home/sander/grobid-0.6.0/grobid-home/tmp/drpVcMd6kC.lxml]
ERROR [2020-06-10 16:17:00,649] org.grobid.core.process.ProcessPdfToXml: pdftoxml return message: 

ERROR [2020-06-10 16:17:02,784] org.grobid.service.process.GrobidRestProcessFiles: An unexpected exception occurs. 
! org.grobid.core.exceptions.GrobidException: [TIMEOUT] PDF to XML conversion timed out
! at org.grobid.core.document.DocumentSource.processPdfToXmlServerMode(DocumentSource.java:237)
! at org.grobid.core.document.DocumentSource.pdf2xml(DocumentSource.java:146)
! at org.grobid.core.document.DocumentSource.fromPdf(DocumentSource.java:63)
! at org.grobid.core.engines.FullTextParser.processing(FullTextParser.java:111)
! at org.grobid.core.engines.Engine.fullTextToTEIDoc(Engine.java:489)
! at org.grobid.core.engines.Engine.fullTextToTEI(Engine.java:480)
! at org.grobid.service.process.GrobidRestProcessFiles.processFulltextDocument(GrobidRestProcessFiles.java:179)
! at org.grobid.service.GrobidRestService.processFulltext(GrobidRestService.java:234)
! at org.grobid.service.GrobidRestService.processFulltextDocument_post(GrobidRestService.java:189)
! at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
! at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
! at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
! at java.lang.reflect.Method.invoke(Method.java:498)
! at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144)
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161)
! at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:160)
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99)
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389)
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347)
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102)
! at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326)
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
! at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
! at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
! at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
! at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
! at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
! at io.dropwizard.jetty.NonblockingServletHolder.handle(NonblockingServletHolder.java:49)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1623)
! at io.dropwizard.servlets.ThreadNameFilter.doFilter(ThreadNameFilter.java:35)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at io.dropwizard.jersey.filter.AllowedMethodsFilter.handle(AllowedMethodsFilter.java:45)
! at io.dropwizard.jersey.filter.AllowedMethodsFilter.doFilter(AllowedMethodsFilter.java:39)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at org.eclipse.jetty.servlets.CrossOriginFilter.handle(CrossOriginFilter.java:311)
! at org.eclipse.jetty.servlets.CrossOriginFilter.doFilter(CrossOriginFilter.java:265)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:89)
! at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120)
! at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:135)
! at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
! at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
! at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
! at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
! at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
! at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
! at com.codahale.metrics.jetty9.InstrumentedHandler.handle(InstrumentedHandler.java:239)
! at io.dropwizard.jetty.RoutingHandler.handle(RoutingHandler.java:52)
! at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:703)
! at io.dropwizard.jetty.BiDiGzipHandler.handle(BiDiGzipHandler.java:67)
! at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:56)
! at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174)
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
! at org.eclipse.jetty.server.Server.handle(Server.java:505)
! at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)
! at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
! at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
! at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
! at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
! at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
! at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
! at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:698)
! at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:804)
! at java.lang.Thread.run(Thread.java:748)
INFO  [2020-06-10 16:17:37,765] org.grobid.core.factory.GrobidPoolingFactory: Number of Engines in pool active/max: 10/10
INFO  [2020-06-10 16:22:35,442] org.eclipse.jetty.server.AbstractConnector: Stopped application@68b734a8{HTTP/1.1,[http/1.1]}{0.0.0.0:8070}
INFO  [2020-06-10 16:22:48,605] org.eclipse.jetty.server.AbstractConnector: Stopped admin@1a464fa3{HTTP/1.1,[http/1.1]}{0.0.0.0:8071}
INFO  [2020-06-10 16:22:53,795] org.eclipse.jetty.server.handler.ContextHandler: Stopped i.d.j.MutableServletContextHandler@682618e5{/,null,UNAVAILABLE}
INFO  [2020-06-10 16:23:01,668] org.eclipse.jetty.server.handler.ContextHandler: Stopped i.d.j.MutableServletContextHandler@4b28a7bf{/,null,UNAVAILABLE}
WARN  [2020-06-10 16:23:34,026] org.eclipse.jetty.util.thread.QueuedThreadPool: InstrumentedQueuedThreadPool[dw]@5489b1f7{STOPPING,8<=9<=1024,i=0,r=-1,q=11}[NO_TRY] Couldn't stop Thread[dw-34 - POST /api/processFulltextDocument,5,main]
WARN  [2020-06-10 16:23:34,027] org.eclipse.jetty.util.thread.QueuedThreadPool: InstrumentedQueuedThreadPool[dw]@5489b1f7{STOPPING,8<=9<=1024,i=0,r=-1,q=11}[NO_TRY] Couldn't stop Thread[dw-43 - POST /api/processFulltextDocument,5,main]
WARN  [2020-06-10 16:23:34,027] org.eclipse.jetty.util.thread.QueuedThreadPool: InstrumentedQueuedThreadPool[dw]@5489b1f7{STOPPING,8<=9<=1024,i=0,r=-1,q=11}[NO_TRY] Couldn't stop Thread[dw-36 - POST /api/processFulltextDocument,5,main]
WARN  [2020-06-10 16:23:34,027] org.eclipse.jetty.util.thread.QueuedThreadPool: InstrumentedQueuedThreadPool[dw]@5489b1f7{STOPPING,8<=9<=1024,i=0,r=-1,q=11}[NO_TRY] Couldn't stop Thread[dw-35 - POST /api/processFulltextDocument,5,main]
WARN  [2020-06-10 16:23:34,027] org.eclipse.jetty.util.thread.QueuedThreadPool: InstrumentedQueuedThreadPool[dw]@5489b1f7{STOPPING,8<=9<=1024,i=0,r=-1,q=11}[NO_TRY] Couldn't stop Thread[dw-39 - POST /api/processFulltextDocument,5,main]
WARN  [2020-06-10 16:23:34,027] org.eclipse.jetty.util.thread.QueuedThreadPool: InstrumentedQueuedThreadPool[dw]@5489b1f7{STOPPING,8<=9<=1024,i=0,r=-1,q=11}[NO_TRY] Couldn't stop Thread[dw-38 - POST /api/processFulltextDocument,5,main]
WARN  [2020-06-10 16:23:34,027] org.eclipse.jetty.util.thread.QueuedThreadPool: InstrumentedQueuedThreadPool[dw]@5489b1f7{STOPPING,8<=9<=1024,i=0,r=-1,q=11}[NO_TRY] Couldn't stop Thread[dw-19 - POST /api/processFulltextDocument,5,main]
WARN  [2020-06-10 16:23:34,027] org.eclipse.jetty.util.thread.QueuedThreadPool: InstrumentedQueuedThreadPool[dw]@5489b1f7{STOPPING,8<=9<=1024,i=0,r=-1,q=11}[NO_TRY] Couldn't stop Thread[dw-21 - POST /api/processFulltextDocument,5,main]
WARN  [2020-06-10 16:23:34,028] org.eclipse.jetty.util.thread.QueuedThreadPool: InstrumentedQueuedThreadPool[dw]@5489b1f7{STOPPING,8<=9<=1024,i=0,r=-1,q=11}[NO_TRY] Couldn't stop Thread[dw-23 - POST /api/processFulltextDocument,5,main]
kermitt2 commented 4 years ago

It looks like your JVM is going out-of-memory:

! java.lang.OutOfMemoryError: GC overhead limit exceeded

What's the amount of memory on your machine?
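To see at a glance which failure types dominate in `grobid/logs/grobid-service.log`, a small script can count the markers quoted in this thread. This is only a sketch: the pattern names and the `summarize_log` helper are made up for illustration, and the patterns simply match the message strings shown above. Exit status 143 conventionally corresponds to 128 + 15 (SIGTERM), which would be consistent with the converter being killed when the timeout fires.

```python
import re
from collections import Counter

# Patterns copied from the messages quoted in this thread; the keys are
# illustrative names, not GROBID terminology.
PATTERNS = {
    "timeout": re.compile(r"\[TIMEOUT\]"),
    "bad_input": re.compile(r"\[BAD_INPUT_DATA\]"),
    "out_of_memory": re.compile(r"java\.lang\.OutOfMemoryError"),
    # 143 = 128 + 15, i.e. the process was terminated by SIGTERM
    "pdftoxml_exit_143": re.compile(r"finished with error code: 143"),
}


def summarize_log(lines):
    """Count how many log lines match each known failure marker."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def summarize_log_file(path):
    """Convenience wrapper: read a log file and summarize it."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return summarize_log(handle)
```

Calling `summarize_log_file("grobid/logs/grobid-service.log")` returns a `Counter`; a non-zero `out_of_memory` count points at heap pressure rather than at individual bad PDFs.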

There are various ways to deal with memory constraints in Grobid, the most obvious being:

soostmeijer commented 4 years ago

Thank you for your suggestions,

I have 8GB of RAM on my machine.

It seems to work when I reduce the number of parallel processes to 1, but my computer freezes (sometimes for several minutes) while converting the PDF files.
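For reference, processing one file at a time can also be done without the batch client, by posting each PDF to the service in a plain loop. The following is a rough sketch, not the client's actual code: the URL uses port 8070 and the `/api/processFulltextDocument` path from the log above, the helper names `post_pdf` and `process_sequentially` are made up, the 120-second client-side timeout is an arbitrary choice, and the third-party `requests` package is assumed to be installed.

```python
from pathlib import Path

# Port and path taken from the log above; adjust if the service runs elsewhere.
GROBID_URL = "http://localhost:8070/api/processFulltextDocument"


def post_pdf(pdf_path, url=GROBID_URL, timeout=120):
    """Send one PDF to the service and return (status_code, body)."""
    import requests  # third-party; imported lazily so the loop below works without it

    with open(pdf_path, "rb") as handle:
        # The PDF is sent as multipart form data in a field named "input".
        response = requests.post(url, files={"input": handle}, timeout=timeout)
    return response.status_code, response.text


def process_sequentially(pdf_paths, out_dir, send=post_pdf):
    """Process PDFs one at a time; return a list of (path, reason) failures."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for pdf_path in pdf_paths:
        try:
            status, body = send(pdf_path)
        except Exception as exc:  # e.g. connection error or client-side timeout
            failed.append((str(pdf_path), repr(exc)))
            continue
        if status != 200:
            # A 500 here matches the [TIMEOUT] / [BAD_INPUT_DATA] cases in the log.
            failed.append((str(pdf_path), f"HTTP {status}"))
            continue
        target = out_dir / (Path(pdf_path).stem + ".tei.xml")
        target.write_text(body, encoding="utf-8")
    return failed
```

Because a failing PDF is recorded and skipped, one corrupted or oversized file does not stop the rest of the batch, and the returned list shows which files to inspect by hand.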

kermitt2 commented 4 years ago

With 8GB of memory and only 1 thread for Grobid, your computer should not freeze; this is not normal at all. Freezing likely means swapping. I guess that either you have other processes running that use quite a lot of memory (2-4GB or more, hence the swapping), or your JVM is bound to a maximum amount of memory that is too low and the GC is overloading the JVM.

It's hard to help on this, but you could try the Docker image and see if you still have a similar issue, or test on another machine.
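One way to check the swapping hypothesis on Linux is to read `/proc/meminfo` while a conversion is running. The snippet below is a minimal sketch; the helper names are made up, and it only relies on the standard `MemAvailable`, `SwapTotal` and `SwapFree` fields, which are reported in kB.

```python
def parse_meminfo(text):
    """Parse the content of /proc/meminfo into a dict of integer values."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def swap_in_use_kb(meminfo):
    """Return the amount of swap currently in use, in kB."""
    return meminfo.get("SwapTotal", 0) - meminfo.get("SwapFree", 0)


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live memory statistics (Linux only)."""
    with open(path, encoding="utf-8") as handle:
        return parse_meminfo(handle.read())
```

If `swap_in_use_kb(read_meminfo())` grows steadily while `MemAvailable` stays close to zero during a run, the machine is most likely swapping, which would explain the freezes.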

soostmeijer commented 4 years ago

I have already tried it on another machine, and it works there.

Thank you for your help!