OxalisCommunity / oxalis

Oxalis - PEPPOL Access Point open source implementation - Core component
Other
121 stars 90 forks source link

Oxalis incorrectly reports communication-level errors as "participant not found" #497

Closed steinarhenriksen closed 2 years ago

steinarhenriksen commented 3 years ago

Oxalis incorrectly reports communication-level errors during SMP lookup as "participant not found".

Frequent errors seen in logs for recipients known to be in SMP: java.lang.IllegalArgumentException: InvoiceJcr [uuid=..., referenceId=...]: recipient ... not registered in SMP: no.difi.oxalis.api.lang.OxalisTransmissionException: Unable to fetch 'http://smp.difi.no/...'

Adding trace logging reveals SocketTimeoutException as the cause for most of the experienced issues reported as "participant not found": 2020-12-24 10:25:49,007 TRACE [com.itpas.hermes.broker.peppol.AccesspointRouteHelper] (commandPoolHigh-thread-5) OrderJcr [referenceId=...]: recipient ... SMP lookup failed: no.difi.oxalis.api.lang.OxalisTransmissionException: Unable to fetch 'http://smp.difi.no/...' at no.difi.oxalis.outbound.lookup.CachedLookupService.lookup(CachedLookupService.java:73) [oxalis-outbound-4.1.3-SNAPSHOT.jar:] at com.itpas.hermes.broker.peppol.AccesspointRouteHelper.validatePeppolHeader(AccesspointRouteHelper.java:714) [broker-34.3.sar:] at com.itpas.hermes.broker.peppol.AccesspointRouteHelper.routeItem(AccesspointRouteHelper.java:305) [broker-34.3.sar:] at com.itpas.hermes.broker.command.route.AccesspointRoute.routeByAp(AccesspointRoute.java:51) [broker-34.3.sar:] at com.itpas.hermes.broker.command.route.AccesspointRoute.routeItem(AccesspointRoute.java:35) [broker-34.3.sar:] at com.itpas.hermes.broker.command.route.RouteContainer.routeSingle(RouteContainer.java:361) [broker-34.3.sar:] at com.itpas.hermes.broker.command.route.RouteContainer.processInstructedRoutes(RouteContainer.java:302) [broker-34.3.sar:] at com.itpas.hermes.broker.command.route.RouteContainer.routeImpl(RouteContainer.java:186) [broker-34.3.sar:] at com.itpas.hermes.broker.command.route.RouteContainer.routeItem(RouteContainer.java:366) [broker-34.3.sar:] at com.itpas.hermes.broker.command.route.RouteContainer.routeItem(RouteContainer.java:413) [broker-34.3.sar:] at com.itpas.hermes.broker.command.impl.TransformCommand$2.parseBatchItem(TransformCommand.java:224) [broker-34.3.sar:] at com.itpas.hermes.broker.command.impl.TransformCommand$2.parseBatchItem(TransformCommand.java:133) [broker-34.3.sar:] at com.itpas.hermes.broker.command.impl.BatchParserCommand$2.processBean(BatchParserCommand.java:219) [broker-34.3.sar:] at com.itpas.hermes.transform.bean.BeanHandler.visitAfter(BeanHandler.java:86) [transform-34.3.jar:] at org.milyn.delivery.sax.SAXHandler.visitAfter(SAXHandler.java:394) [milyn-smooks-core-1.7.0.jar:] at org.milyn.delivery.sax.SAXHandler.endElement(SAXHandler.java:204) [milyn-smooks-core-1.7.0.jar:] at org.milyn.delivery.SmooksContentHandler.endElement(SmooksContentHandler.java:96) [milyn-smooks-core-1.7.0.jar:] at org.apache.xerces.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:598) at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanEndElement(XMLNSDocumentScannerImpl.java:673) at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1645) at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:324) at org.apache.xerces.parsers.XML11Configuration.parse(XML11Configuration.java:875) at org.apache.xerces.parsers.XML11Configuration.parse(XML11Configuration.java:798) at org.apache.xerces.parsers.XMLParser.parse(XMLParser.java:108) at org.apache.xerces.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1198) at __redirected.__XMLReaderFactory.parse(__XMLReaderFactory.java:176) [jboss-modules.jar:1.5.2.Final] at org.milyn.delivery.sax.SAXParser.parse(SAXParser.java:76) [milyn-smooks-core-1.7.0.jar:] at org.milyn.delivery.sax.SmooksSAXFilter.doFilter(SmooksSAXFilter.java:86) [milyn-smooks-core-1.7.0.jar:] at org.milyn.delivery.sax.SmooksSAXFilter.doFilter(SmooksSAXFilter.java:64) [milyn-smooks-core-1.7.0.jar:] at org.milyn.Smooks._filter(Smooks.java:526) [milyn-smooks-core-1.7.0.jar:] at org.milyn.Smooks.filterSource(Smooks.java:482) [milyn-smooks-core-1.7.0.jar:] at com.itpas.hermes.transform.bean.InTransformer.toBean(InTransformer.java:129) [transform-34.3.jar:] at com.itpas.hermes.transform.bean.InTransformer.toBean(InTransformer.java:117) [transform-34.3.jar:] at com.itpas.hermes.broker.command.impl.BatchParserCommand.parseBatch(BatchParserCommand.java:171) [broker-34.3.sar:] at com.itpas.hermes.broker.command.impl.TransformCommand.execute(TransformCommand.java:133) [broker-34.3.sar:] at com.itpas.hermes.broker.command.impl.TransformCommand.execute(TransformCommand.java:48) [broker-34.3.sar:] at com.itpas.hermes.broker.command.noncamel.NonCamelCommandProcessor.executeSyncImpl(NonCamelCommandProcessor.java:401) [broker-34.3.sar:] at com.itpas.hermes.broker.command.noncamel.NonCamelCommandProcessor.execute(NonCamelCommandProcessor.java:317) [broker-34.3.sar:] at com.itpas.hermes.broker.command.noncamel.NonCamelCommandProcessor.executeSync(NonCamelCommandProcessor.java:199) [broker-34.3.sar:] at com.itpas.hermes.broker.command.noncamel.NonCamelCommandProcessor.process(NonCamelCommandProcessor.java:432) [broker-34.3.sar:] at com.itpas.hermes.broker.command.noncamel.NonCamelCommandProcessor.onMessage(NonCamelCommandProcessor.java:453) [broker-34.3.sar:] at com.itpas.hermes.broker.command.noncamel.ConcurrentJmsListenerContainer$ReceiveCallback.processMessage(ConcurrentJmsListenerContainer.java:890) [broker-34.3.sar:] at com.itpas.hermes.broker.command.noncamel.ConcurrentJmsListenerContainer$ReceiveCallback.receiveOutsideJms(ConcurrentJmsListenerContainer.java:814) [broker-34.3.sar:] at com.itpas.hermes.broker.command.noncamel.ConcurrentJmsListenerContainer$JmsListenerTask.run(ConcurrentJmsListenerContainer.java:974) [broker-34.3.sar:] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_181] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_181] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_181] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_181] at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_181] Caused by: no.difi.vefa.peppol.lookup.api.LookupException: Unable to fetch 'http://smp.difi.no/...' at no.difi.vefa.peppol.lookup.fetcher.BasicApacheFetcher.fetch(BasicApacheFetcher.java:75) [peppol-lookup-1.1.4.jar:] at no.difi.vefa.peppol.lookup.LookupClient.getServiceMetadata(LookupClient.java:85) [peppol-lookup-1.1.4.jar:] at no.difi.vefa.peppol.lookup.LookupClient.getEndpoint(LookupClient.java:114) [peppol-lookup-1.1.4.jar:] at no.difi.oxalis.outbound.lookup.CachedLookupService.load(CachedLookupService.java:79) [oxalis-outbound-4.1.3-SNAPSHOT.jar:] at no.difi.oxalis.outbound.lookup.CachedLookupService.load(CachedLookupService.java:46) [oxalis-outbound-4.1.3-SNAPSHOT.jar:] at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3529) [guava-25.0-jre.jar:] at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2278) [guava-25.0-jre.jar:] at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2156) [guava-25.0-jre.jar:] at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2046) [guava-25.0-jre.jar:] at com.google.common.cache.LocalCache.get(LocalCache.java:3948) [guava-25.0-jre.jar:] at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3972) [guava-25.0-jre.jar:] at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4957) [guava-25.0-jre.jar:] at no.difi.oxalis.outbound.lookup.CachedLookupService.lookup(CachedLookupService.java:71) [oxalis-outbound-4.1.3-SNAPSHOT.jar:] ... 48 more Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) [rt.jar:1.8.0_181] at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) [rt.jar:1.8.0_181] at java.net.SocketInputStream.read(SocketInputStream.java:171) [rt.jar:1.8.0_181] at java.net.SocketInputStream.read(SocketInputStream.java:141) [rt.jar:1.8.0_181] at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) [httpcore-4.4.12.jar:4.4.12] at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153) [httpcore-4.4.12.jar:4.4.12] at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280) [httpcore-4.4.12.jar:4.4.12] at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138) [httpclient-4.5.10.jar:4.5.10] at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) [httpclient-4.5.10.jar:4.5.10] at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) [httpcore-4.4.12.jar:4.4.12] at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) [httpcore-4.4.12.jar:4.4.12] at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157) [httpclient-4.5.10.jar:4.5.10] at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273) [httpcore-4.4.12.jar:4.4.12] at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125) [httpcore-4.4.12.jar:4.4.12] at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) [httpclient-4.5.10.jar:4.5.10] at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) [httpclient-4.5.10.jar:4.5.10] at io.opentracing.contrib.apache.http.client.TracingClientExec.handleNetworkProcessing(TracingClientExec.java:140) [opentracing-apache-httpclient-0.2.0.jar:] at io.opentracing.contrib.apache.http.client.TracingClientExec.execute(TracingClientExec.java:82) [opentracing-apache-httpclient-0.2.0.jar:] at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) [httpclient-4.5.10.jar:4.5.10] at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) [httpclient-4.5.10.jar:4.5.10] at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) [httpclient-4.5.10.jar:4.5.10] at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) [httpclient-4.5.10.jar:4.5.10] at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108) [httpclient-4.5.10.jar:4.5.10] at no.difi.vefa.peppol.lookup.fetcher.BasicApacheFetcher.fetch(BasicApacheFetcher.java:58) [peppol-lookup-1.1.4.jar:] ... 60 more

This makes users believe that participants are not present in SMP when they in reality are.

This also makes SMPs appear unstable: participants the users know exists in the SMP are sometimes (in the event of communication errors) apparently not found in the SMP.

Another consequence is that this makes it difficult to handle communication problems by using a retry - because communication problems now are bundled together with "participant not found" which should probably not cause a retry.

The consequences of this problem are amplified by non-ideal TCP-/HTTP-connection pooling where connections are not checked before reuse - which cause frequent problems related to TCP timeout. This is especially true when running against cloud-hosted SMPs since cloud vendors typically have strict timeout configurations. See related PR https://github.com/OxalisCommunity/oxalis/pull/493 .

ubrix commented 3 years ago

I think this is related to and might even be fixed in later vefa.peppol.lookup: https://github.com/OxalisCommunity/vefa-peppol/pull/31

aaron-kumar commented 3 years ago

PR #493 added via commit e6ae52f8c0f3c311f9c959e17ae07767b77f9917 and PR OxalisCommunity/vefa-peppol#31 previously added by commit https://github.com/OxalisCommunity/vefa-peppol/commit/85ea27c96b8cc4db7ca633ae00fbc617d6f678ce. Will be available in 5.0.0.RC2