snowplow / enrich

Snowplow Enrichment jobs and library
https://snowplowanalytics.com

common: add error handling to YAUAA enrichment #732

Open benjben opened 1 year ago

benjben commented 1 year ago

If UserAgentAnalyzer.parse() throws an exception, the YAUAA enrichment does not catch it, so a generic error bad row is emitted instead of an enrichment failure.

Example exception:

java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1
    at nl.basjes.parse.useragent.clienthints.ClientHintsAnalyzer.improveLayoutEngineAndAgentInfo(ClientHintsAnalyzer.java:325)
    at nl.basjes.parse.useragent.clienthints.ClientHintsAnalyzer.merge(ClientHintsAnalyzer.java:151)
    at nl.basjes.parse.useragent.AbstractUserAgentAnalyzerDirect.parse(AbstractUserAgentAnalyzerDirect.java:212)
    at nl.basjes.parse.useragent.AbstractUserAgentAnalyzer.lambda$parse$0(AbstractUserAgentAnalyzer.java:234)
    at com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$13(BoundedLocalCache.java:2550)
    at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1908)
    at com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2548)
    at com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2531)
    at com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:110)
    at nl.basjes.parse.useragent.AbstractUserAgentAnalyzer.parse(AbstractUserAgentAnalyzer.java:234)
    at nl.basjes.parse.useragent.AbstractUserAgentAnalyzerDirect.parse(AbstractUserAgentAnalyzerDirect.java:188)
    at com.snowplowanalytics.snowplow.enrich.common.enrichments.registry.YauaaEnrichment.analyzeUserAgent(YauaaEnrichment.scala:113)
    at com.snowplowanalytics.snowplow.enrich.common.enrichments.registry.YauaaEnrichment.getYauaaContext(YauaaEnrichment.scala:95)
    at com.snowplowanalytics.snowplow.enrich.common.enrichments.EnrichmentManager$.$anonfun$getYauaaContext$2(EnrichmentManager.scala:699)
    at scala.Option.map(Option.scala:230)
    at com.snowplowanalytics.snowplow.enrich.common.enrichments.EnrichmentManager$.$anonfun$getYauaaContext$1(EnrichmentManager.scala:699)
    at com.snowplowanalytics.snowplow.enrich.common.enrichments.EnrichmentManager$EStateT$.$anonfun$fromEither$1(EnrichmentManager.scala:150)
    at com.snowplowanalytics.snowplow.enrich.common.enrichments.EnrichmentManager$EStateT$.$anonfun$fromEitherF$1(EnrichmentManager.scala:158)
    at scala.Function1.$anonfun$andThen$1(Function1.scala:57)
    at cats.data.AndThen.loop$1(AndThen.scala:107)
    at cats.data.AndThen.runLoop(AndThen.scala:116)
    at cats.data.AndThen.apply(AndThen.scala:68)
    at cats.data.IndexedStateT.$anonfun$run$1(IndexedStateT.scala:64)
    at flatMap @ com.snowplowanalytics.snowplow.enrich.common.enrichments.EnrichmentManager$.accState(EnrichmentManager.scala:225)
    at runS @ com.snowplowanalytics.snowplow.enrich.common.enrichments.EnrichmentManager$.runEnrichments(EnrichmentManager.scala:118)
    at map @ com.snowplowanalytics.snowplow.enrich.common.enrichments.EnrichmentManager$EStateT$.$anonfun$fromEitherF$1(EnrichmentManager.scala:158)
    at apply @ org.http4s.client.blaze.Http1Connection.parsePrelude(Http1Connection.scala:390)
    at flatMap @ com.snowplowanalytics.snowplow.enrich.common.enrichments.EnrichmentManager$.accState(EnrichmentManager.scala:225)
    at runS @ com.snowplowanalytics.snowplow.enrich.common.enrichments.EnrichmentManager$.runEnrichments(EnrichmentManager.scala:118)
    at map @ com.snowplowanalytics.snowplow.enrich.common.enrichments.EnrichmentManager$EStateT$.$anonfun$fromEitherF$1(EnrichmentManager.scala:158)
    at apply @ org.http4s.client.blaze.Http1Connection.parsePrelude(Http1Connection.scala:390)
    at flatMap @ com.snowplowanalytics.snowplow.enrich.common.enrichments.EnrichmentManager$.accState(EnrichmentManager.scala:225)
    at runS @ com.snowplowanalytics.snowplow.enrich.common.enrichments.EnrichmentManager$.runEnrichments(EnrichmentManager.scala:118)
    at map @ com.snowplowanalytics.snowplow.enrich.common.enrichments.EnrichmentManager$EStateT$.$anonfun$fromEitherF$1(EnrichmentManager.scala:158)
    at apply @ org.http4s.client.blaze.Http1Connection.parsePrelude(Http1Connection.scala:390)
    at flatMap @ com.snowplowanalytics.snowplow.enrich.common.enrichments.EnrichmentManager$.accState(EnrichmentManager.scala:225)
    at runS @ com.snowplowanalytics.snowplow.enrich.common.enrichments.EnrichmentManager$.runEnrichments(EnrichmentManager.scala:118)
    at map @ com.snowplowanalytics.snowplow.enrich.common.enrichments.EnrichmentManager$EStateT$.$anonfun$fromEitherF$1(EnrichmentManager.scala:158)
    at apply @ org.http4s.client.blaze.Http1Connection.parsePrelude(Http1Connection.scala:390)

This specific error was fixed in a newer version of YAUAA, but we should still add error handling and emit an enrichment failure bad row if the enrichment fails. That would make it possible to reproduce the error, which we cannot do at the moment because we don't know the faulty value.
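
A minimal sketch of the kind of error handling proposed here, under stated assumptions: `EnrichmentFailure`, `SafeYauaa` and `analyze` are hypothetical names used only for illustration, and `parse` stands in for the call to UserAgentAnalyzer.parse(). The actual Snowplow bad row types and the YAUAA API are not reproduced. The point is that the failure keeps the input user agent, so the error can be reproduced later.

```scala
import scala.util.control.NonFatal

// Hypothetical stand-in for the enrichment failure carried by a bad row.
// The real Snowplow types are not shown in this issue.
final case class EnrichmentFailure(enrichment: String, input: String, message: String)

object SafeYauaa {
  // `parse` is an assumption: any function from a user agent string to parsed
  // fields, standing in for UserAgentAnalyzer.parse().
  def analyze(parse: String => Map[String, String])(
    userAgent: String
  ): Either[EnrichmentFailure, Map[String, String]] =
    try Right(parse(userAgent))
    catch {
      // Catch non-fatal exceptions only, and record the faulty value
      // together with the exception class and message.
      case NonFatal(e) =>
        Left(EnrichmentFailure("yauaa", userAgent, s"${e.getClass.getName}: ${e.getMessage}"))
    }
}
```

With this shape, a parser that throws an ArrayIndexOutOfBoundsException yields a `Left` containing the offending user agent rather than propagating the exception up to the generic error handler.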

nielsbasjes commented 1 year ago

I also recommend updating to the latest version: https://github.com/snowplow/enrich/pull/733