snowplow / snowplow-mini

An easily-deployable, single-instance version of Snowplow

Fix BigInt binary incompatibility #177

Open chuwy opened 6 years ago

chuwy commented 6 years ago

I'm actually 99% sure that the problem lies not in Snowplow Mini (probably in Stream Enrich), but this is the only place where we can easily test and reproduce it, so I'm leaving it here.

When a JSON Schema has a very big number as its maximum (e.g. 10000000000000000000000000000000000000000000000000), our users get the following error:

Unexpected error processing events: com.google.common.util.concurrent.ExecutionError: com.google.common.util.concurrent.ExecutionError: java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.node.JsonNodeFactory.numberNode(Ljava/math/BigInteger;)Lcom/fasterxml/jackson/databind/node/NumericNode;
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2205)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3953)
    at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3957)
    at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4875)
    at com.github.fge.jsonschema.core.processing.CachingProcessor.process(CachingProcessor.java:109)
    at com.github.fge.jsonschema.processors.validation.ValidationProcessor.process(ValidationProcessor.java:84)
    at com.github.fge.jsonschema.processors.validation.ValidationProcessor.processObject(ValidationProcessor.java:180)
    at com.github.fge.jsonschema.processors.validation.ValidationProcessor.process(ValidationProcessor.java:122)
    at com.github.fge.jsonschema.processors.validation.ValidationProcessor.process(ValidationProcessor.java:49)
    at com.github.fge.jsonschema.core.processing.ProcessingResult.of(ProcessingResult.java:79)
    at com.github.fge.jsonschema.core.processing.ProcessingResult.uncheckedResult(ProcessingResult.java:100)
    at com.github.fge.jsonschema.main.JsonValidator.validateUnchecked(JsonValidator.java:150)
    at com.github.fge.jsonschema.main.JsonValidator.validateUnchecked(JsonValidator.java:171)
    at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$.validateAgainstSchema(ValidatableJsonMethods.scala:42)
    at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$validateAndIdentifySchema$2$$anonfun$apply$3$$anonfun$apply$4.apply(ValidatableJsonMethods.scala:75)
    at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$validateAndIdentifySchema$2$$anonfun$apply$3$$anonfun$apply$4.apply(ValidatableJsonMethods.scala:74)
    at scalaz.Validation$class.flatMap(Validation.scala:139)
    at scalaz.Success.flatMap(Validation.scala:345)
    at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$validateAndIdentifySchema$2$$anonfun$apply$3.apply(ValidatableJsonMethods.scala:74)
    at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$validateAndIdentifySchema$2$$anonfun$apply$3.apply(ValidatableJsonMethods.scala:73)
    at scalaz.Validation$class.flatMap(Validation.scala:139)
    at scalaz.Success.flatMap(Validation.scala:345)
    at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$validateAndIdentifySchema$2.apply(ValidatableJsonMethods.scala:73)
    at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$validateAndIdentifySchema$2.apply(ValidatableJsonMethods.scala:70)
    at scalaz.Validation$class.flatMap(Validation.scala:139)
    at scalaz.Success.flatMap(Validation.scala:345)
    at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$.validateAndIdentifySchema(ValidatableJsonMethods.scala:70)
    at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$.validateAndIdentifySchema(ValidatableJsonMethods.scala:36)
    at com.snowplowanalytics.iglu.client.validation.Validatable$ValidatableOps.validateAndIdentifySchema(Validatable.scala:144)
    at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$$anonfun$4$$anonfun$apply$6.apply(Shredder.scala:202)
    at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$$anonfun$4$$anonfun$apply$6.apply(Shredder.scala:202)
    at scala.collection.immutable.List.map(List.scala:284)
    at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$$anonfun$4.apply(Shredder.scala:202)
    at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$$anonfun$4.apply(Shredder.scala:201)
    at scalaz.Validation$class.map(Validation.scala:112)
    at scalaz.Success.map(Validation.scala:345)
    at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$.validate(Shredder.scala:201)
    at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$.extractAndValidateContexts(Shredder.scala:151)
    at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$.extractAndValidateCustomContexts(Shredder.scala:127)
    at com.snowplowanalytics.snowplow.enrich.common.enrichments.EnrichmentManager$.enrichEvent(EnrichmentManager.scala:426)
    at com.snowplowanalytics.snowplow.enrich.common.EtlPipeline$$anonfun$1$$anonfun$apply$1$$anonfun$apply$2$$anonfun$apply$3.apply(EtlPipeline.scala:92)
    at com.snowplowanalytics.snowplow.enrich.common.EtlPipeline$$anonfun$1$$anonfun$apply$1$$anonfun$apply$2$$anonfun$apply$3.apply(EtlPipeline.scala:91)
    at scalaz.NonEmptyList$class.map(NonEmptyList.scala:23)
    at scalaz.NonEmptyListFunctions$$anon$4.map(NonEmptyList.scala:207)
    at com.snowplowanalytics.snowplow.enrich.common.EtlPipeline$$anonfun$1$$anonfun$apply$1$$anonfun$apply$2.apply(EtlPipeline.scala:91)
    at com.snowplowanalytics.snowplow.enrich.common.EtlPipeline$$anonfun$1$$anonfun$apply$1$$anonfun$apply$2.apply(EtlPipeline.scala:88)
    at scalaz.Validation$class.map(Validation.scala:112)
    at scalaz.Success.map(Validation.scala:345)
    at com.snowplowanalytics.snowplow.enrich.common.EtlPipeline$$anonfun$1$$anonfun$apply$1.apply(EtlPipeline.scala:88)
    at com.snowplowanalytics.snowplow.enrich.common.EtlPipeline$$anonfun$1$$anonfun$apply$1.apply(EtlPipeline.scala:85)
    at scala.Option.map(Option.scala:146)
    at com.snowplowanalytics.snowplow.enrich.common.EtlPipeline$$anonfun$1.apply(EtlPipeline.scala:85)
    at com.snowplowanalytics.snowplow.enrich.common.EtlPipeline$$anonfun$1.apply(EtlPipeline.scala:82)
    at scalaz.Validation$class.map(Validation.scala:112)
    at scalaz.Success.map(Validation.scala:345)
    at com.snowplowanalytics.snowplow.enrich.common.EtlPipeline$.processEvents(EtlPipeline.scala:82)
    at com.snowplowanalytics.snowplow.enrich.stream.sources.Source.enrichEvents(Source.scala:137)
    at com.snowplowanalytics.snowplow.enrich.stream.sources.Source$$anonfun$5.apply(Source.scala:162)
    at com.snowplowanalytics.snowplow.enrich.stream.sources.Source$$anonfun$5.apply(Source.scala:162)
    at scala.collection.immutable.List.flatMap(List.scala:338)
    at com.snowplowanalytics.snowplow.enrich.stream.sources.Source.enrichAndStoreEvents(Source.scala:162)
    at com.snowplowanalytics.snowplow.enrich.stream.sources.NsqSource$$anon$3.message(NsqSource.scala:81)
    at com.snowplowanalytics.client.nsq.NSQConsumer.lambda$processMessage$1(NSQConsumer.java:99)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: com.google.common.util.concurrent.ExecutionError: java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.node.JsonNodeFactory.numberNode(Ljava/math/BigInteger;)Lcom/fasterxml/jackson/databind/node/NumericNode;
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2205)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3953)
    at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3957)
    at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4875)
    at com.github.fge.jsonschema.core.processing.CachingProcessor.process(CachingProcessor.java:109)
    at com.github.fge.jsonschema.processors.validation.ValidationChain.process(ValidationChain.java:114)
    at com.github.fge.jsonschema.processors.validation.ValidationChain.process(ValidationChain.java:56)
    at com.github.fge.jsonschema.core.processing.ProcessorMap$Mapper.process(ProcessorMap.java:166)
    at com.github.fge.jsonschema.core.processing.ProcessingResult.of(ProcessingResult.java:79)
    at com.github.fge.jsonschema.core.processing.CachingProcessor$1.load(CachingProcessor.java:128)
    at com.github.fge.jsonschema.core.processing.CachingProcessor$1.load(CachingProcessor.java:120)
    at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3542)
    at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2323)
    at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2286)
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2201)
    ... 65 more
Caused by: java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.node.JsonNodeFactory.numberNode(Ljava/math/BigInteger;)Lcom/fasterxml/jackson/databind/node/NumericNode;
    at com.github.fge.jsonschema.keyword.digest.helpers.NumericDigester.digestedNumberNode(NumericDigester.java:77)
    at com.github.fge.jsonschema.keyword.digest.common.MaximumDigester.digest(MaximumDigester.java:50)
    at com.github.fge.jsonschema.processors.digest.SchemaDigester.buildDigests(SchemaDigester.java:96)
    at com.github.fge.jsonschema.processors.digest.SchemaDigester.process(SchemaDigester.java:82)
    at com.github.fge.jsonschema.processors.digest.SchemaDigester.process(SchemaDigester.java:47)
    at com.github.fge.jsonschema.core.processing.ProcessorChain$ProcessorMerger.process(ProcessorChain.java:189)
    at com.github.fge.jsonschema.core.processing.ProcessorChain$ProcessorMerger.process(ProcessorChain.java:189)
    at com.github.fge.jsonschema.core.processing.ProcessingResult.of(ProcessingResult.java:79)
    at com.github.fge.jsonschema.core.processing.CachingProcessor$1.load(CachingProcessor.java:128)
    at com.github.fge.jsonschema.core.processing.CachingProcessor$1.load(CachingProcessor.java:120)
    at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3542)
    at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2323)
    at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2286)
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2201)
    ... 79 more

From internal conversation:

This seems to be related to #3744. In that commit we pinned jackson-databind to 2.9.3, but only for Kinesis. Snowplow Mini uses PubSub and therefore in its dependencies we have:

com.fasterxml.jackson.core:jackson-databind:2.6.7

However:

  • In 2.6.7 we have numberNode(v: BigInteger): NumericNode
  • In 2.9.3 we have numberNode(v: BigInteger): ValueNode
    • Clearly these two are binary-incompatible, but the NoSuchMethodError complains about a missing numberNode(v: BigInteger): NumericNode, which is the 2.6.7 signature
    • Which might mean that the problem is somewhere else
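
To illustrate why a changed return type alone is enough to trigger NoSuchMethodError, here is a minimal sketch. The JVM links a call site by method name plus full descriptor, and the descriptor includes the return type. The nested ValueNode and NumericNode classes below are hypothetical stand-ins for the Jackson types (no Jackson dependency is needed to run it), and DescriptorDemo is just an illustrative name.

```java
import java.lang.invoke.MethodType;
import java.math.BigInteger;

public class DescriptorDemo {
    // Hypothetical stand-ins for the Jackson node classes; not the real types.
    public static class ValueNode {}
    public static class NumericNode extends ValueNode {}

    // Builds the JVM descriptor of a method that takes a BigInteger
    // and returns the given type.
    public static String descriptor(Class<?> returnType) {
        return MethodType.methodType(returnType, BigInteger.class)
                .toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // What a caller compiled against the older signature asks for.
        String compiledAgainst = descriptor(NumericNode.class);
        // What a class with the newer signature actually declares.
        String declared = descriptor(ValueNode.class);

        System.out.println("call site expects: " + compiledAgainst);
        System.out.println("class declares:    " + declared);
        // Same name and same parameters, yet the descriptors differ,
        // so the JVM cannot resolve the call.
        System.out.println("descriptors match: " + compiledAgainst.equals(declared));
    }
}
```

Read this way, an error that names the NumericNode descriptor would suggest that the caller was compiled against the older signature while a jar with the newer one was loaded at runtime. That is only an interpretation of the stack trace, not something confirmed here.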

Snowplow Mini 0.4.0 (Stream Enrich 0.12.0) worked fine with these big integers.

chuwy commented 6 years ago

Actually, I just found out that we already have a possible fix: https://github.com/snowplow/snowplow/issues/3767

BenFradet commented 6 years ago

this is just a qa check with #174

BenFradet commented 6 years ago

were you able to check this @oguzhanunlu ?

oguzhanunlu commented 6 years ago

No I haven't tested this @BenFradet

chuwy commented 6 years ago

Could you please do it, @oguzhanunlu, before starting the Iglu sprint? Just close this if the test is successful.

oguzhanunlu commented 6 years ago

Hey @BenFradet @chuwy , I checked and we get this error for 10000000000000000000000000000000000000000000000000.

chuwy commented 6 years ago

That is quite sad. I think we'll have to prioritize the JSON Schema validator refactoring in Iglu Client; I believe the problem is there.

oguzhanunlu commented 6 years ago

Since this isn't directly related to Snowplow Mini, I removed this issue from the release milestone.

benjben commented 4 years ago

The last version to contain

public NumericNode numberNode(BigInteger v)

is 2.8.11.6. From 2.9.0 this function becomes

public ValueNode numberNode(BigInteger v)

So we do have a binary incompatibility in the Stream Enrich jar, probably related to https://github.com/snowplow/snowplow/issues/3744.
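
One way to confirm which variant actually ends up on the runtime classpath is to ask the JVM directly. The following is a rough sketch, not part of Stream Enrich: ClasspathProbe and its method names are made up for illustration, and it only reports something useful for Jackson when run with the same classpath as the assembled jar.

```java
import java.lang.reflect.Method;
import java.math.BigInteger;
import java.security.CodeSource;

public class ClasspathProbe {
    // Name of the return type of className.methodName(paramType), as loaded at runtime.
    public static String returnTypeOf(String className, String methodName, Class<?> paramType) {
        try {
            Method m = Class.forName(className).getMethod(methodName, paramType);
            return m.getReturnType().getName();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot resolve " + className + "." + methodName, e);
        }
    }

    // Location (usually a jar) the class was loaded from; JDK classes have no code source.
    public static String originOf(String className) {
        try {
            CodeSource src = Class.forName(className).getProtectionDomain().getCodeSource();
            return src == null ? "(no code source, likely a JDK class)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class not on classpath: " + className, e);
        }
    }

    public static void main(String[] args) {
        String cls = "com.fasterxml.jackson.databind.node.JsonNodeFactory";
        try {
            System.out.println("loaded from: " + originOf(cls));
            System.out.println("numberNode(BigInteger) returns: "
                    + returnTypeOf(cls, "numberNode", BigInteger.class));
        } catch (IllegalStateException e) {
            // Happens when jackson-databind is not on the classpath of this run.
            System.out.println(e.getMessage());
        }
    }
}
```

If this prints a ValueNode return type while the validator in the stack trace asks for NumericNode, that would match the incompatibility described above.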