everit-org / json-schema

JSON Schema validator for java, based on the org.json API
Apache License 2.0

GC thrashing observed on collecting validation failures #264

Closed by jklukas 5 years ago

jklukas commented 5 years ago

This is probably not easy to take action on, but I wanted to at least make you aware of the failure mode. I saw a service die from GC thrashing that I eventually traced back to this library attempting to collect a large number of validation failures. It may be that the current approach of calling ValidationException.prepend, which creates a new ValidationException for each failing node, becomes overly expensive in terms of object allocation.
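To make the suspected cost concrete, here is a minimal sketch of the copy-on-prepend pattern that the stack trace suggests. It is not the library's code: `Failure`, `PrependCostSketch`, and `objectsCreated` are hypothetical names, and the model assumes every enclosing object rebuilds the whole failure tree once. The only real API used is the protected four-argument `RuntimeException` constructor from the JDK, which can skip `fillInStackTrace()`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a validation failure; NOT the library's ValidationException. */
final class Failure extends RuntimeException {
    /** Counts every Failure ever constructed, including short-lived copies. */
    static final AtomicLong CREATED = new AtomicLong();

    final String path;
    final List<Failure> causes;
    final boolean captureStack;

    Failure(String path, List<Failure> causes, boolean captureStack) {
        // Protected JDK constructor: writableStackTrace=false skips fillInStackTrace().
        super(path, null, false, captureStack);
        this.path = path;
        this.causes = causes;
        this.captureStack = captureStack;
        CREATED.incrementAndGet();
    }

    /** Copy-on-prepend: rebuilds this failure and all nested causes under a longer path. */
    Failure prepend(String fragment) {
        List<Failure> copies = new ArrayList<>(causes.size());
        for (Failure cause : causes) {
            copies.add(cause.prepend(fragment));
        }
        return new Failure(fragment + "/" + path, copies, captureStack);
    }
}

final class PrependCostSketch {
    /** Bubbles `leaves` failures up through `depth` enclosing objects; returns the allocation count. */
    static long objectsCreated(int depth, int leaves, boolean captureStack) {
        Failure.CREATED.set(0);
        List<Failure> leafFailures = new ArrayList<>(leaves);
        for (int i = 0; i < leaves; i++) {
            leafFailures.add(new Failure("payload/field" + i,
                    Collections.<Failure>emptyList(), captureStack));
        }
        Failure current = new Failure("payload", leafFailures, captureStack);
        for (int level = 0; level < depth; level++) {
            current = current.prepend("level" + level); // one full copy per enclosing object
        }
        return Failure.CREATED.get();
    }

    public static void main(String[] args) {
        // (leaves + 1) * (depth + 1) objects are built; only leaves + 1 remain reachable.
        System.out.println(objectsCreated(10, 1000, false)); // prints 11011
    }
}
```

In this model, allocations grow with the number of failures times the nesting depth, and with `captureStack` set to `true` each of those objects also pays for a stack walk, which is the `fillInStackTrace` frame at the top of the trace below. Whether suppressing stack traces would be acceptable for the real exception class is a design question for the maintainers.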

Unfortunately, I was not able to recover the JSON payload that triggered the behavior, but the service was able to start up cleanly and continue working after I switched on the failEarly() option of the Validator.

Feel free to go ahead and close this if you don't feel like there's any follow-up that can be done without having the JSON document that triggered the behavior.

Originating issue: https://github.com/mozilla/gcp-ingestion/issues/374

Stack trace

Processing stuck in step parsePayload/ParMultiDo(DoFnWithErrors) for at least 05m14s without outputting or completing in state process
  at java.lang.Throwable.fillInStackTrace(Native Method)
  at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
  at java.lang.Throwable.<init>(Throwable.java:265)
  at java.lang.Exception.<init>(Exception.java:66)
  at java.lang.RuntimeException.<init>(RuntimeException.java:62)
  at org.everit.json.schema.ValidationException.<init>(ValidationException.java:264)
  at org.everit.json.schema.ValidationException.<init>(ValidationException.java:288)
  at org.everit.json.schema.ValidationException.prepend(ValidationException.java:398)
  at org.everit.json.schema.ValidationException.prepend(ValidationException.java:378)
  at org.everit.json.schema.ValidationException.lambda$prepend$3(ValidationException.java:396)
  at org.everit.json.schema.ValidationException$$Lambda$412/60386056.apply(Unknown Source)
  at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
  at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1376)
  at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
  at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
  at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
  at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
  at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
  at org.everit.json.schema.ValidationException.prepend(ValidationException.java:397)
  at org.everit.json.schema.ValidationException.prepend(ValidationException.java:378)
  at org.everit.json.schema.ValidationException.lambda$prepend$3(ValidationException.java:396)
  at org.everit.json.schema.ValidationException$$Lambda$412/60386056.apply(Unknown Source)
  at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
  at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1376)
  at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
  at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
  at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
  at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
  at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
  at org.everit.json.schema.ValidationException.prepend(ValidationException.java:397)
  at org.everit.json.schema.ValidationException.prepend(ValidationException.java:378)
  at org.everit.json.schema.ObjectSchemaValidatingVisitor.visitPropertySchema(ObjectSchemaValidatingVisitor.java:159)
  at org.everit.json.schema.Visitor.visitObjectSchema(Visitor.java:131)
  at org.everit.json.schema.ObjectSchemaValidatingVisitor.visitObjectSchema(ObjectSchemaValidatingVisitor.java:35)
  at org.everit.json.schema.ObjectSchema.accept(ObjectSchema.java:266)
  at org.everit.json.schema.ValidatingVisitor.visitObjectSchema(ValidatingVisitor.java:113)
  at org.everit.json.schema.ObjectSchema.accept(ObjectSchema.java:266)
  at org.everit.json.schema.Visitor.visit(Visitor.java:43)
  at org.everit.json.schema.ValidatingVisitor.visit(ValidatingVisitor.java:30)
  at org.everit.json.schema.ValidatingVisitor.lambda$getFailureOfSchema$0(ValidatingVisitor.java:153)
  at org.everit.json.schema.ValidatingVisitor$$Lambda$354/264709280.run(Unknown Source)
  at org.everit.json.schema.ValidationFailureReporter.inContextOfSchema(ValidationFailureReporter.java:34)
  at org.everit.json.schema.CollectingFailureReporter.inContextOfSchema(CollectingFailureReporter.java:25)
  at org.everit.json.schema.ValidatingVisitor.getFailureOfSchema(ValidatingVisitor.java:153)
  at org.everit.json.schema.ObjectSchemaValidatingVisitor.visitPropertySchema(ObjectSchemaValidatingVisitor.java:157)
  at org.everit.json.schema.Visitor.visitObjectSchema(Visitor.java:131)
  at org.everit.json.schema.ObjectSchemaValidatingVisitor.visitObjectSchema(ObjectSchemaValidatingVisitor.java:35)
  at org.everit.json.schema.ObjectSchema.accept(ObjectSchema.java:266)
  at org.everit.json.schema.ValidatingVisitor.visitObjectSchema(ValidatingVisitor.java:113)
  at org.everit.json.schema.ObjectSchema.accept(ObjectSchema.java:266)
  at org.everit.json.schema.Visitor.visit(Visitor.java:43)
  at org.everit.json.schema.ValidatingVisitor.visit(ValidatingVisitor.java:30)
  at org.everit.json.schema.ValidatingVisitor.lambda$getFailureOfSchema$0(ValidatingVisitor.java:153)
  at org.everit.json.schema.ValidatingVisitor$$Lambda$354/264709280.run(Unknown Source)
  at org.everit.json.schema.ValidationFailureReporter.inContextOfSchema(ValidationFailureReporter.java:34)
  at org.everit.json.schema.CollectingFailureReporter.inContextOfSchema(CollectingFailureReporter.java:25)
  at org.everit.json.schema.ValidatingVisitor.getFailureOfSchema(ValidatingVisitor.java:153)
  at org.everit.json.schema.ObjectSchemaValidatingVisitor.visitPropertySchema(ObjectSchemaValidatingVisitor.java:157)
  at org.everit.json.schema.Visitor.visitObjectSchema(Visitor.java:131)
  at org.everit.json.schema.ObjectSchemaValidatingVisitor.visitObjectSchema(ObjectSchemaValidatingVisitor.java:35)
  at org.everit.json.schema.ObjectSchema.accept(ObjectSchema.java:266)
  at org.everit.json.schema.ValidatingVisitor.visitObjectSchema(ValidatingVisitor.java:113)
  at org.everit.json.schema.ObjectSchema.accept(ObjectSchema.java:266)
  at org.everit.json.schema.Visitor.visit(Visitor.java:43)
  at org.everit.json.schema.ValidatingVisitor.visit(ValidatingVisitor.java:30)
  at org.everit.json.schema.ValidatingVisitor.lambda$getFailureOfSchema$0(ValidatingVisitor.java:153)
  at org.everit.json.schema.ValidatingVisitor$$Lambda$354/264709280.run(Unknown Source)
  at org.everit.json.schema.ValidationFailureReporter.inContextOfSchema(ValidationFailureReporter.java:34)
  at org.everit.json.schema.CollectingFailureReporter.inContextOfSchema(CollectingFailureReporter.java:25)
  at org.everit.json.schema.ValidatingVisitor.getFailureOfSchema(ValidatingVisitor.java:153)
  at org.everit.json.schema.ObjectSchemaValidatingVisitor.visitPropertySchema(ObjectSchemaValidatingVisitor.java:157)
  at org.everit.json.schema.Visitor.visitObjectSchema(Visitor.java:131)
  at org.everit.json.schema.ObjectSchemaValidatingVisitor.visitObjectSchema(ObjectSchemaValidatingVisitor.java:35)
  at org.everit.json.schema.ObjectSchema.accept(ObjectSchema.java:266)
  at org.everit.json.schema.ValidatingVisitor.visitObjectSchema(ValidatingVisitor.java:113)
  at org.everit.json.schema.ObjectSchema.accept(ObjectSchema.java:266)
  at org.everit.json.schema.Visitor.visit(Visitor.java:43)
  at org.everit.json.schema.ValidatingVisitor.visit(ValidatingVisitor.java:30)
  at org.everit.json.schema.DefaultValidator.performValidation(Validator.java:53)
  at org.everit.json.schema.Schema.validate(Schema.java:126)
  at com.mozilla.telemetry.decoder.ParsePayload.processElement(ParsePayload.java:146)
  at com.mozilla.telemetry.decoder.ParsePayload.processElement(ParsePayload.java:30)
  at com.mozilla.telemetry.transforms.MapElementsWithErrors$DoFnWithErrors.processElementOrError(MapElementsWithErrors.java:88)
  at com.mozilla.telemetry.transforms.MapElementsWithErrors$DoFnWithErrors$DoFnInvoker.invokeProcessElement(Unknown Source)

erosb commented 5 years ago

Hello @jklukas,

Thanks for getting in touch. When there are "too many" validation failures, the collecting failure mode can crash by its nature. Even if we optimize the failure propagation process, it can still fail; we can only adjust the threshold. This is why the fail-early mode was introduced for production environments.

One thing that may be useful is an "intermediate" mode that collects failures but caps the number collected, so users would have a reporting mode that is more verbose than fail-early but still has a safety net against memory problems.
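A rough sketch of what such a capped collector could look like, written against plain Java rather than the library's internals. `BoundedFailureCollector`, `LimitReached`, and the toy `checkNonNegative` validator are all hypothetical names invented for illustration; nothing here is an existing API of this library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical reporter: collects failures up to a cap, then aborts validation. */
final class BoundedFailureCollector {
    /** Thrown to unwind validation once the cap is reached; carries no stack trace. */
    static final class LimitReached extends RuntimeException {
        LimitReached(int limit) {
            super("stopped after " + limit + " failures", null, false, false);
        }
    }

    private final int maxFailures;
    private final List<String> failures = new ArrayList<>();

    BoundedFailureCollector(int maxFailures) {
        if (maxFailures < 1) {
            throw new IllegalArgumentException("maxFailures must be >= 1");
        }
        this.maxFailures = maxFailures;
    }

    /** Records one failure and aborts as soon as the cap is hit. */
    void report(String failure) {
        failures.add(failure);
        if (failures.size() >= maxFailures) {
            throw new LimitReached(maxFailures);
        }
    }

    List<String> failures() {
        return Collections.unmodifiableList(failures);
    }
}

final class BoundedCollectorDemo {
    /** Toy validator (an assumption for this sketch): every value must be non-negative. */
    static List<String> checkNonNegative(int[] values, int maxFailures) {
        BoundedFailureCollector collector = new BoundedFailureCollector(maxFailures);
        try {
            for (int i = 0; i < values.length; i++) {
                if (values[i] < 0) {
                    collector.report("#/" + i + ": " + values[i] + " is negative");
                }
            }
        } catch (BoundedFailureCollector.LimitReached stop) {
            // Cap reached: keep what was collected so far and stop validating.
        }
        return collector.failures();
    }
}
```

With a cap of 1 this behaves like a fail-early check, and with a very large cap it approaches the collecting mode, so the cap is the threshold being adjusted. Memory use is bounded by the cap rather than by the size of the invalid document.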

jklukas commented 5 years ago

I agree that intermediate mode would be nice, but failEarly is sufficient for my needs. I was going to suggest a documentation update to better advertise failEarly, but reading over the README again, it seems pretty clear.

Thanks for your work on this library!