jcustenborder / kafka-connect-json-schema

Apache License 2.0

NPE when using schema with nested array #17

Open pkleindl opened 2 years ago

pkleindl commented 2 years ago

Hi

The SMT works fine when I exclude the nested array from the definition, but when I add it I get the exception below. I am using the schema inline with validation set to false:

"transforms.fromJson.type": "com.github.jcustenborder.kafka.connect.json.FromJson$Value",
"transforms.fromJson.json.schema.location": "Inline",
"json.schema.validation.enabled": false,
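Laid out as full connector properties, the transform section would look roughly like this. Only the three properties above are verbatim from the failing config; the `transforms` list entry and the `json.schema.inline` key carrying the schema itself are assumed names, so check them against the connector docs:

```json
{
  "transforms": "fromJson",
  "transforms.fromJson.type": "com.github.jcustenborder.kafka.connect.json.FromJson$Value",
  "transforms.fromJson.json.schema.location": "Inline",
  "transforms.fromJson.json.schema.inline": "{ ...the schema below, as an escaped string... }",
  "json.schema.validation.enabled": false
}
```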

Schema (simplified):

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "device": {
      "type": "string"
    },
    "BaseStations": {
      "type": "array",
      "items": [
        {
          "type": "string"
        }
      ]
    }
  }
}

Exception:

Caused by: java.lang.NullPointerException
	at com.github.jcustenborder.kafka.connect.json.FromJsonSchemaConverterFactory.fromJSON(FromJsonSchemaConverterFactory.java:97)
	at com.github.jcustenborder.kafka.connect.json.FromJsonSchemaConverterFactory.fromJSON(FromJsonSchemaConverterFactory.java:68)
	at com.github.jcustenborder.kafka.connect.json.FromJsonSchemaConverter$ArraySchemaConverter.schemaBuilder(FromJsonSchemaConverter.java:381)
	at com.github.jcustenborder.kafka.connect.json.FromJsonSchemaConverter$ArraySchemaConverter.schemaBuilder(FromJsonSchemaConverter.java:373)
	at com.github.jcustenborder.kafka.connect.json.FromJsonSchemaConverterFactory.fromJSON(FromJsonSchemaConverterFactory.java:114)
	at com.github.jcustenborder.kafka.connect.json.FromJsonSchemaConverter$ObjectSchemaConverter.lambda$fromJSON$2(FromJsonSchemaConverter.java:135)
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
	at java.base/java.util.stream.SortedOps$RefSortingSink.end(SortedOps.java:395)
	at java.base/java.util.stream.Sink$ChainedReference.end(Sink.java:258)
	at java.base/java.util.stream.Sink$ChainedReference.end(Sink.java:258)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:485)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
	at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
	at com.github.jcustenborder.kafka.connect.json.FromJsonSchemaConverter$ObjectSchemaConverter.fromJSON(FromJsonSchemaConverter.java:130)
	at com.github.jcustenborder.kafka.connect.json.FromJsonSchemaConverter$ObjectSchemaConverter.fromJSON(FromJsonSchemaConverter.java:89)
	at com.github.jcustenborder.kafka.connect.json.FromJsonSchemaConverterFactory.fromJSON(FromJsonSchemaConverterFactory.java:128)
	at com.github.jcustenborder.kafka.connect.json.FromJsonSchemaConverterFactory.fromJSON(FromJsonSchemaConverterFactory.java:68)
	at com.github.jcustenborder.kafka.connect.json.FromJson.configure(FromJson.java:157)
	at org.apache.kafka.connect.runtime.ConnectorConfig.transformations(ConnectorConfig.java:285)
	... 10 more

If I have time I will try to set up a test to reproduce it, but any help would be appreciated.

Best regards & great tool, Patrik

arijitmazumdar commented 1 year ago

Try changing `items` from an array of schemas to a single schema object; that should work around the issue. Something like:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "device": {
      "type": "string"
    },
    "BaseStations": {
      "type": "array",
      "items": 
        {
          "type": "string"
        }
    }
  }
}
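If you cannot edit the schema file itself, the same rewrite can be done programmatically before the schema is handed to the connector. A minimal stdlib-only Python sketch (the helper name is made up) that collapses single-entry tuple-form `items` into the plain object form; in draft-04 the two forms accept the same documents when the tuple has exactly one entry and the array is homogeneous:

```python
import copy
import json

def flatten_single_item_schemas(schema):
    """Recursively rewrite draft-04 tuple-form "items" ([{...}]) into the
    plain object form ({...}) wherever the tuple has exactly one entry.
    Other values are returned unchanged; the input is not mutated."""
    schema = copy.deepcopy(schema)
    if isinstance(schema, dict):
        items = schema.get("items")
        if isinstance(items, list) and len(items) == 1:
            schema["items"] = items[0]
        return {k: flatten_single_item_schemas(v) for k, v in schema.items()}
    if isinstance(schema, list):
        return [flatten_single_item_schemas(v) for v in schema]
    return schema

original = json.loads("""
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "device": {"type": "string"},
    "BaseStations": {"type": "array", "items": [{"type": "string"}]}
  }
}
""")

fixed = flatten_single_item_schemas(original)
print(fixed["properties"]["BaseStations"]["items"])  # {'type': 'string'}
```

Note this only changes single-entry tuples; a genuine multi-position tuple schema would still trip the converter and has no equivalent object form.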

As far as the JSON Schema definition goes, what you have done is supported. But this is probably a bug coming from the internal library, which is quite old.

I am trying to build this connector with the latest 1.5.1 version of org.everit.json.schema, in the hope that it might resolve this issue.

This can serve as a reference: https://github.com/arijitmazumdar/kafka-connect-docker/blob/main/file-source.json