apache / druid

Apache Druid: a high performance real-time analytics database.
https://druid.apache.org/
Apache License 2.0

convertSpec tool completely broken #1560

Closed: pdeva closed this issue 6 years ago

pdeva commented 9 years ago

I am using the convertSpec tool in 0.6.173 to upgrade my realtime node spec file for 0.7.

Here is what the file looks like:

[
  {
    "schema": {
      "dataSource": "dripstat",
      "aggregators": [
        {
          "type": "count",
          "name": "count"
        },
        {
          "type": "doubleSum",
          "name": "min",
          "fieldName": "min"
        },
        {
          "type": "doubleSum",
          "name": "max",
          "fieldName": "max"
        },
        {
          "type": "longSum",
          "name": "callcount",
          "fieldName": "callcount"
        },
        {
          "type": "longSum",
          "name": "errcount",
          "fieldName": "errcount"
        },
        {
          "type": "doubleSum",
          "name": "totaltime",
          "fieldName": "totaltime"
        },
        {
          "type": "doubleSum",
          "name": "value",
          "fieldName": "value"
        },
        {
          "type": "doubleSum",
          "name": "traceduration",
          "fieldName": "traceduration"
        }
      ],
      "indexGranularity": "none",
      "shardSpec": {
        "type": "linear",
        "partitionNum": 0
      }
    },
    "config": {
      "maxRowsInMemory": 50000,
      "intermediatePersistPeriod": "PT10m"
    },
    "firehose": {
      "type": "kafka-0.8",
      "consumerProps": {
        "zookeeper.connect": "xx",
        "zookeeper.connection.timeout.ms": "15000",
        "zookeeper.session.timeout.ms": "15000",
        "zookeeper.synctime.ms": "5000",
        "group.id": "druid-dripstat",
        "fetch.size": "1048586",
        "auto.offset.reset": "largest",
        "auto.commit.enable": "false"
      },
      "feed": "dripstat",
      "parser": {
        "timestampSpec": {
          "column": "timestamp"
        },
        "data": {
          "format": "json"
        }
      }
    },
    "plumber": {
      "type": "realtime",
      "windowPeriod": "PT10m",
      "segmentGranularity": "hour",
      "basePersistDirectory": "/data/realtime/basePersist"
    }
  }
]

Command used to run the tool:

java -Duser.timezone=UTC -Dfile.encoding=UTF-8  -classpath 'lib/*' io.druid.cli.Main tools convertSpec -o ~/druid/dripstatspec.json -n ~/druid/newspec.json -t standalone_realtime

I get this error:

Exception in thread "main" java.lang.RuntimeException: com.fasterxml.jackson.databind.JsonMappingException: Can not deserialize instance of io.druid.segment.realtime.FireDepartment out of START_ARRAY token
 at [Source: /Users/pdeva/druid/dripstatspec.json; line: 1, column: 1]
    at com.google.api.client.repackaged.com.google.common.base.Throwables.propagate(Throwables.java:160)
    at io.druid.cli.convert.ConvertIngestionSpec.run(ConvertIngestionSpec.java:70)
    at io.druid.cli.Main.main(Main.java:90)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Can not deserialize instance of io.druid.segment.realtime.FireDepartment out of START_ARRAY token
 at [Source: /Users/pdeva/druid/dripstatspec.json; line: 1, column: 1]
    at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:164)
    at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:575)
    at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:569)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromArray(BeanDeserializerBase.java:1121)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:148)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:123)
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:2888)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:1988)
    at io.druid.cli.convert.ConvertIngestionSpec$StandaloneRealtimeIngestionSchemaConverter.convert(ConvertIngestionSpec.java:92)
    at io.druid.cli.convert.ConvertIngestionSpec$StandaloneRealtimeIngestionSchemaConverter.convert(ConvertIngestionSpec.java:87)
    at io.druid.cli.convert.ConvertIngestionSpec.run(ConvertIngestionSpec.java:67)
    ... 1 more

So, I removed the enclosing array from the top and bottom of the .json file, since there is only one element anyway. But now I run into this:

Exception in thread "main" java.lang.RuntimeException: com.fasterxml.jackson.databind.JsonMappingException: Could not resolve type id 'linear' into a subtype of [simple type, class io.druid.timeline.partition.ShardSpec]
 at [Source: /Users/pdeva/druid/dripstatspec.json; line: 48, column: 9]
    at com.google.api.client.repackaged.com.google.common.base.Throwables.propagate(Throwables.java:160)
    at io.druid.cli.convert.ConvertIngestionSpec.run(ConvertIngestionSpec.java:70)
    at io.druid.cli.Main.main(Main.java:90)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Could not resolve type id 'linear' into a subtype of [simple type, class io.druid.timeline.partition.ShardSpec]
 at [Source: /Users/pdeva/druid/dripstatspec.json; line: 48, column: 9]
    at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:164)
    at com.fasterxml.jackson.databind.DeserializationContext.unknownTypeException(DeserializationContext.java:677)
    at com.fasterxml.jackson.databind.jsontype.impl.TypeDeserializerBase._findDeserializer(TypeDeserializerBase.java:158)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:99)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:82)
    at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:106)
    at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:462)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:347)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:977)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:276)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:121)
    at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:464)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:347)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:977)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:276)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:121)
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:2888)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:1988)
    at io.druid.cli.convert.ConvertIngestionSpec$StandaloneRealtimeIngestionSchemaConverter.convert(ConvertIngestionSpec.java:92)
    at io.druid.cli.convert.ConvertIngestionSpec$StandaloneRealtimeIngestionSchemaConverter.convert(ConvertIngestionSpec.java:87)
    at io.druid.cli.convert.ConvertIngestionSpec.run(ConvertIngestionSpec.java:67)
    ... 1 more

The line causing this issue is the 'type' key here:

"shardSpec": {
        "type": "linear",
        "partitionNum": 0
      }

So, I went ahead and removed the whole shardSpec element from my file. But now I run into this:

Exception in thread "main" java.lang.RuntimeException: com.fasterxml.jackson.databind.JsonMappingException: Could not resolve type id 'kafka-0.8' into a subtype of [simple type, class io.druid.data.input.FirehoseFactory]
 at [Source: /Users/pdeva/druid/dripstatspec.json; line: 53, column: 7]
    at com.google.api.client.repackaged.com.google.common.base.Throwables.propagate(Throwables.java:160)
    at io.druid.cli.convert.ConvertIngestionSpec.run(ConvertIngestionSpec.java:70)
    at io.druid.cli.Main.main(Main.java:90)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Could not resolve type id 'kafka-0.8' into a subtype of [simple type, class io.druid.data.input.FirehoseFactory]
 at [Source: /Users/pdeva/druid/dripstatspec.json; line: 53, column: 7]
    at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:164)
    at com.fasterxml.jackson.databind.DeserializationContext.unknownTypeException(DeserializationContext.java:677)
    at com.fasterxml.jackson.databind.jsontype.impl.TypeDeserializerBase._findDeserializer(TypeDeserializerBase.java:158)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:99)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:82)
    at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:106)
    at com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:462)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:347)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:977)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:276)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:121)
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:2888)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:1988)
    at io.druid.cli.convert.ConvertIngestionSpec$StandaloneRealtimeIngestionSchemaConverter.convert(ConvertIngestionSpec.java:92)
    at io.druid.cli.convert.ConvertIngestionSpec$StandaloneRealtimeIngestionSchemaConverter.convert(ConvertIngestionSpec.java:87)
    at io.druid.cli.convert.ConvertIngestionSpec.run(ConvertIngestionSpec.java:67)
    ... 1 more

It seems this tool is completely broken.

fjy commented 9 years ago

There are some Guice problems with resolving shardSpec types in the tool. One workaround is to remove the shardSpec and add it back manually after the conversion.
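
For illustration, a hand-added shardSpec might look roughly like this, assuming the converted file uses the 0.7-style tuningConfig section (the values mirror the original spec above; this is a sketch, not actual output of the tool):

"tuningConfig": {
  "type": "realtime",
  "maxRowsInMemory": 50000,
  "intermediatePersistPeriod": "PT10m",
  "windowPeriod": "PT10m",
  "basePersistDirectory": "/data/realtime/basePersist",
  "shardSpec": {
    "type": "linear",
    "partitionNum": 0
  }
}

The exact tuningConfig layout may differ depending on what the tool emits for your version, so adjust the placement to match the generated file.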

pdeva commented 9 years ago

As noted in the bug report, I did remove the shardSpec and ran into further errors.

fjy commented 9 years ago

@pdeva You'll have to include the kafka 8 extension for the tool to be able to resolve the type. I think we will need more tests around the tool. You can also read http://druid.io/docs/latest/ingestion/realtime-ingestion.html and manually convert the spec.
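
One way to do that for the standalone tool, sketched here with an assumed jar location (the path to the kafka-eight extension jars below is illustrative, not a documented layout), is to append those jars to the -classpath used above:

# 'path/to/druid-kafka-eight' is a placeholder for wherever the kafka-eight extension jars live
java -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath 'lib/*:path/to/druid-kafka-eight/*' io.druid.cli.Main tools convertSpec -o ~/druid/dripstatspec.json -n ~/druid/newspec.json -t standalone_realtime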

himanshug commented 9 years ago

@pdeva did it work after adding the kafka-8 extension to the classpath? Also, for the shard spec issue, can you try replacing https://github.com/druid-io/druid/blob/0.6.x/services/src/main/java/io/druid/cli/convert/ConvertIngestionSpec.java#L62 with

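    // Build the full Druid injector on top of the startup injector; the module binds a
    // placeholder ("dummy") DruidNode for @Self. The ObjectMapper obtained from this injector
    // should have Druid's Jackson modules (including the ShardSpec subtypes) registered.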
    Injector injector = Initialization.makeInjectorWithModules(
        GuiceInjectors.makeStartupInjector(),
        ImmutableList.<Module>of(
            new Module()
            {
              @Override
              public void configure(Binder binder)
              {
                JsonConfigProvider.bindInstance(
                    binder, Key.get(DruidNode.class, Self.class), new DruidNode("dummy", null, null)
                );
              }
            }
        )
    );
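    // Obtain the Jackson ObjectMapper from this injector.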
    ObjectMapper jsonMapper = injector.getInstance(ObjectMapper.class);

and see if that fixes the issue?

pdeva commented 9 years ago

How does one add the Kafka extension to the classpath? There is no documentation for this. In Druid 0.6 we add the Kafka extension as a Maven-style dependency in the runtime.properties file. Since runtime.properties is not being used here, how do I add the extension?
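
For context, the 0.6-style declaration referred to above is the extension coordinate list in runtime.properties, roughly like this (the version is assumed to match the 0.6.173 release mentioned in the report; treat the coordinate as illustrative):

# illustrative 0.6-style extension declaration; adjust the version to your deployment
druid.extensions.coordinates=["io.druid.extensions:druid-kafka-eight:0.6.173"]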

vogievetsky commented 6 years ago

Is this still an issue?

pdeva commented 6 years ago

I don't think this was ever fixed.

gianm commented 6 years ago

It wasn't fixed, but we did remove the tool at some point since everything involved here is deprecated now! So I will close the issue.