confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io
Other
128 stars 1.04k forks source link

Stack overflow exception when creating a stream with Big/Complex Protobuf definition #5266

Closed lang1991 closed 3 years ago

lang1991 commented 4 years ago

Describe the bug A clear and concise description of what the bug is. java.lang.StackOverflowError exception when creating a KSQL Protobuf stream with a big/complex Protobuf definition. We have a pretty huge Protobuf definition and it is about 80/90 MB when exported to a Protobuf File Descriptor Set. I was able to register it just fine in Schema Registry, but was unable to create a KSQL stream from it. The tricky thing is I was advised against sharing it by my employer, so I can't provide the entire definition.

To Reproduce Steps to reproduce the behavior, include:

  1. The version of KSQL. 5.5.0-post
  2. Sample source data. Protobuf definition not available, but a very nested Protobuf with many fields should achieve the same effect
  3. Any SQL statements you ran create stream records with (KAFKA_TOPIC='records', VALUE_FORMAT='PROTOBUF');

Expected behavior A clear and concise description of what you expected to happen. The stream should be created just fine like any simpler Protobuf definitions supported

Actual behaviour A clear and concise description of what actually happens, including:

  1. CLI output
    ksql> create stream payment_records with (KAFKA_TOPIC='payment_records', VALUE_FORMAT='PROTOBUF');
    java.lang.StackOverflowError
  2. Error messages No further error messages
  3. KSQL logs Unable to find anything in the logs of ksqldb-server

Additional context Add any other context about the problem here.

Since debugging this will be hard for the community because I cannot provide the full Protobuf definition. Any advice where I can start my own investigation? For example, some files I should check? My first guess is maybe we use recursive DFS to resolve the dependencies of the top level Protobuf, and it grew too deep to blow the stack? However this is strange as Schema Registry can work with the schema just fine.

apurvam commented 4 years ago

@agavra can you help triage this?

rayokota commented 4 years ago

This may be a problem in the Protobuf converter with recursive definitions, which I'm investigating.

rayokota commented 4 years ago

Possibly fixed by https://github.com/confluentinc/schema-registry/pull/1496

agavra commented 4 years ago

@lang1991 - given we don't have your schema to reproduce the bug, and that @rayokota has fixed a similar issue in schema-registry I'm going to close out this ticket for now. After 0.11 is released, if your problem persists, please reopen this ticket and we can take another look.

mikebin commented 3 years ago

This appears to still be an issue even in ksqlDB 0.20.0. A StackOverflowError occurs with the following recursive Protobuf schema:

syntax = "proto3";

message Event {
  string id = 1;
  MapValue payload = 2;
}

message MapValue {
  map<string, Value> properties = 1;
}

message Value {
  oneof kind {
    int64 integer_value = 1;
    double float_value = 2;
    string string_value = 3;
    bool bool_value = 4;
    MapValue map_value = 5;
  }
}

And the following stream definition which uses it:

create stream recursive with (kafka_topic='rec', value_format='protobuf', partitions=1);

No errors or warnings appear in the ksqlDB server log.

vcrfxia commented 3 years ago

Hey @mikebin IIUC, the original report was for problems caused by a large protobuf definition, while the new issue is for a recursive protobuf definition. ksqlDB does not support recursive types. We have an existing ticket for Avro schemas with recursive defintions: https://github.com/confluentinc/ksql/issues/2814. I'm going to close this ticket and update the other one to clarify that recursive schemas are not supported for protobuf either.

mikebin commented 3 years ago

Thanks for the correction @vcrfxia

vcrfxia commented 3 years ago

Actually, I just noticed that the existing issue I linked it for a recursive type, rather than a recursive schema. I've filed a new issue instead (to distinguish from this issue which was originally for a large proto definition, rather than a recursive one): https://github.com/confluentinc/ksql/issues/8156