confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io
Other
118 stars 1.04k forks source link

Stackoverflow inside ksqldb server node when consuming #7392

Closed haf closed 3 years ago

haf commented 3 years ago

Describe the bug

ksqlDB crashes for simple JSON schemas.

Create a topic. Associate it with the JSON schema from https://app.logary.tech/schemas/event-message.schema.json via Schema Registry. Produce JSON to that topic using that schema. All fine so far.

Now create a ksqlDB stream using that topic:

CREATE STREAM mystream with (kafka_topic = 'analytics_event', value_format = 'json_sr');

Log output:

ksqldb-5b9fd7fd6-fvg6v ksqldb CREATE STREAM mystream with (kafka_topic = 'analytics_event', value_format = 'json_sr');', configOverrides={}, requestProperties={}, commandSequenceNumber=Optional.empty} (io.confluent.ksql.rest.server.resources.KsqlResource:244)
ksqldb-5b9fd7fd6-fvg6v ksqldb [2021-04-18 16:43:21,693] INFO KsqlConfig values:
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.access.validator.enable = auto
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.authorization.cache.expiry.time.secs = 30
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.authorization.cache.max.entries = 10000
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.cast.strings.preserve.nulls = true
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.connect.url = http://ksqldb-connect-api.app.svc.cluster.local:8083
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.connect.worker.config =
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.create.or.replace.enabled = true
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.error.classifier.regex =
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.extension.dir = ext
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.hidden.topics = [_confluent.*, __confluent.*, _schemas, __consumer_offsets, __transaction_state, connect-configs, connect-offsets, connect-status, connect-statuses]
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.insert.into.values.enabled = true
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.internal.topic.min.insync.replicas = 1
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.internal.topic.replicas = 3
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.metastore.backup.location =
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.metrics.extension = null
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.metrics.tags.custom =
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.output.topic.name.prefix =
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.persistence.default.format.key = KAFKA
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.persistence.default.format.value = null
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.persistence.wrap.single.values = null
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.persistent.prefix = query_
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.properties.overrides.denylist = []
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.pull.queries.enable = true
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.query.error.max.queue.size = 10
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.query.persistent.active.limit = 2147483647
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.query.persistent.max.bytes.buffering.total = -1
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.query.pull.enable.standby.reads = false
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.query.pull.max.allowed.offset.lag = 9223372036854775807
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.query.pull.max.qps = 2147483647
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.query.pull.metrics.enabled = true
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.query.pull.thread.pool.size = 100
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.query.retry.backoff.initial.ms = 15000
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.query.retry.backoff.max.ms = 900000
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.query.status.running.threshold.seconds = 300
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.query.transient.max.bytes.buffering.total = -1
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.readonly.topics = [_confluent.*, __confluent.*, _schemas, __consumer_offsets, __transaction_state, connect-configs, connect-offsets, connect-status, connect-statuses]
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.schema.registry.url = http://kafka-schemas.kafka.svc.cluster.local
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.security.extension.class = null
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.service.id = ksql_app_
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.sink.window.change.log.additional.retention = 1000000
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.streams.shutdown.timeout.ms = 300000
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.suppress.buffer.size.bytes = -1
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.suppress.enabled = false
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.timestamp.throw.on.invalid = false
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.transient.prefix = transient_
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.udf.collect.metrics = false
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.udf.enable.security.manager = true
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.udfs.enabled = true
ksqldb-5b9fd7fd6-fvg6v ksqldb   ksql.variable.substitution.enable = true
ksqldb-5b9fd7fd6-fvg6v ksqldb   metric.reporters = []
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.cipher.suites = null
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.endpoint.identification.algorithm = https
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.engine.factory.class = null
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.key.password = null
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.keymanager.algorithm = SunX509
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.keystore.certificate.chain = null
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.keystore.key = null
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.keystore.location = null
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.keystore.password = null
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.keystore.type = JKS
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.protocol = TLSv1.3
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.provider = null
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.secure.random.implementation = null
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.trustmanager.algorithm = PKIX
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.truststore.certificates = null
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.truststore.location = null
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.truststore.password = null
ksqldb-5b9fd7fd6-fvg6v ksqldb   ssl.truststore.type = JKS
ksqldb-5b9fd7fd6-fvg6v ksqldb  (io.confluent.ksql.util.KsqlConfig:372)
ksqldb-5b9fd7fd6-fvg6v ksqldb [2021-04-18 16:43:21,742] INFO JsonSchemaDataConfig values:
ksqldb-5b9fd7fd6-fvg6v ksqldb   decimal.format = BASE64
ksqldb-5b9fd7fd6-fvg6v ksqldb   schemas.cache.size = 1000
ksqldb-5b9fd7fd6-fvg6v ksqldb  (io.confluent.connect.json.JsonSchemaDataConfig:372)
ksqldb-5b9fd7fd6-fvg6v ksqldb [2021-04-18 16:43:21,822] ERROR 127.0.0.1 - - [Sun, 18 Apr 2021 16:43:21 GMT] "POST /ksql HTTP/1.1" 500 85 "-" "Java/14.0.2" 124 (io.confluent.ksql.api.server.LoggingHandler:109)

Output from the request:

java.lang.StackOverflowError (50000)

Expected behavior

  1. It should log where it crashes in the server node, but it does not
  2. It should log similar stacktrace info to the response, but it does not
  3. It should not crash
  4. It should not crash with stack overflow errors
  5. It should validate the JSON schema in Schema Registry

Additional context

      containers:
      - name: ksqldb
        image: confluentinc/ksqldb-server:0.15.0
      - name: schema-registry
        image: confluentinc/cp-schema-registry:6.0.0
spena commented 3 years ago

@haf I tried to reproduce this in the latest master branch, but I got the following error:

ksql> CREATE STREAM mystream with (kafka_topic = 'analytics_event2', value_format = 'json_sr');
>
>
Unable to verify if the value schema for topic analytics_event2 is compatible with ksqlDB.
Reason: Unsupported criterion allOf for {"title":"EventMessage","description":"A Logary event message","$id":"https://app.logary.tech/schemas/event-message.schema.json","$schema":"http://json-schema.org/draft-07/schema#","allOf":[{"$ref":"https://app.logary.tech/schemas/logary-message-base.schema.json"}],"type":"object","required":["type","event"],"properties":{"monetaryValue":{"$ref":"https://app.logary.tech/schemas/definitions.schema.json#/definitions/money","description":"How much was lost or gained as a result of this event happening?"},"type":{"description":"Discriminator","const":"event","type":"string"},"event":{"type":"string","maxLength":1024,"description":"What happened?","examples":["User clicked \"{cssSelector}\"","User {user.name} logged in","Application error: {error.message}","Unhandled JS error","Cart checkout"]},"error":{"$ref":"https://app.logary.tech/schemas/definitions.schema.json#/definitions/error-info","description":"Optional error associated with this event."}}}

Please see https://github.com/confluentinc/ksql/issues/ to see if this particular reason is already known.
If not, please log a new issue, including this full error message.
Schema:{"$schema":"http://json-schema.org/draft-07/schema#","$id":"https://app.logary.tech/schemas/event-message.schema.json","allOf":[{"$ref":"https://app.logary.tech/schemas/logary-message-base.schema.json"}],"title":"EventMessage","description":"A Logary event message","type":"object","properties":{"type":{"description":"Discriminator","type":"string","const":"event"},"event":{"description":"What happened?","type":"string","maxLength":1024,"examples":["User clicked \"{cssSelector}\"","User {user.name} logged in","Application error: {error.message}","Unhandled JS error","Cart checkout"]},"monetaryValue":{"description":"How much was lost or gained as a result of this event happening?","$ref":"https://app.logary.tech/schemas/definitions.schema.json#/definitions/money"},"error":{"description":"Optional error associated with this event.","$ref":"https://app.logary.tech/schemas/definitions.schema.json#/definitions/error-info"}},"required":["type","event"]}

Could you try with the latest release if you get the same error? or is there something I'm missing in the schema?

haf commented 3 years ago

Hi,

The schema is very valid- I spent a lot of time on it so it would work well with Kafka and validated it as part of that.

It’s hard to know what ksqldb doesn’t like about it. It’s just a discriminated union of five types of messages.

At least you found another bug then (or missing feature?).

Henrik

On Thursday, May 06, 2021 at 10:48 PM, Sergio Peña @. @.)> wrote:

@haf (https://github.com/haf) I tried to reproduce this in the latest master branch, but I got the following error:

ksql> CREATE STREAM mystream with (kafka_topic = 'analytics_event2', value_format = 'json_sr'); > > Unable to verify if the value schema for topic analytics_event2 is compatible with ksqlDB. Reason: Unsupported criterion allOf for {"title":"EventMessage","description":"A Logary event message","$id":"https://app.logary.tech/schemas/event-message.schema.json","$schema":"http://json-schema.org/draft-07/schema#","allOf":[{"$ref":"https://app.logary.tech/schemas/logary-message-base.schema.json"}],"type":"object","required":["type","event"],"properties":{"monetaryValue":{"$ref":"https://app.logary.tech/schemas/definitions.schema.json#/definitions/money","description":"How much was lost or gained as a result of this event happening?"},"type":{"description":"Discriminator","const":"event","type":"string"},"event":{"type":"string","maxLength":1024,"description":"What happened?","examples":["User clicked \"{cssSelector}\"","User {user.name} logged in","Application error: {error.message}","Unhandled JS error","Cart checkout"]},"error":{"$ref":"https://app.logary.tech/schemas/definitions.schema.json#/definitions/error-info","description":"Optional error associated with this event."}}} Please see https://github.com/confluentinc/ksql/issues/ to see if this particular reason is already known. If not, please log a new issue, including this full error message. Schema:{"$schema":"http://json-schema.org/draft-07/schema#","$id":"https://app.logary.tech/schemas/event-message.schema.json","allOf":[{"$ref":"https://app.logary.tech/schemas/logary-message-base.schema.json"}],"title":"EventMessage","description":"A Logary event message","type":"object","properties":{"type":{"description":"Discriminator","type":"string","const":"event"},"event":{"description":"What happened?","type":"string","maxLength":1024,"examples":["User clicked \"{cssSelector}\"","User {user.name} logged in","Application error: {error.message}","Unhandled JS error","Cart checkout"]},"monetaryValue":{"description":"How much was lost or gained as a result of this event happening?","$ref":"https://app.logary.tech/schemas/definitions.schema.json#/definitions/money"},"error":{"description":"Optional error associated with this event.","$ref":"https://app.logary.tech/schemas/definitions.schema.json#/definitions/error-info"}},"required":["type","event"]}

Could you try with the latest release if you get the same error? or is there something I'm missing in the schema?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub (https://github.com/confluentinc/ksql/issues/7392#issuecomment-833854845), or unsubscribe (https://github.com/notifications/unsubscribe-auth/AABPEW2R26T23T43SEBILA3TML6DTANCNFSM43EMTTAA).

spena commented 3 years ago

@haf I saw you reported a similar issue with the allOf() problem here https://github.com/confluentinc/ksql/issues/6521

spena commented 3 years ago

Nevermind, I see that https://github.com/confluentinc/ksql/issues/6521 is related to not having the allOf() in the schema. This issue is because it has the allOf() and it is not supported. I just try to reproduce it in 0.15 and CP 6.0 and I don't see the stackoverflow.

spena commented 3 years ago

I'll close this issue as not reproducible. It may be fixed once allOf() is supported in https://github.com/confluentinc/ksql/issues/6521