confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io

With complex Avro data, KSQL v5.3.0 raises an error on restart #3169

Open yunhappy opened 4 years ago

yunhappy commented 4 years ago

Then ksql-server logs this error:

Cannot register avro schema for TEST_AVRO as the schema registry rejected it, (maybe schema evolution issues?)

I think it may be caused by this:

In AvroUtil, SchemaUtil.buildAvroSchema builds a schema with schema=Schema{MAP}, but GenericRowSerDe.java builds a schema with

Schema{io.confluent.ksql.avro_schemas.KsqlDataSourceSchema_EXTRA:MAP}

xs005 commented 4 years ago
  1. Have you set the Schema Registry URL in ksql-server?
  2. Have you checked that Schema Registry is working properly?
  3. Have you changed the log level to DEBUG to check for errors?
  4. You could use Confluent Control Center, the Landoop Schema Registry UI, or KafkaHQ to check the content of the schema for that topic.
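
For point 4, the registered schema can also be read directly from the Schema Registry REST API, without a UI. A minimal sketch, assuming the default `<topic>-value` subject naming; the helper names, and any registry address or topic you pass in, are illustrative and not taken from this thread:

```python
import json
import urllib.parse
import urllib.request


def latest_schema_url(registry_url, topic):
    """Build the URL of the latest registered value schema for a topic."""
    # Assumption: default subject naming strategy, i.e. "<topic>-value".
    subject = urllib.parse.quote(f"{topic}-value", safe="")
    return f"{registry_url.rstrip('/')}/subjects/{subject}/versions/latest"


def fetch_latest_schema(registry_url, topic):
    """Fetch the latest registered schema and return it as a parsed dict."""
    with urllib.request.urlopen(latest_schema_url(registry_url, topic)) as resp:
        body = json.load(resp)
    # The "schema" field of the response holds the Avro schema as a JSON string.
    return json.loads(body["schema"])
```

Calling `fetch_latest_schema("http://localhost:8081", "TEST_AVRO")` against a running registry would return the schema as a dict; the address and topic here are placeholders to adjust for your setup.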
yunhappy commented 4 years ago

Everything works fine. The schema in the schema registry has this field:

{
  "name": "CONTACTINFO",
  "type": [
    "null",
    {
      "type": "array",
      "items": {
        "type": "record",
        "name": "KsqlDataSourceSchema_CONTACTINFO",
        "fields": [
          {
            "name": "key",
            "type": [
              "null",
              "string"
            ],
            "default": null
          },
          {
            "name": "value",
            "type": [
              "null",
              "string"
            ],
            "default": null
          }
        ],
        "connect.internal.type": "MapEntry"
      }
    }
  ],
  "default": null
}

But when schemaRegistryClient.testCompatibility is called, the schema it checks is:

 {
      "name": "CONTACTINFO",
      "type": [
        "null",
        {
          "type": "map",
          "values": [
            "null",
            "string"
          ]
        }
      ],
      "default": null
    },
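
To make the difference between the two fragments above explicit, here is a small sketch that pulls out the non-null branch of the CONTACTINFO field in each and compares the Avro type kind. The helper names are hypothetical, and this is only a shape comparison, not the compatibility algorithm the schema registry actually runs:

```python
def non_null_branch(field):
    """Return the first non-"null" branch of a field's type (unions are lists)."""
    avro_type = field["type"]
    if isinstance(avro_type, list):
        branches = [b for b in avro_type if b != "null"]
        return branches[0] if branches else "null"
    return avro_type


def type_kind(avro_type):
    """Name the kind of an Avro type: 'map', 'array', 'record', 'string', ..."""
    return avro_type["type"] if isinstance(avro_type, dict) else avro_type


def kinds_differ(field_a, field_b):
    """True when the two fields' non-null branches are different Avro kinds."""
    return type_kind(non_null_branch(field_a)) != type_kind(non_null_branch(field_b))


# Shape found in the registry: an array of key/value "MapEntry" records.
registered = {
    "name": "CONTACTINFO",
    "type": ["null", {"type": "array", "items": {
        "type": "record",
        "name": "KsqlDataSourceSchema_CONTACTINFO",
        "fields": [
            {"name": "key", "type": ["null", "string"], "default": None},
            {"name": "value", "type": ["null", "string"], "default": None},
        ],
        "connect.internal.type": "MapEntry",
    }}],
    "default": None,
}

# Shape seen at testCompatibility: a native Avro map.
candidate = {
    "name": "CONTACTINFO",
    "type": ["null", {"type": "map", "values": ["null", "string"]}],
    "default": None,
}

print(type_kind(non_null_branch(registered)), type_kind(non_null_branch(candidate)))
# prints: array map
```

The same field is an array in one schema and a map in the other, so the two cannot be treated as versions of one another.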
xs005 commented 4 years ago

The schema I got from a topic created by KSQL is below. I have no experience with complex schemas, sorry.

{
  "type": "record",
  "name": "KsqlDataSourceSchema",
  "namespace": "io.confluent.ksql.avro_schemas",
  "fields": [
    {
      "name": "KSQL_INTERNAL_COL_0",
      "type": [
        "null",
        "long"
      ],
      "default": null
    },
    {
      "name": "KSQL_INTERNAL_COL_1",
      "type": [
        "null",
        "string"
      ],
      "default": null
    }
  ]
}
yunhappy commented 4 years ago

Thanks for the reply. I followed this tutorial: https://docs.confluent.io/current/ksql/docs/tutorials/generate-custom-test-data.html#generate-example-user-records-with-complex-data

Avro schema:

{
    "type": "record",
    "name": "KsqlDataSourceSchema",
    "namespace": "io.confluent.ksql.avro_schemas",
    "fields": [
        {
            "name": "REGISTERTIME",
            "type": [
                "null",
                "long"
            ],
            "default": null
        },
        {
            "name": "GENDER",
            "type": [
                "null",
                "string"
            ],
            "default": null
        },
        {
            "name": "REGIONID",
            "type": [
                "null",
                "string"
            ],
            "default": null
        },
        {
            "name": "USERID",
            "type": [
                "null",
                "string"
            ],
            "default": null
        },
        {
            "name": "INTERESTS",
            "type": [
                "null",
                {
                    "type": "array",
                    "items": [
                        "null",
                        "string"
                    ]
                }
            ],
            "default": null
        },
        {
            "name": "CONTACTINFO",
            "type": [
                "null",
                {
                    "type": "array",
                    "items": {
                        "type": "record",
                        "name": "KsqlDataSourceSchema_CONTACTINFO",
                        "fields": [
                            {
                                "name": "key",
                                "type": [
                                    "null",
                                    "string"
                                ],
                                "default": null
                            },
                            {
                                "name": "value",
                                "type": [
                                    "null",
                                    "string"
                                ],
                                "default": null
                            }
                        ],
                        "connect.internal.type": "MapEntry"
                    }
                }
            ],
            "default": null
        }
    ]
}
stefanloerwald commented 4 years ago

I'm also affected by this issue, and it makes ksql completely worthless if those containers are not fault-tolerant. It looks to me like it is related to issue #3759: ksql transforms maps to an array of records, where each record has key and value as fields. On reboot of the ksql container or restart of the query, it seems to compare the schema that it produced in the first round (i.e. with the transformation) to the untransformed schema, which is obviously incompatible. The schema registry will reject an "evolution" from an array of records (which was written in the first round) to a map (which is what the second round tries to compare against it).
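
One way to reproduce the rejection outside ksql is to ask the Schema Registry REST API whether the untransformed schema is compatible with the latest registered version. A rough sketch, assuming the standard compatibility endpoint; the function names are hypothetical, and the subject name and registry address are whatever applies to your setup:

```python
import json
import urllib.parse
import urllib.request


def build_compatibility_request(registry_url, subject, schema):
    """Build a POST request that tests `schema` against the latest version."""
    path = f"/compatibility/subjects/{urllib.parse.quote(subject, safe='')}/versions/latest"
    # The registry expects the Avro schema as a JSON string inside a JSON body.
    payload = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    return urllib.request.Request(
        registry_url.rstrip("/") + path,
        data=payload,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )


def is_compatible(registry_url, subject, schema):
    """Send the request; the response body carries an "is_compatible" flag."""
    request = build_compatibility_request(registry_url, subject, schema)
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)["is_compatible"]
```

If the explanation above is right, posting the full value schema with CONTACTINFO declared as a map against a subject whose registered version has it as an array of records should come back as not compatible.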

stefanloerwald commented 4 years ago

@rodesai could you please have a look at this?

Endemoniada commented 2 years ago

Did anyone ever find a resolution for this? We're in a very special, uncomfortable situation where we're still running KSQL 5.3.1 and have to update the query file we run with. However, we're running into this issue when restarting KSQL. In our lab environment, I tested removing all topics and schemas and letting the first of our two KSQL nodes recreate them, and that worked fine, but starting the second one gives this same error, as does changing the query file and restarting KSQL.