streamnative / pulsar-io-lakehouse

pulsar lakehouse connector
Apache License 2.0

[Bug] STRING Schema topic sink failed #183

Open ethqunzhong opened 1 year ago

ethqunzhong commented 1 year ago

I tried to run the Iceberg sink demo by following the pulsar-io-lakehouse sink docs. It fails to commit records because the getSchema result is unexpected.

Describe the bug

My test flow is shown below:

  1. Create the topic and produce data first. I produced a large amount of data to the test topic persistent://public/default/iceberg_test with the Flink connector. [image]

The message format is like: 22061772,1670896138459,mmdc-bigdata-test,11.156.128.57,jobmanager,11.156.128.75,2022-12-13 09:48:58,29 and the topic schema was set with the bin/pulsar-admin schemas upload command. The resulting test-topic schema is shown below: [image]

  2. Run the lakehouse sink connector. [image] The logs show that the Iceberg sink failed with a schema exception. [image]

There are two questions:

  1. Why does getSchemaType return different results in these two ways? [image]
  2. record.getSchema().getSchemaInfo().getSchemaDefinition() returns null [image], so the records are skipped in sinkWriter.run. [image] I found that in getSchemaDefinition [image], if SchemaType is STRING or BYTES, its SchemaDefinition is always null, which causes the sink to fail.
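The skip behavior described in the second question can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions: every class, enum, and method name here is a hypothetical stand-in, not the Pulsar client or pulsar-io-lakehouse API, and the null-definition behavior for STRING/BYTES is taken from the report above rather than verified against the connector source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; not the Pulsar client or connector classes.
class SchemaDefinitionSketch {

    enum SketchSchemaType { STRING, BYTES, AVRO, JSON }

    // A record carrying its schema type, an optional schema definition, and a payload.
    static final class SketchRecord {
        final SketchSchemaType type;
        final String schemaDefinition; // null when the type has no structured definition
        final String value;

        SketchRecord(SketchSchemaType type, String structDefinition, String value) {
            this.type = type;
            this.schemaDefinition = definitionFor(type, structDefinition);
            this.value = value;
        }
    }

    // Models the reported behavior: only structured types expose a definition.
    static String definitionFor(SketchSchemaType type, String structDefinition) {
        switch (type) {
            case AVRO:
            case JSON:
                return structDefinition;
            default:
                return null; // STRING and BYTES
        }
    }

    // Keeps only the records that a definition-dependent writer could process.
    static List<SketchRecord> accepted(List<SketchRecord> records) {
        List<SketchRecord> kept = new ArrayList<>();
        for (SketchRecord r : records) {
            if (r.schemaDefinition != null) {
                kept.add(r);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The AVRO definition string is a placeholder, not a real topic schema.
        List<SketchRecord> records = Arrays.asList(
            new SketchRecord(SketchSchemaType.STRING, null, "22061772,1670896138459"),
            new SketchRecord(SketchSchemaType.AVRO, "{\"type\":\"record\"}", "payload"));
        List<SketchRecord> kept = accepted(records);
        System.out.println("accepted=" + kept.size()
            + " skipped=" + (records.size() - kept.size()));
    }
}
```

In this sketch, every record on a STRING or BYTES topic is filtered out before it reaches the writer, which matches the symptom reported here: nothing is committed to the table.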

Environment

4 brokers and 1 function worker (each running as a separate process on separate machines).

ethqunzhong commented 1 year ago

@hangc0276 @zymap @danpi PTAL.

danpi commented 1 year ago

> @hangc0276 @zymap @danpi PTAL.

OK, I will take a look.