StarRocks / starrocks

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
https://starrocks.io
Apache License 2.0

StarRocks pipeline connector date format bug #37506

Closed dickson-bit closed 4 months ago

dickson-bit commented 11 months ago

component: flink-cdc-pipeline-starrocks

code: https://github.com/ververica/flink-cdc-connectors/blob/master/flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-starrocks/src/main/java/com/ververica/cdc/connectors/starrocks/sink/StarRocksUtils.java#L115

Flink version: 1.18

Flink CDC version: 3.0.0

Database and its version: StarRocks 3.2.0

Minimal reproduce step: sync some tables that contain DATE-type fields from MySQL to StarRocks.

What did you expect to see? The connector should log the offending record and continue processing the next one.

023-12-21 11:11:09,599 WARN org.apache.flink.runtime.taskmanager.Task [] - PostPartition -> Sink Writer: StarRocks Sink -> Sink Committer: StarRocks Sink (2/2)#0 (57480a1a94083a09b2fd3ddb06db45e4_0deb1b26a3d9eb3c8f0c11f7110b2903_1_0) switched from RUNNING to FAILED with failure cause: java.lang.ArrayIndexOutOfBoundsException: 35
    at sun.util.calendar.BaseCalendar.getCalendarDateFromFixedDate(BaseCalendar.java:453) ~[?:1.8.0_372]
    at java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2393) ~[?:1.8.0_372]
    at java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2308) ~[?:1.8.0_372]
    at java.util.Calendar.setTimeInMillis(Calendar.java:1804) ~[?:1.8.0_372]
    at java.util.Calendar.setTime(Calendar.java:1770) ~[?:1.8.0_372]
    at java.text.SimpleDateFormat.format(SimpleDateFormat.java:943) ~[?:1.8.0_372]
    at java.text.SimpleDateFormat.format(SimpleDateFormat.java:936) ~[?:1.8.0_372]
    at java.text.DateFormat.format(DateFormat.java:345) ~[?:1.8.0_372]
    at com.ververica.cdc.connectors.starrocks.sink.StarRocksUtils.lambda$createFieldGetter$60c5a152$9(StarRocksUtils.java:171) ~[flink-cdc-pipeline-connector-starrocks-3.0.0.jar:3.0.0]
    at com.ververica.cdc.connectors.starrocks.sink.StarRocksUtils.lambda$createFieldGetter$21edff26$1(StarRocksUtils.java:204) ~[flink-cdc-pipeline-connector-starrocks-3.0.0.jar:3.0.0]
    at com.ververica.cdc.connectors.starrocks.sink.EventRecordSerializationSchema.serializeRecord(EventRecordSerializationSchema.java:134) ~[flink-cdc-pipeline-connector-starrocks-3.0.0.jar:3.0.0]
    at com.ververica.cdc.connectors.starrocks.sink.EventRecordSerializationSchema.applyDataChangeEvent(EventRecordSerializationSchema.java:116) ~[flink-cdc-pipeline-connector-starrocks-3.0.0.jar:3.0.0]
    at com.ververica.cdc.connectors.starrocks.sink.EventRecordSerializationSchema.serialize(EventRecordSerializationSchema.java:78) ~[flink-cdc-pipeline-connector-starrocks-3.0.0.jar:3.0.0]
    at com.ververica.cdc.connectors.starrocks.sink.EventRecordSerializationSchema.serialize(EventRecordSerializationSchema.java:45) ~[flink-cdc-pipeline-connector-starrocks-3.0.0.jar:3.0.0]
    at com.starrocks.connector.flink.table.sink.v2.StarRocksWriter.write(StarRocksWriter.java:139) ~[flink-cdc-pipeline-connector-starrocks-3.0.0.jar:3.0.0]
    at org.apache.flink.streaming.runtime.operators.sink.SinkWriterOperator.processElement(SinkWriterOperator.java:161) ~[flink-dist-1.18.0.jar:1.18.0]

dickson-bit commented 11 months ago

I don't understand why we use two formatter classes, DATETIME_FORMATTER and DATE_FORMATER.
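
For context on the two formatters: the trace above fails inside `java.text.SimpleDateFormat.format`, and `SimpleDateFormat` is documented as not synchronized. An `ArrayIndexOutOfBoundsException` in the calendar code is a commonly reported symptom of one instance being shared across threads, though nothing in this thread confirms that as the cause here. Below is a minimal sketch, not the connector's actual code, of formatting both DATE and DATETIME values with the immutable, thread-safe `java.time.format.DateTimeFormatter`. The class name `DateFieldFormatter`, its method names, and the assumption that a DATE value arrives as an epoch-day `int` are all hypothetical.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of flink-cdc or StarRocks.
final class DateFieldFormatter {
    // DateTimeFormatter is immutable and thread-safe, so one shared
    // static instance per pattern is safe across parallel sink writers.
    private static final DateTimeFormatter DATE_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd");
    private static final DateTimeFormatter DATETIME_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    private DateFieldFormatter() {}

    // Assumption: the DATE field is carried as days since 1970-01-01.
    public static String formatEpochDay(int epochDay) {
        return LocalDate.ofEpochDay(epochDay).format(DATE_FORMAT);
    }

    // DATETIME values keep their own pattern because they include a time part.
    public static String formatDateTime(LocalDateTime value) {
        return value.format(DATETIME_FORMAT);
    }
}
```

With this sketch, `DateFieldFormatter.formatEpochDay(0)` yields `1970-01-01`. Two patterns are still needed because DATE and DATETIME columns have different textual forms, but both can use the same formatter class.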

dickson-bit commented 11 months ago

There are some null and irregular date values in my data.
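
A rough sketch of the "log the bad record and continue" behavior requested in the issue description, applied to null and irregular dates. It assumes, purely for illustration, that raw date values arrive as ISO-style strings such as `2023-12-21`; the class `LenientDateSerializer` and its method are hypothetical and do not exist in the connector.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical helper, not part of flink-cdc or StarRocks.
final class LenientDateSerializer {
    private static final Logger LOG =
            Logger.getLogger(LenientDateSerializer.class.getName());
    private static final DateTimeFormatter DATE_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd");

    private LenientDateSerializer() {}

    // Formats each raw value. A null input is kept as null (a SQL NULL),
    // while an unparsable value is logged and skipped instead of
    // failing the whole batch.
    public static List<String> formatAll(List<String> rawDates) {
        List<String> out = new ArrayList<>();
        for (String raw : rawDates) {
            if (raw == null) {
                out.add(null);
                continue;
            }
            try {
                out.add(LocalDate.parse(raw).format(DATE_FORMAT));
            } catch (DateTimeParseException e) {
                LOG.warning("Skipping irregular date value '" + raw + "': " + e.getMessage());
            }
        }
        return out;
    }
}
```

For example, passing `"2023-12-21"`, `null`, `"0000-00-00"`, and `"2024-02-29"` keeps the first, second, and fourth entries and logs a warning for the zero date. Whether a real sink should skip, null out, or reject such rows is a policy choice this sketch does not settle.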

github-actions[bot] commented 4 months ago

We have marked this issue as stale because it has been inactive for 6 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to StarRocks!