StarRocks / starrocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
https://starrocks.io
Apache License 2.0

starrocks pipeline connector date format bug #37506

Open dickson-bit opened 6 months ago

dickson-bit commented 6 months ago

component: flink-cdc-pipeline-starrocks

code: https://github.com/ververica/flink-cdc-connectors/blob/master/flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-starrocks/src/main/java/com/ververica/cdc/connectors/starrocks/sink/StarRocksUtils.java#L115

Flink version: 1.18

Flink CDC version: 3.0.0

Database and its version: StarRocks 3.2.0

Minimal reproduce step: sync some tables that have date-type fields from MySQL to StarRocks.

What did you expect to see? The connector should log the offending record and continue processing the next record.

023-12-21 11:11:09,599 WARN org.apache.flink.runtime.taskmanager.Task [] - PostPartition -> Sink Writer: StarRocks Sink -> Sink Committer: StarRocks Sink (2/2)#0 (57480a1a94083a09b2fd3ddb06db45e4_0deb1b26a3d9eb3c8f0c11f7110b2903_1_0) switched from RUNNING to FAILED with failure cause:
java.lang.ArrayIndexOutOfBoundsException: 35
    at sun.util.calendar.BaseCalendar.getCalendarDateFromFixedDate(BaseCalendar.java:453) ~[?:1.8.0_372]
    at java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2393) ~[?:1.8.0_372]
    at java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2308) ~[?:1.8.0_372]
    at java.util.Calendar.setTimeInMillis(Calendar.java:1804) ~[?:1.8.0_372]
    at java.util.Calendar.setTime(Calendar.java:1770) ~[?:1.8.0_372]
    at java.text.SimpleDateFormat.format(SimpleDateFormat.java:943) ~[?:1.8.0_372]
    at java.text.SimpleDateFormat.format(SimpleDateFormat.java:936) ~[?:1.8.0_372]
    at java.text.DateFormat.format(DateFormat.java:345) ~[?:1.8.0_372]
    at com.ververica.cdc.connectors.starrocks.sink.StarRocksUtils.lambda$createFieldGetter$60c5a152$9(StarRocksUtils.java:171) ~[flink-cdc-pipeline-connector-starrocks-3.0.0.jar:3.0.0]
    at com.ververica.cdc.connectors.starrocks.sink.StarRocksUtils.lambda$createFieldGetter$21edff26$1(StarRocksUtils.java:204) ~[flink-cdc-pipeline-connector-starrocks-3.0.0.jar:3.0.0]
    at com.ververica.cdc.connectors.starrocks.sink.EventRecordSerializationSchema.serializeRecord(EventRecordSerializationSchema.java:134) ~[flink-cdc-pipeline-connector-starrocks-3.0.0.jar:3.0.0]
    at com.ververica.cdc.connectors.starrocks.sink.EventRecordSerializationSchema.applyDataChangeEvent(EventRecordSerializationSchema.java:116) ~[flink-cdc-pipeline-connector-starrocks-3.0.0.jar:3.0.0]
    at com.ververica.cdc.connectors.starrocks.sink.EventRecordSerializationSchema.serialize(EventRecordSerializationSchema.java:78) ~[flink-cdc-pipeline-connector-starrocks-3.0.0.jar:3.0.0]
    at com.ververica.cdc.connectors.starrocks.sink.EventRecordSerializationSchema.serialize(EventRecordSerializationSchema.java:45) ~[flink-cdc-pipeline-connector-starrocks-3.0.0.jar:3.0.0]
    at com.starrocks.connector.flink.table.sink.v2.StarRocksWriter.write(StarRocksWriter.java:139) ~[flink-cdc-pipeline-connector-starrocks-3.0.0.jar:3.0.0]
    at org.apache.flink.streaming.runtime.operators.sink.SinkWriterOperator.processElement(SinkWriterOperator.java:161) ~[flink-dist-1.18.0.jar:1.18.0]

dickson-bit commented 6 months ago

I don't know why we use two different formatter classes, DATETIME_FORMATTER and DATE_FORMATER.
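One possible explanation for the trace above, not confirmed anywhere in this thread, is that `java.text.SimpleDateFormat` is documented as not synchronized, so a single instance shared by several sink subtasks in one JVM can corrupt its internal `Calendar` and fail inside `computeFields`. The sketch below assumes the date arrives as days since 1970-01-01 and shows formatting it with the immutable, thread-safe `java.time.format.DateTimeFormatter` instead. The class name `DateFormatSketch` and the method `formatEpochDay` are hypothetical and are not part of the connector.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class DateFormatSketch {
    // DateTimeFormatter is immutable and thread-safe, so a shared static
    // instance can be used from several threads without synchronization.
    private static final DateTimeFormatter DATE_FORMATTER =
            DateTimeFormatter.ofPattern("yyyy-MM-dd");

    /**
     * Formats a date given as days since 1970-01-01.
     * The epoch-day input shape is an assumption made for this sketch.
     */
    public static String formatEpochDay(long epochDay) {
        return LocalDate.ofEpochDay(epochDay).format(DATE_FORMATTER);
    }

    public static void main(String[] args) {
        // Epoch day 0 is 1970-01-01.
        System.out.println(formatEpochDay(0L));
    }
}
```

If the shared formatter is indeed the cause, this would also remove the need for two different formatter types, since dates and datetimes could both go through `DateTimeFormatter` with different patterns.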

dickson-bit commented 6 months ago

There are some null and irregular date values in my data.
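The "log the bad record and continue" behavior requested in this issue could look roughly like the following for null or irregular dates. This is a minimal sketch under assumptions: the year/month/day input shape, the class `SafeDateFormatSketch`, and the method `formatOrNull` are hypothetical, and the real connector may represent dates differently.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.logging.Logger;

public class SafeDateFormatSketch {
    private static final Logger LOG =
            Logger.getLogger(SafeDateFormatSketch.class.getName());

    private static final DateTimeFormatter DATE_FORMATTER =
            DateTimeFormatter.ofPattern("yyyy-MM-dd");

    /**
     * Returns the formatted date, or null when any component is null or
     * the components do not form a valid calendar date (hypothetical helper).
     */
    public static String formatOrNull(Integer year, Integer month, Integer day) {
        if (year == null || month == null || day == null) {
            return null; // pass SQL NULL through unchanged
        }
        try {
            // LocalDate.of throws DateTimeException for values such as
            // month 0 or February 30.
            return LocalDate.of(year, month, day).format(DATE_FORMATTER);
        } catch (DateTimeException e) {
            // Log the offending value and keep going instead of failing the task.
            LOG.warning("Skipping invalid date " + year + "-" + month + "-" + day
                    + ": " + e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(formatOrNull(2023, 12, 21));
        System.out.println(formatOrNull(2023, 2, 30));
    }
}
```

Whether an invalid value should become NULL in the sink or cause the whole record to be dropped is a design choice the thread does not settle; the sketch picks NULL only for illustration.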

github-actions[bot] commented 1 week ago

We have marked this issue as stale because it has been inactive for 6 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to StarRocks!