StarRocks / starrocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
https://starrocks.io
Apache License 2.0

Routine Load job with max_error_number set to 300000, the same as the error detection window, is expected to ignore all error data, but when it actually consumes error data from Kafka it gets stuck and cannot continue consuming #41333

Closed wypzj closed 1 week ago

wypzj commented 6 months ago

The same code runs in Doris without any problem.

Steps to reproduce the behavior (Required)

create routine load crm_template_job on crm_template
columns(id, name, ai_type, airobot_firm, airobot_id, params, tenant_id, keyword_group_id, robot_type, default_concurrency_count, industry_scene_id, art_template_id, dm_flag, status, latest_version, art_template_name, created_by, updated_by, open_dialect, dialogue_server, language_code, asr_vendor, tts_vendor, nlu_vendor, intent_library_id, language_info, recordist_id, recordist_name, recordist_phone, statics_switch, additional_recording_nums, additional_recordings, info_collect_config, info_collect_switch, broadcast_mode, template_product_id, template_product_name, extra_info, art_tag_ids, strategy_id)
properties(
    "desired_concurrent_number" = "3",
    "max_batch_interval" = "5",
    "max_batch_rows" = "300000",
    "max_error_number" = "300000",
    "max_batch_size" = "209715200",
    "strict_mode" = "false",
    "format" = "json",
    "json_root" = "$.after"
)
FROM KAFKA (
    "kafka_broker_list" = "",
    "kafka_topic" = "",
    "property.group.id" = "",
    "property.kafka_default_offsets" = "OFFSET_BEGINNING"
);

The error log follows:

Error: Not found: unable to find key: $.after
/build/starrocks/be/src/exec/json_parser.cpp:191 JsonFunctions::extract_from_object(*row, _root_paths, &val).
Row: parser current location: {"before":{"id":5296,"name":null,"ai_type":null,"airobot_firm":null,"airobot_id":null,"params":"{}","tenant_id":0,"created_at":0,"updated_at":null,"keyword_group_id":null,"robot_type":1,"default_concurrency_count":0,"industry_scene_id":null,"art_template_id":null,"dm_flag":0,"status":0,"latest_version":null,"art_template_name":null,"created_by":null,"updated_by":null,"open_dialect":1,"dialogue_server":null,"language_code":null,"asr_vendor":null,"tts_vendor":null,"nlu_vendor":null,"intent_library_id":null,"language_info":null,"recordist_id":null,"recordist_name":null,"recordist_phone":null,"recordist_assign_time":null,"statics_switch":0,"additional_recordings":0,"additional_recording_nums":0,"info_collect_switch":0,"info_collect_config":null,"broadcast_mode":2,"template_product_id":null,"template_product_name":null,"extra_info":null,"art_tag_ids":null,"strategy_id":null},"after":null,"source":{"version":"1.9.7.Final","connector":"postgresql","name":"salesclue","ts_ms":1708249643580,"snapshot":"false","db":"jingle","sequence":"[\"46823027661504\",\"46823027661504\"]","schema":"public","table":"crm_template","txId":81580319,"lsn":46823027661504,"xmin":null},"op":"d","ts_ms":1708249644045,"transaction":null}
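For context, the row in the log looks like a Debezium-style delete event ("op":"d") whose "after" field is null, so there is nothing to extract under $.after. The Python sketch below is only an illustration of that lookup failing, using a trimmed copy of the row. The names extract_root and split_rows are hypothetical helpers, not StarRocks APIs, and this does not reflect how the BE parser is actually implemented.

```python
import json

# Trimmed-down events modeled on the row in the error log; most columns are
# omitted for brevity. The upsert event is an assumed counterpart for contrast.
DELETE_EVENT = '{"before": {"id": 5296, "name": null}, "after": null, "op": "d"}'
UPSERT_EVENT = '{"before": null, "after": {"id": 5296, "name": "demo"}, "op": "c"}'


def extract_root(message: str, key: str = "after"):
    """Hypothetical stand-in for a json_root lookup of "$.<key>".

    Returns the nested object, or None when the key is missing or null.
    """
    record = json.loads(message)
    value = record.get(key)
    return value if isinstance(value, dict) else None


def split_rows(messages):
    """Separate messages with a usable root object from those without one."""
    loadable, skipped = [], []
    for message in messages:
        root = extract_root(message)
        if root is None:
            skipped.append(message)  # e.g. delete events, where "after" is null
        else:
            loadable.append(root)
    return loadable, skipped


loadable, skipped = split_rows([UPSERT_EVENT, DELETE_EVENT])
print(len(loadable), len(skipped))
```

In this sketch the delete event lands in the skipped list, which is the kind of row I would expect max_error_number to absorb instead of blocking the job.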

Expected behavior (Required)

The routine load task ignores this data and is not blocked.

Real behavior (Required)

The routine load task gets stuck on this data and cannot continue consuming.

StarRocks version (Required)

3.18

github-actions[bot] commented 3 weeks ago

We have marked this issue as stale because it has been inactive for 6 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to StarRocks!