StarRocks / starrocks

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
https://starrocks.io
Apache License 2.0

Importing JSON data with Routine Load fails if the JSON data contains an unescaped character #45273

Open kaijianding opened 5 months ago

kaijianding commented 5 months ago

Steps to reproduce the behavior (Required)

  1. Create a Routine Load job to ingest JSON data.
  2. Send a message whose JSON contains an unescaped character (screenshot: img_v3_02am_88bbf69b-9bcb-4347-8c0a-f0c9044049fg).
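
For readers without access to the screenshot, here is a minimal sketch of what "unescaped character" usually means. The payload is hypothetical (the original message is not shown in this issue), and Python's standard `json` module stands in for the parser StarRocks uses. It assumes the offending character is a raw control character inside a string, which the JSON grammar requires to be escaped.

```python
import json

# Hypothetical payloads, not taken from the issue.
# bad_payload holds a raw line feed (0x0A) inside the string value;
# good_payload holds the two-character escape sequence \n instead.
bad_payload = '{"id": 1, "msg": "line one\nline two"}'
good_payload = '{"id": 1, "msg": "line one\\nline two"}'


def is_valid_json(text: str) -> bool:
    """Return True if a strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(bad_payload))   # strict parsing rejects the raw control character
print(is_valid_json(good_payload))  # the escaped form parses normally
```

Python can be told to tolerate such characters with `json.loads(text, strict=False)`; whether StarRocks should offer a comparable lenient mode is what this issue is about.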

Expected behavior (Required)

  1. The task succeeds, with this row skipped.
  2. errorRows increases.
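
The expected behavior can be sketched as row-level error handling: each row is parsed on its own, a malformed row is counted and skipped, and the rest of the batch still loads. This is an illustration only, not StarRocks code; `load_batch` and the sample rows are hypothetical names and data.

```python
import json


def load_batch(raw_rows):
    """Parse each row independently; skip malformed rows and count them."""
    loaded = []
    error_rows = 0
    for raw in raw_rows:
        try:
            loaded.append(json.loads(raw))
        except json.JSONDecodeError:
            # A bad row increments the error counter instead of
            # aborting the whole task.
            error_rows += 1
    return loaded, error_rows


# The second row contains a raw tab character inside a string, which is invalid JSON.
batch = ['{"id": 1}', '{"id": 2, "msg": "a\tb"}', '{"id": 3}']
loaded, error_rows = load_batch(batch)
print(loaded, error_rows)
```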

Real behavior (Required)

Routine load tasks fail repeatedly:

2024-05-08 09:24:04,644 WARN (thrift-server-pool-667203|857580) [RoutineLoadJob.executeTaskOnTxnStatusChanged():1048] routine load task [job name xxxx, task id 36869260-d0ce-4ebd-bec8-423d6d78e08f] aborted because of Failed to iterate document stream as object. error: Within strings, some characters must be escaped, we found unescaped characters, remove old task and generate new one

StarRocks version (Required)

rickif commented 5 months ago

Thanks. I'll look into this issue.

jaogoy commented 5 months ago

We need to support automatic escaping in JSON strings.
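
Until such support exists, one possible client-side workaround is to escape raw control characters before the data reaches the load job. The sketch below is an assumption about what "auto escape" could look like, not the StarRocks implementation; `escape_control_chars` is a hypothetical helper. It only handles control characters (U+0000 to U+001F) inside string literals and cannot repair other defects such as an unescaped double quote.

```python
import json


def escape_control_chars(text: str) -> str:
    """Escape raw control characters found inside JSON string literals."""
    out = []
    in_string = False
    escaped = False  # True when the previous character was a backslash inside a string
    for ch in text:
        if not in_string:
            if ch == '"':
                in_string = True
            out.append(ch)
        elif escaped:
            # Character following a backslash: copy it through unchanged.
            escaped = False
            out.append(ch)
        elif ch == "\\":
            escaped = True
            out.append(ch)
        elif ch == '"':
            in_string = False
            out.append(ch)
        elif ord(ch) < 0x20:
            # Raw control character inside a string: emit a \uXXXX escape.
            out.append(f"\\u{ord(ch):04x}")
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical message with a raw tab and a raw line feed inside the string.
raw = '{"msg": "a\tb\nc"}'
fixed = escape_control_chars(raw)
print(json.loads(fixed))
```

Whitespace outside string literals is left untouched, so pretty-printed JSON keeps its layout.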