apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0

[Bug] [seatunnel] Using the SQL transform to add two fields without AS aliases leaves those two fields empty in the target table #8012

Closed dwave closed 1 week ago

dwave commented 1 week ago

What happened

I used the SQL transform to add two fields but did not give them aliases with AS. As a result, no data was written to those two fields in the target table.

SeaTunnel Version

2.3.8

SeaTunnel Config

env {
"job.mode"=BATCH
"job.name"="SeaTunnel_Job"
"savemode.execute.location"=CLUSTER
}
source {
Jdbc {
    "connection_check_timeout_sec"="30"
    "fetch_size"="0"
    "use_select_count"="false"
    "skip_analyze"="false"
    "split.size"="80960"
    "split.even-distribution.factor.upper-bound"="100.0"
    "split.even-distribution.factor.lower-bound"="0.05"
    "split.sample-sharding.threshold"="1000"
    "split.inverse-sampling.rate"="1000"
    parallelism="1"
    "result_table_name"=Table15598171050848
    query="SELECT \"item_id\", \"item_code\", \"item_name\", \"sub_id\", \"sub_code\", \"sub_name\", \"big_id\", \"big_code\", \"big_name\", \"indw_time\", \"tenant_id\", \"updw_time\", \"core_dish_flag\", \"area_id\", \"item_crid\", \"stand_price\", \"item_property\", \"item_desc\", \"item_stat\" FROM \"db\".\"dim\".\"dim_item\""
    user=""
    url=""
    password=""
    driver="org.postgresql.Driver"
}
}
transform {
Sql {
    query="select * , cast (CURRENT_DATE() as string), cast (CURRENT_TIME() as string)  from source"
    "result_table_name"=Table15598171050849
    "source_table_name"=Table15598171050848
}
}
sink {
StarRocks {
    "batch_max_rows"="102400"
    "batch_max_bytes"="5242880"
    "enable_upsert_delete"="false"
    "schema_save_mode"="CREATE_SCHEMA_WHEN_NOT_EXIST"
    "data_save_mode"="APPEND_DATA"
    "save_mode_create_template"="CREATE TABLE IF NOT EXISTS `${database}`.`${table}` (\n${rowtype_primary_key},\n${rowtype_fields}\n) ENGINE=OLAP\n PRIMARY KEY (${rowtype_primary_key})\nDISTRIBUTED BY HASH (${rowtype_primary_key})PROPERTIES (\n    \"replication_num\" = \"1\" \n)"
    "http_socket_timeout_ms"="180000"
    "source_table_name"=Table15598171050849
    table="dim_item_with_ingestion_time_v2"
    database=test
    nodeUrls=[
        "10.9.99.31:8030"
    ]
    username=root
    password=
    base-url="jdbc:mysql://10.9.99.31:9030/test"
}
}

Running Command

The job was generated with SeaTunnel Web 1.0.2, then executed separately through SeaTunnel Web and DolphinScheduler.

Error Exception

No error was reported.

Zeta or Flink or Spark Version

Zeta

Java or Scala Version

No response

Screenshots

No response

liunaijie commented 1 week ago

I don't think this is a bug.

You are using cast (CURRENT_DATE() as string) to add a new column, but without renaming the result, and you expect it to be written to the ingestion_date column. How would the system know the mapping?
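
For illustration, a minimal sketch of the transform block with explicit aliases, so that each computed column has a name the sink can match against. The name `ingestion_date` comes from the comment above; `ingestion_time` is an assumed name for the second column and should be replaced with whatever the target table actually uses.

```hocon
transform {
Sql {
    # Same query as in the reported config, with an AS alias on each computed column.
    # ingestion_date is the column named in the reply; ingestion_time is an assumed name.
    query="select * , cast (CURRENT_DATE() as string) as ingestion_date, cast (CURRENT_TIME() as string) as ingestion_time from source"
    "result_table_name"=Table15598171050849
    "source_table_name"=Table15598171050848
}
}
```

With the aliases in place, the two added columns have explicit names that can be matched against the target table's columns, which appears to be what the unaliased expressions were missing.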