apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0

[Feature][Connector-v2-doris] Can the Doris connector support Stream Load directly to multiple BE nodes? #7092

Open chess3cake opened 4 months ago

chess3cake commented 4 months ago

Search before asking

Description

When the SeaTunnel servers and the Doris BE nodes are in different networks, a Stream Load request sent to the FE may fail with an HTTP 307 redirect or an unauthorized error.

For example, suppose the SeaTunnel servers run in K8s-A and the Doris cluster runs in K8s-B. The Doris Connector-V2 sends Stream Load requests to the Doris FE API. The FE redirects each request to the internal IP of one of the Doris BE nodes in K8s-B, which is not reachable from K8s-A, so the load fails.

https://doris.apache.org/docs/data-operate/import/stream-load-manual?_highlight=streamload#coding-with-streamload

https://doris.apache.org/docs/faq/data-faq#q1-use-stream-load-to-access-fes-public-network-address-to-import-data-but-is-redirected-to-the-intranet-ip

https://github.com/apache/doris/issues/8100
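
To make the request concrete, here is a minimal sketch of what "Stream Load directly to multiple BE nodes" could look like on the client side. It is only an illustration, not existing SeaTunnel code: the class `DirectBeStreamLoad`, the host names, and the idea of a user-supplied BE list are all assumptions. It relies on the BE HTTP port (8040 by default) accepting Stream Load at `/api/{db}/{table}/_stream_load`, so the FE redirect is never involved.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: picks a BE node round-robin and builds a direct Stream Load URL. */
final class DirectBeStreamLoad {
    // Externally reachable BE addresses in "host:http_port" form (assumed to come from user config).
    private final List<String> beNodes;
    private final AtomicInteger next = new AtomicInteger();

    DirectBeStreamLoad(List<String> beNodes) {
        if (beNodes == null || beNodes.isEmpty()) {
            throw new IllegalArgumentException("at least one BE node is required");
        }
        this.beNodes = List.copyOf(beNodes);
    }

    /** Returns the Stream Load URL on the next BE in the list, so no FE redirect is needed. */
    String nextLoadUrl(String database, String table) {
        // floorMod keeps the index valid even if the counter overflows to a negative value.
        int index = Math.floorMod(next.getAndIncrement(), beNodes.size());
        return "http://" + beNodes.get(index) + "/api/" + database + "/" + table + "/_stream_load";
    }

    public static void main(String[] args) {
        // Example host names are placeholders, not real endpoints.
        DirectBeStreamLoad loader =
                new DirectBeStreamLoad(List.of("be1.example.com:8040", "be2.example.com:8040"));
        System.out.println(loader.nextLoadUrl("demo_db", "demo_table"));
        System.out.println(loader.nextLoadUrl("demo_db", "demo_table"));
    }
}
```

Round-robin is just the simplest way to spread load across the listed nodes; a real implementation would probably also need to skip BE nodes that are down.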

Usage Scenario

No response

Related issues

No response

Are you willing to submit a PR?

Code of Conduct

chess3cake commented 4 months ago

Also, in org.apache.seatunnel.connectors.doris.rest.RestService.findPartitions(), the _query_plan API returns the internal BE addresses. If the SeaTunnel server cannot reach those addresses, the task fails.
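
One possible workaround for the read path, sketched here only as an idea: translate the BE addresses returned by the query plan through a user-supplied mapping before connecting. The class `BeAddressMapper`, the mapping table, and the example addresses are hypothetical; nothing like this exists in the connector as far as this issue shows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: rewrites internal BE addresses to externally reachable ones. */
final class BeAddressMapper {
    // Keys are internal "host:port" strings as returned by Doris; values are the reachable
    // equivalents. The mapping is assumed to be provided by the user.
    private final Map<String, String> internalToExternal;

    BeAddressMapper(Map<String, String> internalToExternal) {
        this.internalToExternal = Map.copyOf(internalToExternal);
    }

    /** Returns the mapped address, or the original address when no mapping is configured. */
    String map(String address) {
        return internalToExternal.getOrDefault(address, address);
    }

    /** Maps every address in a list, preserving order. */
    List<String> mapAll(List<String> addresses) {
        List<String> mapped = new ArrayList<>(addresses.size());
        for (String address : addresses) {
            mapped.add(map(address));
        }
        return mapped;
    }

    public static void main(String[] args) {
        // Example addresses are placeholders.
        BeAddressMapper mapper =
                new BeAddressMapper(Map.of("10.0.0.5:9060", "be1.example.com:9060"));
        System.out.println(mapper.mapAll(List.of("10.0.0.5:9060", "10.0.0.6:9060")));
    }
}
```

Unmapped addresses pass through unchanged, so the behavior stays the same for clusters where the BE nodes are already reachable.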

github-actions[bot] commented 3 months ago

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.