apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0

[Bug] transport: Error while dialing: dial tcp 0.0.0.0:9090 #36340

Closed zhaopinglu closed 1 week ago

zhaopinglu commented 2 months ago

Version

Doris v2.1.3

Python packages: adbc-driver-flightsql 1.0.0, pydoris 1.0.5.3, pydoris-client 1.0.4

What's Wrong?

OperationalError: IO: [FlightSQL] connection error: desc = "transport: Error while dialing: dial tcp 0.0.0.0:9090: connectex: No connection could be made because the target machine actively refused it." (Unavailable; DoGet: endpoint 0: [uri:"grpc+tcp://0.0.0.0:9090"])
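
When triaging this error, the endpoint the client tried to dial can be pulled straight out of the message text. The following is a minimal sketch, assuming the message keeps the `uri:"..."` form shown above; `extract_endpoint_uris` is a hypothetical helper written for illustration and is not part of adbc-driver-flightsql or pydoris.

```python
import re
from typing import List


def extract_endpoint_uris(message: str) -> List[str]:
    """Return every URI that appears as uri:"..." in an error message.

    Hypothetical helper for illustration only; it is not part of any driver.
    """
    return re.findall(r'uri:"([^"]+)"', message)


# Error text copied from the report above.
error_text = (
    'IO: [FlightSQL] connection error: desc = "transport: Error while dialing: '
    'dial tcp 0.0.0.0:9090: connectex: No connection could be made because the '
    'target machine actively refused it." '
    '(Unavailable; DoGet: endpoint 0: [uri:"grpc+tcp://0.0.0.0:9090"])'
)

print(extract_endpoint_uris(error_text))
```

If the extracted host differs from the host passed to `flight_sql.connect`, the address was supplied by the server rather than by the client configuration.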

Possibly related issue:

fix Fix Arrow Flight bind wrong Host in Fqdn #34850

What You Expected?

## 1

##################
 dbapi_adbc_execute_fetchallarrow, cost:0:00:00.033296, bytes:101, len(arrow_data):6
## 2
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6 entries, 0 to 5
Data columns (total 1 columns):
 #   Column    Non-Null Count  Dtype
 0   Database  6 non-null      object
dtypes: object(1)
memory usage: 547.0 bytes
None
## 3
              Database
0    __internal_schema
1     arrow_flight_sql
2   information_schema
3                mysql
4  pydoris_client_test

How to Reproduce?

Run the Python script remotely (not on the same VM as the Doris BE):

import adbc_driver_manager
import adbc_driver_flightsql.dbapi as flight_sql
import pandas
from datetime import datetime
#import pyarrow as pa
#import pyarrow.flight as fl

conn = flight_sql.connect(uri="grpc://x.x.x.x:9090", db_kwargs={
            adbc_driver_manager.DatabaseOptions.USERNAME.value: "root",
            adbc_driver_manager.DatabaseOptions.PASSWORD.value: ""
        },
        autocommit=False
)
cursor = conn.cursor()

#cursor.execute("DROP DATABASE IF EXISTS arrow_flight_sql FORCE;")
#cursor.execute("select now();")
cursor.execute("show databases;")

start_time = datetime.now()
arrow_data = cursor.fetchallarrow()
dataframe = arrow_data.to_pandas()

print("## 1")
print("\n##################\n dbapi_adbc_execute_fetchallarrow" +
      ", cost:" + str(datetime.now() - start_time) +
      ", bytes:" + str(arrow_data.nbytes) +
      ", len(arrow_data):" + str(len(arrow_data))
)
print("## 2")
print(dataframe.info(memory_usage='deep'))
print("## 3")
print(dataframe)

Anything Else?

The TCP packets captured with Wireshark show that the Doris BE server returns 'grpc+tcp://0.0.0.0:9090' to the client. This could be why the client tries to connect to the endpoint 0.0.0.0:9090.
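
To illustrate why such an endpoint fails for a remote client: 0.0.0.0 is the unspecified address, so it is only meaningful as a bind address and not as a destination another machine can dial. The sketch below uses only the Python standard library; `is_dialable_from_remote` is a hypothetical name chosen for illustration, and the check deliberately does not resolve hostnames.

```python
import ipaddress
from urllib.parse import urlsplit


def is_dialable_from_remote(endpoint_uri: str) -> bool:
    """Return False when the endpoint host is an unspecified or loopback IP.

    Hypothetical helper for illustration only. Hostnames are not resolved,
    so a non-IP host is treated as dialable.
    """
    host = urlsplit(endpoint_uri).hostname
    if host is None:
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal (for example an FQDN); resolution is out of scope.
        return True
    return not (address.is_unspecified or address.is_loopback)


print(is_dialable_from_remote("grpc+tcp://0.0.0.0:9090"))   # endpoint from the capture
print(is_dialable_from_remote("grpc+tcp://10.0.2.4:9090"))  # an ordinary routable-looking IP
```

A client on the same machine as the server can still succeed with such an endpoint, because dialing 0.0.0.0 or 127.0.0.1 there reaches the local listener; a remote client cannot.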

drreddy commented 2 months ago

Hi, +1. I'm facing the same issue. Were you able to figure out the cause or a fix?

xinyiZzz commented 2 months ago

Hi @zhaopinglu @drreddy

  1. Check the Doris BE version and upgrade to 2.1.3-rc08 or above; #34850 was fixed in 2.1.3-rc08.
  2. Use the FE host to connect: //host:9090
  3. Share the results of show frontends; and show backends;
  4. Are you using FQDN?

drreddy commented 2 months ago

Hi @xinyiZzz

We upgraded our cluster to doris-2.1.4-rc03-e93678fd1e and retried, but we still hit the same issue. We are not using FQDN anywhere.

xinyiZzz commented 2 months ago

If you could provide the following information, it would help us analyze this issue.

  1. show frontends; execution results
  2. show backends; execution results
  3. fe.log partial log of query from start to end.
  4. be.INFO partial log of query from start to end.
drreddy commented 2 months ago

Hi @xinyiZzz

Please find below the output of the queries:

  1. show frontends output:

mysql> show frontends;
+-----------------------------------------+----------+-------------+----------+-----------+---------+--------------------+----------+----------+-----------+------+-------+-------------------+---------------------+---------------------+----------+--------+-----------------------------+------------------+
| Name                                    | Host     | EditLogPort | HttpPort | QueryPort | RpcPort | ArrowFlightSqlPort | Role     | IsMaster | ClusterId | Join | Alive | ReplayedJournalId | LastStartTime       | LastHeartbeat       | IsHelper | ErrMsg | Version                     | CurrentConnected |
+-----------------------------------------+----------+-------------+----------+-----------+---------+--------------------+----------+----------+-----------+------+-------+-------------------+---------------------+---------------------+----------+--------+-----------------------------+------------------+
| fe_584773d2_b483_446a_bad8_040765483c48 | 10.0.2.4 | 9010        | 8030     | 9030      | 9020    | 9090               | FOLLOWER | true     | 339655541 | true | true  | 169799            | 2024-07-02 10:46:27 | 2024-07-03 06:28:13 | true     |        | doris-2.1.4-rc03-e93678fd1e | Yes              |
+-----------------------------------------+----------+-------------+----------+-----------+---------+--------------------+----------+----------+-----------+------+-------+-------------------+---------------------+---------------------+----------+--------+-----------------------------+------------------+
1 row in set (0.02 sec)

  2. show backends output:

mysql> show backends;
+-----------+----------+---------------+--------+----------+----------+--------------------+---------------------+---------------------+-------+----------------------+-----------+------------------+-------------------+---------------+---------------+---------+----------------+--------------------+--------------------------+--------+-----------------------------+-------------------------------------------------------------------------------------------------------------------------------+-------------------------+----------+
| BackendId | Host     | HeartbeatPort | BePort | HttpPort | BrpcPort | ArrowFlightSqlPort | LastStartTime       | LastHeartbeat       | Alive | SystemDecommissioned | TabletNum | DataUsedCapacity | TrashUsedCapacity | AvailCapacity | TotalCapacity | UsedPct | MaxDiskUsedPct | RemoteUsedCapacity | Tag                      | ErrMsg | Version                     | Status                                                                                                                        | HeartbeatFailureCounter | NodeRole |
+-----------+----------+---------------+--------+----------+----------+--------------------+---------------------+---------------------+-------+----------------------+-----------+------------------+-------------------+---------------+---------------+---------+----------------+--------------------+--------------------------+--------+-----------------------------+-------------------------------------------------------------------------------------------------------------------------------+-------------------------+----------+
| 13058     | 10.0.2.5 | 9050          | 9060   | 8040     | 8060     | 9091               | 2024-07-02 10:46:58 | 2024-07-03 06:28:53 | true  | false                | 3645      | 789.687 MB       | 1.845 GB          | 124.288 GB    | 127.936 GB    | 2.85 %  | 2.85 %         | 0.000              | {"location" : "default"} |        | doris-2.1.4-rc03-e93678fd1e | {"lastSuccessReportTabletsTime":"2024-07-03 06:28:34","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} | 0                       | mix      |
| 13059     | 10.0.2.6 | 9050          | 9060   | 8040     | 8060     | 9091               | 2024-07-02 10:47:24 | 2024-07-03 06:28:53 | true  | false                | 3779      | 3.440 GB         | 245.602 MB        | 123.218 GB    | 127.936 GB    | 3.69 %  | 3.69 %         | 0.000              | {"location" : "default"} |        | doris-2.1.4-rc03-e93678fd1e | {"lastSuccessReportTabletsTime":"2024-07-03 06:28:02","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} | 0                       | mix      |
| 62290     | 10.0.2.7 | 9050          | 9060   | 8040     | 8060     | 9091               | 2024-07-02 10:47:54 | 2024-07-03 06:28:53 | true  | false                | 3777      | 3.283 GB         | 488.622 MB        | 123.146 GB    | 127.936 GB    | 3.74 %  | 3.74 %         | 0.000              | {"location" : "default"} |        | doris-2.1.4-rc03-e93678fd1e | {"lastSuccessReportTabletsTime":"2024-07-03 06:28:23","lastStreamLoadTime":-1,"isQueryDisabled":false,"isLoadDisabled":false} | 0                       | mix      |
+-----------+----------+---------------+--------+----------+----------+--------------------+---------------------+---------------------+-------+----------------------+-----------+------------------+-------------------+---------------+---------------+---------+----------------+--------------------+--------------------------+--------+-----------------------------+-------------------------------------------------------------------------------------------------------------------------------+-------------------------+----------+
3 rows in set (0.00 sec)

Regarding the logs, I'm not sure which query you want logs for. Also, I have never worked with Doris logs; could you guide me on how to get the logs for a specific query?

chengxin1374 commented 1 month ago

I'm also using Doris 2.1.4-rc03 and have the same problem. Have you solved it?

denniskyleung commented 2 weeks ago

Same issue using doris-2.1.5-rc02. However, the ADBC connection works with doris-2.1.2-rc04 with the same configuration.

xinyiZzz commented 1 week ago

@denniskyleung @zhaopinglu @chengxin1374 @drreddy

Fixed in https://github.com/apache/doris/pull/40002, sorry for taking so long.

This problem occurs when query results are returned from the FE Arrow Flight server (for example show tables; or show databases;) and the Doris FE and the ADBC client are not on the same machine.

There is no problem when query results are returned from the BE Arrow Flight server, as with a normal query statement such as select * from xxx.