secretflow / scql

SCQL (Secure Collaborative Query Language) is a system that allows multiple distrusting parties to run joint analysis without revealing their private data.
https://www.secretflow.org.cn/docs/scql/en/
Apache License 2.0

SCQL broker: calling the asynchronous result-fetch endpoint (/intra/query/fetch) returns an error. #340

Closed a-b-c-3-2-1 closed 3 weeks ago

a-b-c-3-2-1 commented 2 months ago

Issue Type

Install/Build

Have you searched for existing issues?

Yes

OS Platform and Distribution

centos

SCQL Version

0.8.1b1

What happened and what you expected to happen.

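In SCQL P2P mode, I call the query endpoints within the same project.

1: The synchronous query task runs successfully and returns its result.

Endpoint: /intra/query

Parameters: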
{
"project_id": "yyy",  
"query": "SELECT ta.ID FROM ta INNER JOIN tb ON ta.ID = tb.ID;"
}

2: The asynchronous query task is submitted successfully, but fetching the result fails.

Submit endpoint: /intra/query/submit
Parameters:
{
"project_id": "yyy",  
"query": "SELECT ta.ID FROM ta INNER JOIN tb ON ta.ID = tb.ID;"
}

Response:
{
    "status": {
        "code": 0,
        "message": "submit query job 7bd62808-65ac-11ef-aa8e-***** succeed",
        "details": [

        ]
    },
    "job_id": "7bd62808-65ac-11ef-aa8e-*****"
}

Fetch-result endpoint: /intra/query/fetch

Parameters:

{
"job_id": "7bd62808-65ac-11ef-aa8e-*****"
}

Error message:

{"status":{"code":300,"message":"FetchResult: failed in session result check: QueryResponse error: status = code:300 message:\"job has failed in engine, likely due to either engine crash or being OOM killed\"","details":[]},"result":null,"job_status":null}
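The two-step asynchronous flow described above can be sketched in Python. This is a minimal illustration, not an official SCQL client: `post_json`, `submit_and_fetch`, and the base URL are hypothetical names, and the only response fields used are those visible in this report (`status.code`, `status.message`, `job_id`). It fetches once and does not poll, because the status code for a still-running job does not appear in this thread.

```python
import json
import urllib.request


def post_json(url, payload):
    """POST a JSON body and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def submit_and_fetch(base_url, project_id, query, post=post_json):
    """Submit an asynchronous query, then fetch its result once.

    `post` is injectable so the flow can be exercised without a broker.
    Raises RuntimeError when either call reports a non-zero status code.
    """
    submitted = post(
        base_url + "/intra/query/submit",
        {"project_id": project_id, "query": query},
    )
    if submitted["status"]["code"] != 0:
        raise RuntimeError(submitted["status"]["message"])

    fetched = post(
        base_url + "/intra/query/fetch",
        {"job_id": submitted["job_id"]},
    )
    if fetched["status"]["code"] != 0:
        # In this issue the fetch returned code 300 with a
        # "job has failed in engine" message.
        raise RuntimeError(fetched["status"]["message"])
    return fetched
```

A call might look like `submit_and_fetch("http://localhost:8080", "yyy", "SELECT ta.ID FROM ta INNER JOIN tb ON ta.ID = tb.ID;")`, where the host and port are assumptions about where the broker's intra server listens.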

Configuration used to run SCQL.

SCQL log output.

Partial error logs:

engine Alice

2024-08-29 03:07:22.927 [info] [engine_service_impl.cc:RunPlanCore:537] [job(cf867f15-65b3-11ef-aa8e-*****)] session(cf867f15-65b3-11ef-aa8e-*****) run plan policy succ
2024-08-29 03:07:22.927 [info] [engine_service_impl.cc:RunPlanSync:571] [job(cf867f15-65b3-11ef-aa8e-*****)] RunExecutionPlan success, sessionID=cf867f15-65b3-11ef-aa8e-*****
2024-08-29 03:07:22.928 [error] [http_rpc_protocol.cpp:BRPC:1640] [scqlengine] Invalid host= port=80
2024-08-29 03:07:22.928 [error] [channel.cpp:BRPC:249] [scqlengine] Fail to parse address=`//broker:8080/intra/cb/engine'
2024-08-29 03:07:22.928 [warning] [engine_service_impl.cc:ReportResult:471] [job(cf867f15-65b3-11ef-aa8e-*****)] ReportResult(cf867f15-65b3-11ef-aa8e-*****) failed, catch std::exception=[engine/link/channel_manager.cc:57] BrpcChannel Init failed, ret=-1, remote_addr=//broker:8080/intra/cb/engine, load_balancer=2, role=, protocol=http
2024-08-29 03:07:22.929 [info] [session_manager.cc:RemoveSession:226] [scqlengine] session(cf867f15-65b3-11ef-aa8e-*****) removed, running_cost(167ms), current running session=0
2024-08-29 03:07:32.316 [warning] [session_manager.cc:GetSession:156] [scqlengine] session(cf867f15-65b3-11ef-aa8e-*****) not exists. default return nullptr.
2024-08-29 03:07:32.316 [error] [engine_service_impl.cc:QueryJobStatus:369] [scqlengine] QueryJobStatus failed, job(cf867f15-65b3-11ef-aa8e-*****) not found
2024-08-29 03:07:32.317 [warning] [session_manager.cc:GetSession:156] [scqlengine] session(cf867f15-65b3-11ef-aa8e-*****) not exists. default return nullptr.
2024-08-29 03:07:32.317 [warning] [session.cc:ActiveLogger:336] [scqlengine] can not get valid session
2024-08-29 03:07:32.317 [info] [engine_service_impl.cc:StopJob:173] [scqlengine] EngineServiceImpl::StopJob(cf867f15-65b3-11ef-aa8e-*****), reason()
2024-08-29 03:07:32.317 [warning] [session_manager.cc:StopSession:174] [scqlengine] session(cf867f15-65b3-11ef-aa8e-*****) not exists.

engine BOB

2024-08-29 02:53:45.913 [info] [engine_service_impl.cc:RunPlanCore:537] [job(cf867f15-65b3-11ef-aa8e-*****)] session(cf867f15-65b3-11ef-aa8e-*****) run plan policy succ
2024-08-29 02:53:45.914 [info] [engine_service_impl.cc:RunPlanSync:571] [job(cf867f15-65b3-11ef-aa8e-*****)] RunExecutionPlan success, sessionID=cf867f15-65b3-11ef-aa8e-*****
2024-08-29 02:53:45.915 [error] [http_rpc_protocol.cpp:BRPC:1640] [scqlengine] Invalid host= port=80
2024-08-29 02:53:45.915 [error] [channel.cpp:BRPC:249] [scqlengine] Fail to parse address=`//broker:8080/intra/cb/engine'
2024-08-29 02:53:45.915 [warning] [engine_service_impl.cc:ReportResult:471] [job(cf867f15-65b3-11ef-aa8e-*****)] ReportResult(cf867f15-65b3-11ef-aa8e-*****) failed, catch std::exception=[engine/link/channel_manager.cc:57] BrpcChannel Init failed, ret=-1, remote_addr=//broker:8080/intra/cb/engine, load_balancer=2, role=, protocol=http
2024-08-29 02:53:45.915 [info] [session_manager.cc:RemoveSession:226] [scqlengine] session(cf867f15-65b3-11ef-aa8e-*****) removed, running_cost(148ms), current running session=0
2024-08-29 02:54:41.316 [warning] [session_manager.cc:GetSession:156] [scqlengine] session(cf867f15-65b3-11ef-aa8e-*****) not exists. default return nullptr.
2024-08-29 02:54:41.316 [error] [engine_service_impl.cc:QueryJobStatus:369] [scqlengine] QueryJobStatus failed, job(cf867f15-65b3-11ef-aa8e-*****) not found
2024-08-29 02:54:41.317 [warning] [session_manager.cc:GetSession:156] [scqlengine] session(cf867f15-65b3-11ef-aa8e-*****) not exists. default return nullptr.
2024-08-29 02:54:41.317 [warning] [session.cc:ActiveLogger:336] [scqlengine] can not get valid session
2024-08-29 02:54:41.317 [info] [engine_service_impl.cc:StopJob:173] [scqlengine] EngineServiceImpl::StopJob(cf867f15-65b3-11ef-aa8e-*****), reason()
2024-08-29 02:54:41.317 [warning] [session_manager.cc:StopSession:174] [scqlengine] session(cf867f15-65b3-11ef-aa8e-*****) not exists.
a-b-c-3-2-1 commented 2 months ago

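Hello, judging from your engine logs, the engine cannot reach SCDB. Are your engine and SCDB on the same machine?

I'm using P2P mode, so there shouldn't be any SCDB, right?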

BrainWH commented 2 months ago

Hello, could you share your Docker Compose configuration?

a-b-c-3-2-1 commented 2 months ago

Hello, could you share your Docker Compose configuration?

version: '3.8'
services:
  broker:
    image: secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/scql:latest
    command:

BrainWH commented 2 months ago

Hello, could you post your config.yml and gflags.conf? Is your MySQL configured locally?

a-b-c-3-2-1 commented 2 months ago

Hello, could you post your config.yml and gflags.conf? Is your MySQL configured locally?

config.yml

intra_server:
  host: 0.0.0.0
  port: 8080
inter_server:
  host: 0.0.0.0
  port: 8081
log_level: debug
party_code: alice
session_expire_time: 24h
session_expire_check_time: 1m
party_info_file: "/home/admin/configs/party_info.json"
private_key_path: "/home/admin/configs/ed25519key.pem"
intra_host: broker:8080
engine:
  timeout: 120s
  protocol: http
  content_type: application/json
  uris:

gflags.conf

--listen_port=8003
--datasource_router=embed
--enable_driver_authorization=false
--server_enable_ssl=false
--driver_enable_ssl_as_client=false
--peer_engine_enable_ssl_as_client=false
--embed_router_conf={"datasources": [{"id": "ds001", "name": "mysql db", "kind": "MYSQL", "connection_str": "db=alice;user=mysql;password=password;host=192.168..;auto-reconnect=true"}], "rules": [{"db": "", "table": "", "datasource_id": "ds001"}]}
--enable_self_auth=false
--enable_peer_auth=false

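The configuration is shown above. The database is a local one and does not run in Docker. Synchronous queries return results; only asynchronous queries fail. Thanks.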

tongke6 commented 2 months ago

Please post more complete engine logs. It looks like the callback address is wrong.

a-b-c-3-2-1 commented 2 months ago

Please post more complete engine logs. It looks like the callback address is wrong.

2024-08-29 02:53:45.915 [error] [http_rpc_protocol.cpp:BRPC:1640] [scqlengine] Invalid host= port=80
2024-08-29 02:53:45.915 [error] [channel.cpp:BRPC:249] [scqlengine] Fail to parse address=`//broker:8080/intra/cb/engine'

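I think these two engine log lines point to the main cause of the error. They should correspond to the line intra_host: broker:8080 in config.yml. I'm not sure exactly how this value should be written. Is there an example?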

tongke6 commented 2 months ago

Please refer to the configuration documentation: https://www.secretflow.org.cn/zh-CN/docs/scql/0.9.0b1/reference/p2p-deploy-config

a-b-c-3-2-1 commented 2 months ago

Please refer to the configuration documentation: https://www.secretflow.org.cn/zh-CN/docs/scql/0.9.0b1/reference/p2p-deploy-config

The configuration documentation (https://www.secretflow.org.cn/zh-CN/docs/scql/0.9.0b1/reference/p2p-deploy-config) says to write it like this:

intra_host: http://broker_alice:8080

The deployment documentation (https://www.secretflow.org.cn/zh-CN/docs/scql/0.9.0b1/topics/deployment/how-to-deploy-p2p-cluster) says to write it like this:

intra_host: broker:8080

I tried both forms, and each produces an error:

2024-09-02 10:19:31.478 [error] [http_rpc_protocol.cpp:BRPC:1640] [scqlengine] Invalid host= port=80
2024-09-02 10:19:31.478 [error] [channel.cpp:BRPC:249] [scqlengine] Fail to parse address=`//http:%2F%2Fbroker:8080/intra/cb/engine'

Or:

2024-08-29 02:53:45.915 [error] [http_rpc_protocol.cpp:BRPC:1640] [scqlengine] Invalid host= port=80
2024-08-29 02:53:45.915 [error] [channel.cpp:BRPC:249] [scqlengine] Fail to parse address=`//broker:8080/intra/cb/engine'

So I'm a bit confused. Is the broker in this field the service name from the docker-compose file,

services:
  broker:

or is it something else? I'd appreciate some guidance.

tongke6 commented 2 months ago

Because the service in docker-compose is named broker, its hostname is broker. So try setting the value to http://broker:8080.

a-b-c-3-2-1 commented 2 months ago

Because the service in docker-compose is named broker, its hostname is broker. So try setting the value to http://broker:8080.

Then the logs turned into this:

2024-09-02 11:04:30.387 [info] [engine_service_impl.cc:RunPlanCore:507] [job(1ff9ad47-691b-11ef-8cfe-0242ac180002)] session(1ff9ad47-691b-11ef-8cfe-0242ac180002) finished executing node(filter_by_index.3), op(FilterByIndex), cost(0)ms
2024-09-02 11:04:30.387 [info] [engine_service_impl.cc:RunPlanCore:494] [job(1ff9ad47-691b-11ef-8cfe-0242ac180002)] session(1ff9ad47-691b-11ef-8cfe-0242ac180002) start to execute node(publish.5), op(Publish)
RecordPublishNodeDetail Start
2024-09-02 11:04:30.387 [info] [engine_service_impl.cc:RunPlanCore:507] [job(1ff9ad47-691b-11ef-8cfe-0242ac180002)] session(1ff9ad47-691b-11ef-8cfe-0242ac180002) finished executing node(publish.5), op(Publish), cost(0)ms
2024-09-02 11:04:30.387 [info] [engine_service_impl.cc:RunPlanCore:537] [job(1ff9ad47-691b-11ef-8cfe-0242ac180002)] session(1ff9ad47-691b-11ef-8cfe-0242ac180002) run plan policy succ
2024-09-02 11:04:30.387 [info] [engine_service_impl.cc:RunPlanSync:571] [job(1ff9ad47-691b-11ef-8cfe-0242ac180002)] RunExecutionPlan success, sessionID=1ff9ad47-691b-11ef-8cfe-0242ac180002
2024-09-02 11:04:30.388 [error] [http_rpc_protocol.cpp:BRPC:1640] [scqlengine] Invalid host= port=80
2024-09-02 11:04:30.388 [error] [channel.cpp:BRPC:249] [scqlengine] Fail to parse address=`//http:%2F%2Fbroker:8080/intra/cb/engine'
2024-09-02 11:04:30.397 [warning] [engine_service_impl.cc:ReportResult:471] [job(1ff9ad47-691b-11ef-8cfe-0242ac180002)] ReportResult(1ff9ad47-691b-11ef-8cfe-0242ac180002) failed, catch std::exception=[engine/link/channel_manager.cc:57] BrpcChannel Init failed, ret=-1, remote_addr=//http:%2F%2Fbroker:8080/intra/cb/engine, load_balancer=2, role=, protocol=http
2024-09-02 11:04:30.397 [info] [session_manager.cc:RemoveSession:226] [scqlengine] session(1ff9ad47-691b-11ef-8cfe-0242ac180002) removed, running_cost(1040ms), current running session=0
2024-09-02 11:05:02.213 [warning] [session_manager.cc:GetSession:156] [scqlengine] session(1ff9ad47-691b-11ef-8cfe-0242ac180002) not exists. default return nullptr.
2024-09-02 11:05:02.213 [error] [engine_service_impl.cc:QueryJobStatus:369] [scqlengine] QueryJobStatus failed, job(1ff9ad47-691b-11ef-8cfe-0242ac180002) not found

tongke6 commented 2 months ago

The broker's intra_server configuration is missing the protocol setting, as shown below:

intra_server:
  protocol: http

After adding this setting, change intra_host to broker:8080 and try again.
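The failing addresses in the engine logs are consistent with a callback URL assembled from an empty scheme plus the intra_host value. The sketch below reproduces the two failing addresses from the logs and shows what the address would presumably look like once a protocol is set. `build_callback_url` is a hypothetical helper written for illustration; it is an assumption about the behavior, not SCQL's actual code.

```python
from urllib.parse import quote

# Path taken from the engine logs in this thread.
CALLBACK_PATH = "/intra/cb/engine"


def build_callback_url(protocol, intra_host):
    """Join scheme, host, and callback path (illustrative only).

    The host part is percent-encoded except for ':', so a value that
    already contains 'http://' turns into 'http:%2F%2F...'. An empty
    protocol leaves the URL without a scheme.
    """
    host = quote(intra_host, safe=":")
    scheme = f"{protocol}:" if protocol else ""
    return f"{scheme}//{host}{CALLBACK_PATH}"


# No protocol, plain host:port (first failing address in the logs).
print(build_callback_url("", "broker:8080"))
# No protocol, scheme placed inside intra_host (second failing address).
print(build_callback_url("", "http://broker:8080"))
# Protocol set and plain host:port, matching the suggested fix.
print(build_callback_url("http", "broker:8080"))
```

Under these assumptions, only the third combination yields an address with both a scheme and a parseable host.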

a-b-c-3-2-1 commented 2 months ago

The broker's intra_server configuration is missing the protocol setting, as shown below:

intra_server:
  protocol: http

After adding this setting, change intra_host to broker:8080 and try again.

That was indeed the problem. Thanks.
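For anyone hitting the same error, the two settings that mattered in this thread can be checked with a small script. This is a hedged sketch based only on what resolved this issue on 0.8.1b1; other versions may expect a different intra_host form, as the configuration reference linked earlier suggests. `check_broker_config` is a hypothetical helper, and it takes the broker config as an already-parsed dictionary, so loading the YAML file is left to the caller.

```python
def check_broker_config(cfg):
    """Return warnings for the two settings discussed in this issue.

    `cfg` is the broker's config.yml parsed into a dict.
    """
    problems = []

    # The fix in this thread was to add `protocol: http` here.
    intra_server = cfg.get("intra_server") or {}
    if not intra_server.get("protocol"):
        problems.append("intra_server.protocol is not set")

    # With the protocol set, intra_host worked as plain host:port.
    intra_host = str(cfg.get("intra_host", ""))
    if "://" in intra_host:
        problems.append("intra_host includes a scheme; try host:port")

    return problems


# Config as originally posted (relevant keys only).
original = {
    "intra_server": {"host": "0.0.0.0", "port": 8080},
    "intra_host": "broker:8080",
}
# Config after the suggested fix.
fixed = {
    "intra_server": {"host": "0.0.0.0", "port": 8080, "protocol": "http"},
    "intra_host": "broker:8080",
}
print(check_broker_config(original))
print(check_broker_config(fixed))
```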
