secretflow / scql

SCQL (Secure Collaborative Query Language) is a system that allows multiple distrusting parties to run joint analysis without revealing their private data.
https://www.secretflow.org.cn/docs/scql/en/
Apache License 2.0

Does the SCQL MPC framework use random ports? #248

Closed songsong124 closed 8 months ago

songsong124 commented 8 months ago

Does the SCQL MPC framework use random ports? With only some ports exposed through a proxy, I get the following error:

2024-02-22 16:43:18.2224 ERROR sync_executor.go:89 |RequestID:|SessionID:67513fc4-d15e-11ee-9e1e-0242ac130002|ActionName:EngineStub@RunExecutionPlan|CostTime:10.00359913s|Reason:InvalidResponse|ErrorMsg:Error: code=320, msg="RunExecutionPlan create session(67513fc4-d15e-11ee-9e1e-0242ac130002) failed, catch std::exception=[external/yacl/yacl/link/context.cc:140] connect to mesh failed, failed to setup connection to rank=0 "|Request:
2024-02-22 16:43:18.2224 ERROR submit_and_get_handler.go:65 |RequestID:1234|SessionID:67513fc4-d15e-11ee-9e1e-0242ac130002|ActionName:SCDBSubmitAndGetHandler@/public/submit_and_get|CostTime:10.010997065s|Reason:InvalidRequest|ErrorMsg:RunExecutionPlan create session(67513fc4-d15e-11ee-9e1e-0242ac130002) failed, catch std::exception=[external/yacl/yacl/link/context.cc:140] connect to mesh failed, failed to setup connection to rank=0 |Request:user:{user:{account_system_type:NATIVE_USER native_user:{name:"alice_node"}}} query:"select a.id from alice_node_test as a join bob_node_test as b on a.id=b.id;" biz_request_id:"1234" db_name:"default"|ClientIP:172.19.0.1
2024-02-22 16:43:18.2224 INFO server.go:146 |GIN|status=200|method=POST|path=/public/submit_and_get|ip=172.19.0.1|latency=10.011045715s|

Chrisdehe commented 8 months ago

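I didn't quite follow. The ports have nothing to do with the MPC framework. Could you post the steps you took and explain in more detail what you are trying to accomplish?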

songsong124 commented 8 months ago

I need to restrict which ports are reachable from outside. Do the two engines communicate with each other over a fixed port?

songsong124 commented 8 months ago

At the moment I have only opened ports 8082 and 8022. [image] But it fails with the error I posted at the top.

Chrisdehe commented 8 months ago

Yes, it is a fixed port. You can also specify the port by changing the engine configuration.

songsong124 commented 8 months ago

What causes this error? The SQL query I am using is: select a.name from alice_node_test as a join bob_node_test as b on a.id=b.id;

ERROR sync_executor.go:89 |RequestID:|SessionID:ed92f5d2-d16a-11ee-9e1e-0242ac130002|ActionName:EngineStub@RunExecutionPlan|CostTime:10.003179638s|Reason:InvalidResponse|ErrorMsg:Error: code=320, msg="RunExecutionPlan create session(ed92f5d2-d16a-11ee-9e1e-0242ac130002) failed, catch std::exception=[external/yacl/yacl/link/context.cc:140] connect to mesh failed, failed to setup connection to rank=0 "|Request:

songsong124 commented 8 months ago

scdb error: ERROR sync_executor.go:89 |RequestID:|SessionID:ed92f5d2-d16a-11ee-9e1e-0242ac130002|ActionName:EngineStub@RunExecutionPlan|CostTime:10.003179638s|Reason:InvalidResponse|ErrorMsg:Error: code=320, msg="RunExecutionPlan create session(ed92f5d2-d16a-11ee-9e1e-0242ac130002) failed, catch std::exception=[external/yacl/yacl/link/context.cc:140] connect to mesh failed, failed to setup connection to rank=0 "|Request:

engine error: 2024-02-22 18:42:29.834 [error] [engine_service_impl.cc:RunExecutionPlan:240] RunExecutionPlan create session(01e26179-d16f-11ee-9e1e-0242ac130002) failed, catch std::exception=[external/yacl/yacl/link/transport/channel.cc:83] Get data timeout, key=connect_1

Judging from the error log, this looks like a communication problem.

jingshi-ant commented 8 months ago

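The error means the engines cannot communicate with each other. Could you describe your deployment environment? Are the engines deployed directly on physical machines, or with Docker? Which port is configured with --listen_port in the engine's gflags.conf?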

jingshi-ant commented 8 months ago

Is engine alice deployed on a different machine? What address is it configured with?

songsong124 commented 8 months ago

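The engines are deployed with Docker. alice runs one engine and bob runs one engine, on two separate machines. The two machines cannot reach each other directly; they can only be connected through ports opened on a proxy. Configuration: --listen_port=8080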

songsong124 commented 8 months ago

[root@ppcp-b hadoop]# docker ps -a
CONTAINER ID   IMAGE                COMMAND                  CREATED       STATUS       PORTS                                       NAMES
a070766b86d2   2c68c888711d         "/home/admin/bin/scd…"   4 hours ago   Up 3 hours   0.0.0.0:8082->8080/tcp, :::8082->8080/tcp   scdb-scdb-1
fc500bce28e2   mpc-sql-flask:v1.0   "python app.py"          4 hours ago   Up 4 hours   0.0.0.0:5000->5000/tcp, :::5000->5000/tcp   mpc-sql-flask
95eaa67738a2   2c68c888711d         "/home/admin/bin/scq…"   4 hours ago   Up 4 hours   0.0.0.0:8022->8080/tcp, :::8022->8080/tcp   engine-engine-1
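The listing above shows that the scdb and engine containers both listen on 8080 internally (matching --listen_port=8080) and are published on host ports 8082 and 8022. As a rough illustration only, the PORTS column can be turned into (host port, container port) pairs with a few lines of Python. The function name parse_ports is hypothetical, and the pattern only covers the "ip:host->container/proto" form seen in this output.

```python
import re

# Matches one published-port entry such as "0.0.0.0:8022->8080/tcp" or ":::8022->8080/tcp".
PORT_RE = re.compile(
    r"(?P<host_ip>\S+):(?P<host_port>\d+)->(?P<container_port>\d+)/(?P<proto>\w+)"
)


def parse_ports(ports_column: str) -> set[tuple[int, int]]:
    """Return the distinct (host_port, container_port) pairs from a PORTS cell."""
    pairs = set()
    for entry in ports_column.split(","):
        match = PORT_RE.fullmatch(entry.strip())
        if match:
            pairs.add((int(match["host_port"]), int(match["container_port"])))
    return pairs


if __name__ == "__main__":
    # PORTS cells copied from the docker ps output in this thread.
    print(parse_ports("0.0.0.0:8082->8080/tcp, :::8082->8080/tcp"))  # scdb-scdb-1
    print(parse_ports("0.0.0.0:8022->8080/tcp, :::8022->8080/tcp"))  # engine-engine-1
```

The IPv4 and IPv6 entries collapse into one pair per container, which makes it easier to see which host port a peer or proxy actually has to reach.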

jingshi-ant commented 8 months ago

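To confirm: scdb and engine bob are on the same machine, and that is where we currently see the error log. Does engine alice on the other machine have any logs? Did it receive the request from scdb?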

songsong124 commented 8 months ago

2024-02-22 18:57:17.612 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:57:21.112 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:57:24.613 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:57:28.113 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:57:31.613 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:57:35.113 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:57:38.614 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:57:42.114 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:57:45.614 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:57:49.114 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:57:52.615 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:57:56.115 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:57:59.615 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:58:03.115 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:58:06.616 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:58:10.116 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:58:13.616 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:58:17.116 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:58:20.617 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:58:24.117 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:58:27.617 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:58:31.117 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:58:34.618 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:58:38.118 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:58:41.618 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:58:45.118 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:58:48.619 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:58:52.119 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:58:55.619 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:58:59.119 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:59:02.620 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:59:06.120 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:59:09.620 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:59:13.120 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:59:16.621 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:59:20.121 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:59:23.621 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:59:27.121 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:59:30.622 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:59:34.122 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out

songsong124 commented 8 months ago

The engine on the other machine keeps printing the log above.

jingshi-ant commented 8 months ago

Are there any other errors? At the network level, you need to make sure the two machines can reach each other.

songsong124 commented 8 months ago

2024-02-22 19:02:17.606 [error] [engine_service_impl.cc:RunExecutionPlan:240] RunExecutionPlan create session(d1c5a9ff-d171-11ee-9e1e-0242ac130002) failed, catch std::exception=[external/yacl/yacl/link/context.cc:140] connect to mesh failed, failed to setup connection to rank=0

songsong124 commented 8 months ago

There are no other logs.

jingshi-ant commented 8 months ago

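The information is a bit scattered. Could you label the machines A and B and organize the scdb/engine logs from each machine? Also, for the network question, you can log into the container directly and use curl to check whether the other side is reachable.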
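If curl is not available inside the container, a plain TCP connect attempt gives similar information. The following is a minimal sketch, not part of SCQL; the function name can_connect and the host in the usage note are placeholders, and a successful TCP connect only shows that the port is reachable, not that the engine service behind it is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Running something like can_connect("peer-engine-host", 8022) from inside each engine container, against the address the other party is configured with, shows whether the path that the engines themselves use is open.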

songsong124 commented 8 months ago

B(engine):

2024-02-22 19:02:17.606 [error] [engine_service_impl.cc:RunExecutionPlan:240] RunExecutionPlan create session(d1c5a9ff-d171-11ee-9e1e-0242ac130002) failed, catch std::exception=[external/yacl/yacl/link/context.cc:140] connect to mesh failed, failed to setup connection to rank=0

[socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:59:23.621 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:59:27.121 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:59:30.622 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out
2024-02-22 18:59:34.122 [warning] [socket.cpp:BRPC:1270] Fail to wait EPOLLOUT of fd=8: Connection timed out

A:

scdb error: ERROR sync_executor.go:89 |RequestID:|SessionID:ed92f5d2-d16a-11ee-9e1e-0242ac130002|ActionName:EngineStub@RunExecutionPlan|CostTime:10.003179638s|Reason:InvalidResponse|ErrorMsg:Error: code=320, msg="RunExecutionPlan create session(ed92f5d2-d16a-11ee-9e1e-0242ac130002) failed, catch std::exception=[external/yacl/yacl/link/context.cc:140] connect to mesh failed, failed to setup connection to rank=0 "|Request:

engine error: 2024-02-22 18:42:29.834 [error] [engine_service_impl.cc:RunExecutionPlan:240] RunExecutionPlan create session(01e26179-d16f-11ee-9e1e-0242ac130002) failed, catch std::exception=[external/yacl/yacl/link/transport/channel.cc:83] Get data timeout, key=connect_1

songsong124 commented 8 months ago

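The engine network should be connected, otherwise both sides would not be printing logs. I am wondering whether the engines use some other port to talk to each other. My network has no direct connectivity, but ports are opened through a proxy; on the proxy I only opened ports 8022 and 8082, which are the engine and scdb ports respectively. In the log on B, what port is 1270?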

jingshi-ant commented 8 months ago

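When the engines execute a task, they establish connections to each other. 1270 is a line number, not a port number. From what we can see so far, scdb can send requests to both engines, but the engines cannot reach each other. Is the address of the engine on A different for scdb than it is for B? What address was set with alter user / create user?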
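To make the "1270 is a line number" point concrete, the bracketed prefix in the engine logs can be split into its parts. This is only a sketch: the reading of the fields as source file, tag or function name, and source line number is an assumption based on the reply above and on the two prefixes seen in this thread, and the name parse_location is hypothetical.

```python
import re

# "[<source file>:<tag or function>:<line number>]" as seen in the engine logs.
LOCATION_RE = re.compile(r"\[(?P<file>[\w.]+):(?P<tag>\w+):(?P<line>\d+)\]")


def parse_location(log_line: str):
    """Return (file, tag, line_number) from the first location prefix, or None."""
    match = LOCATION_RE.search(log_line)
    if match is None:
        return None
    return match["file"], match["tag"], int(match["line"])


if __name__ == "__main__":
    warning = (
        "2024-02-22 18:57:17.612 [warning] [socket.cpp:BRPC:1270] "
        "Fail to wait EPOLLOUT of fd=8: Connection timed out"
    )
    print(parse_location(warning))
```

Neither the prefix nor the message contains the remote address or port, so this warning alone cannot tell you which endpoint the engine was trying to reach.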

songsong124 commented 8 months ago

[image] My current topology is shown in the diagram above. How do I check the alter user / create user setting you mentioned?

jingshi-ant commented 8 months ago

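[image] alter user sets the endpoint. Engines A and B send messages to each other through this endpoint, and scdb also uses it to access each engine. You can find the configured endpoint in the scdb.user table in MySQL; the URL of each engine also appears in the scdb log.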

jingshi-ant commented 8 months ago

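From your proxy diagram, when engine A accesses engine B the address is 20.1.1.2, but SCDB actually believes engine B's address is 10.1.1. That address cannot be reached from engine A, which would explain the failure?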

songsong124 commented 8 months ago

It may well be the situation you describe. How can I verify it, or fix the problem?

jingshi-ant commented 8 months ago

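In scdb mode, each engine has only one address (configured with alter user), and that address must be reachable from every machine. Engine A only knows the engine B address that SCDB passes down to it and has no way of learning the proxy address. To verify, check whether the engine B address in the scdb log is an intranet address that cannot be reached from outside the domain.

Possible solutions:
1. Deploy scdb on another machine, so that the engine addresses are no longer inconsistent;
2. Or use p2p mode, which lets you configure both an intra-domain and an inter-domain address for an engine.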
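As a rough aid for the verification step, the check "is this an intranet address" can be approximated with Python's standard ipaddress module. This is a heuristic sketch, not an SCQL feature: the helper name is hypothetical, is_private only reflects reserved private ranges such as 10.0.0.0/8, and the real criterion is whether the peer engine can actually reach the address.

```python
import ipaddress


def is_private_endpoint(endpoint: str) -> bool:
    """Return True if the IP part of an "ip:port" endpoint lies in a private range."""
    host, _, _port = endpoint.rpartition(":")
    # Raises ValueError for hostnames; this sketch only handles IPv4 literals.
    return ipaddress.ip_address(host).is_private


if __name__ == "__main__":
    # Addresses of the kind discussed in this thread.
    for endpoint in ("10.1.1.2:8022", "20.1.1.2:8022"):
        print(endpoint, is_private_endpoint(endpoint))
```

If the endpoint stored for an engine is in a private range while the peer sits in a different network, the peer will most likely be unable to connect to it, which matches the "connect to mesh failed" symptom.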

songsong124 commented 8 months ago

B (scdb), running the SQL: select a.id from alice_node_test as a join bob_node_test as b on a.id=b.id; Full log:

2024-02-23 11:03:47.22311 DEBUG app.go:211 [Translator] plan in one string: Join{DataScan(a)->DataScan(b)}([eq(default.alice_node_test.id, default.bob_node_test.id)],)->Projection([default.alice_node_test.id]) 2024-02-23 11:03:47.22311 DEBUG app.go:212 [Translator] logical plan: digraph G { 1 [label="DataScan(a)"] 2 [label="DataScan(b)"] 3 [label="inner join([eq(default.alice_node_test.id, default.bob_node_test.id)],)"] 4 [label="Projection([default.alice_node_test.id])"] 3 -> 1 3 -> 2 4 -> 3 }

2024-02-23 11:03:47.22311 DEBUG app.go:278 CCL: [party_code:"bob_test" visibility:PLAINTEXT database_name:"default" table_name:"alice_node_test" column_name:"age" party_code:"bob_test" visibility:PLAINTEXT database_name:"default" table_name:"alice_node_test" column_name:"height" party_code:"bob_test" visibility:PLAINTEXT database_name:"default" table_name:"alice_node_test" column_name:"id" party_code:"bob_test" visibility:PLAINTEXT database_name:"default" table_name:"alice_node_test" column_name:"income" party_code:"bob_test" visibility:PLAINTEXT database_name:"default" table_name:"alice_node_test" column_name:"name" party_code:"bob_test" visibility:PLAINTEXT_AFTER_JOIN database_name:"default" table_name:"bob_node_test" column_name:"age" party_code:"bob_test" visibility:PLAINTEXT_AFTER_JOIN database_name:"default" table_name:"bob_node_test" column_name:"height" party_code:"bob_test" visibility:PLAINTEXT_AFTER_JOIN database_name:"default" table_name:"bob_node_test" column_name:"id" party_code:"bob_test" visibility:PLAINTEXT_AFTER_JOIN database_name:"default" table_name:"bob_node_test" column_name:"income" party_code:"bob_test" visibility:PLAINTEXT database_name:"default" table_name:"bob_node_test" column_name:"name" party_code:"alice_node" visibility:PLAINTEXT_AFTER_JOIN database_name:"default" table_name:"alice_node_test" column_name:"age" party_code:"alice_node" visibility:PLAINTEXT_AFTER_JOIN database_name:"default" table_name:"alice_node_test" column_name:"height" party_code:"alice_node" visibility:PLAINTEXT_AFTER_JOIN database_name:"default" table_name:"alice_node_test" column_name:"id" party_code:"alice_node" visibility:PLAINTEXT_AFTER_JOIN database_name:"default" table_name:"alice_node_test" column_name:"income" party_code:"alice_node" visibility:PLAINTEXT_AFTER_JOIN database_name:"default" table_name:"alice_node_test" column_name:"name" party_code:"alice_node" visibility:PLAINTEXT database_name:"default" table_name:"bob_node_test" column_name:"age" 
party_code:"alice_node" visibility:PLAINTEXT database_name:"default" table_name:"bob_node_test" column_name:"height" party_code:"alice_node" visibility:PLAINTEXT database_name:"default" table_name:"bob_node_test" column_name:"id" party_code:"alice_node" visibility:PLAINTEXT database_name:"default" table_name:"bob_node_test" column_name:"income" party_code:"alice_node" visibility:PLAINTEXT database_name:"default" table_name:"bob_node_test" column_name:"name"] 2024-02-23 11:03:47.22311 INFO app.go:167 [Translator] execution plan: digraph G { 0 [label="runsql:{in:[],out:[Out:{t_0,},],attr:[sql:select id from fate_flow.alice_node,table_refs:[fate_flow.alice_node],],url:[10.1.1.2:8022,]}"] 1 [label="runsql:{in:[],out:[Out:{t_1,},],attr:[sql:select id from fate_flow.bob_node,table_refs:[fate_flow.bob_node],],url:[10.1.1.1:8022,]}"] 2 [label="join:{in:[Left:{t_0,},Right:{t_1,},],out:[LeftJoinIndex:{t_2,},RightJoinIndex:{t_3,},],attr:[input_party_codes:[bob_test alice_node],join_type:0,],url:[10.1.1.2:8022,10.1.1.1:8022,]}"] 3 [label="filter_by_index:{in:[Data:{t_0,},RowsIndexFilter:{t_2,},],out:[Out:{t_4,},],attr:[],url:[10.1.1.2:8022,]}"] 4 [label="filter_by_index:{in:[Data:{t_1,},RowsIndexFilter:{t_3,},],out:[Out:{t_5,},],attr:[],url:[10.1.1.1:8022,]}"] 5 [label="copy:{in:[In:{t_4,},],out:[Out:{t_6,},],attr:[input_party_codes:bob_test,output_party_codes:alice_node,],url:[10.1.1.2:8022,10.1.1.1:8022,]}"] 6 [label="publish:{in:[In:{t_6,},],out:[Out:{t_7,},],attr:[],url:[10.1.1.1:8022,]}"] 0 -> 2 [label = "t_0:{id:PRIVATE:INT64}"] 0 -> 3 [label = "t_0:{id:PRIVATE:INT64}"] 1 -> 2 [label = "t_1:{id:PRIVATE:INT64}"] 1 -> 4 [label = "t_1:{id:PRIVATE:INT64}"] 2 -> 3 [label = "t_2:{id:PRIVATE:INT64}"] 2 -> 4 [label = "t_3:{id:PRIVATE:INT64}"] 3 -> 5 [label = "t_4:{id:PRIVATE:INT64}"] 5 -> 6 [label = "t_6:{id:PRIVATE:INT64}"] }

2024-02-23 11:03:57.22311 INFO sync_executor.go:143 |RequestID:|SessionID:29bdcedc-d1f8-11ee-9e1e-0242ac130002|ActionName:Executor@RunExecutionPlan|CostTime:10.00276133s|Reason:|ErrorMsg:|Request:{"session_params":{"party_code":"bob_test","parties":[{"code":"alice_node","name":"alice_node","host":"10.1.1.1:8022"},{"code":"bob_test","name":"bob_test","host":"10.1.1.2:8022","rank":1}],"session_id":"29bdcedc-d1f8-11ee-9e1e-0242ac130002","spu_runtime_cfg":{"protocol":"SEMI2K","field":"FM64","public_random_seed":"7055448383781434746","sigmoid_mode":"SIGMOID_REAL","ttp_beaver_config":{}}},"nodes":{"0":{"node_name":"runsql.0","op_type":"RunSQL","outputs":{"Out":{"tensors":[{"name":"default.alice_node_test.id.0","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]}},"attributes":{"sql":{"t":{"name":".0","elem_type":"STRING","ss":{"ss":["select id from fate_flow.alice_node"]}}},"table_refs":{"t":{"name":".0","shape":{"dim":[{"dim_value":"1"}]},"elem_type":"STRING","ss":{"ss":["fate_flow.alice_node"]}}}}},"2":{"node_name":"join.2","op_type":"Join","inputs":{"Left":{"tensors":[{"name":"default.alice_node_test.id.0","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]},"Right":{"tensors":[{"name":"default.bob_node_test.id.1","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]}},"outputs":{"LeftJoinIndex":{"tensors":[{"name":"default.alice_node_test.id.2","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]},"RightJoinIndex":{"tensors":[{"name":"default.bob_node_test.id.3","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]}},"attributes":{"input_party_codes":{"t":{"name":".0","shape":{"dim":[{"dim_value":"2"}]},"elem_type":"STRING","ss":{"ss":["bob_test","alice_node"]}}},"join_type":{"t":{"name":".0","elem_type":"INT64","i64s":{"i64s":["0"]}}}}},"3":{"node_name":"filter_by_index.3","op_type":"F
ilterByIndex","inputs":{"Data":{"tensors":[{"name":"default.alice_node_test.id.0","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]},"RowsIndexFilter":{"tensors":[{"name":"default.alice_node_test.id.2","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]}},"outputs":{"Out":{"tensors":[{"name":"default.alice_node_test.id.4","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]}}},"5":{"node_name":"copy.5","op_type":"Copy","inputs":{"In":{"tensors":[{"name":"default.alice_node_test.id.4","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]}},"outputs":{"Out":{"tensors":[{"name":"default.alice_node_test.id.6","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]}},"attributes":{"input_party_codes":{"t":{"name":".0","elem_type":"STRING","ss":{"ss":["bob_test"]}}},"output_party_codes":{"t":{"name":".0","elem_type":"STRING","ss":{"ss":["alice_node"]}}}}}},"policy":{"worker_num":1,"subdags":[{"jobs":[{"node_ids":["0"]}]},{},{"jobs":[{"node_ids":["2"]}],"need_call_barrier_after_jobs":true},{"jobs":[{"node_ids":["3"]}]},{},{"jobs":[{"node_ids":["5"]}],"need_call_barrier_after_jobs":true},{}]}}|PartyCode:bob_test|Url:http://10.1.1.2:8022/SCQLEngineService/RunExecutionPlan 2024-02-23 11:03:57.22311 ERROR sync_executor.go:89 |RequestID:|SessionID:29bdcedc-d1f8-11ee-9e1e-0242ac130002|ActionName:EngineStub@RunExecutionPlan|CostTime:10.003381025s|Reason:InvalidResponse|ErrorMsg:Error: code=320, msg="RunExecutionPlan create session(29bdcedc-d1f8-11ee-9e1e-0242ac130002) failed, catch std::exception=[external/yacl/yacl/link/context.cc:140] connect to mesh failed, failed to setup connection to rank=0 "|Request: 2024-02-23 11:03:57.22311 ERROR submit_and_get_handler.go:65 
|RequestID:1234|SessionID:29bdcedc-d1f8-11ee-9e1e-0242ac130002|ActionName:SCDBSubmitAndGetHandler@/public/submit_and_get|CostTime:10.010733871s|Reason:InvalidRequest|ErrorMsg:RunExecutionPlan create session(29bdcedc-d1f8-11ee-9e1e-0242ac130002) failed, catch std::exception=[external/yacl/yacl/link/context.cc:140] connect to mesh failed, failed to setup connection to rank=0 |Request:user:{user:{account_system_type:NATIVE_USER native_user:{name:"alice_node"}}} query:"select a.id from alice_node_test as a join bob_node_test as b on a.id=b.id;" biz_request_id:"1234" db_name:"default"|ClientIP:172.19.0.1 2024-02-23 11:03:57.22311 INFO server.go:146 |GIN|status=200|method=POST|path=/public/submit_and_get|ip=172.19.0.1|latency=10.010783741s| 2024-02-23 11:04:17.22311 INFO sync_executor.go:143 |RequestID:|SessionID:29bdcedc-d1f8-11ee-9e1e-0242ac130002|ActionName:Executor@RunExecutionPlan|CostTime:30.002010816s|Reason:|ErrorMsg:|Request:{"session_params":{"party_code":"alice_node","parties":[{"code":"alice_node","name":"alice_node","host":"10.1.1.1:8022"},{"code":"bob_test","name":"bob_test","host":"10.1.1.2:8022","rank":1}],"session_id":"29bdcedc-d1f8-11ee-9e1e-0242ac130002","spu_runtime_cfg":{"protocol":"SEMI2K","field":"FM64","public_random_seed":"7055448383781434746","sigmoid_mode":"SIGMOID_REAL","ttp_beaver_config":{}}},"nodes":{"1":{"node_name":"runsql.1","op_type":"RunSQL","outputs":{"Out":{"tensors":[{"name":"default.bob_node_test.id.1","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]}},"attributes":{"sql":{"t":{"name":".0","elem_type":"STRING","ss":{"ss":["select id from 
fate_flow.bob_node"]}}},"table_refs":{"t":{"name":".0","shape":{"dim":[{"dim_value":"1"}]},"elem_type":"STRING","ss":{"ss":["fate_flow.bob_node"]}}}}},"2":{"node_name":"join.2","op_type":"Join","inputs":{"Left":{"tensors":[{"name":"default.alice_node_test.id.0","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]},"Right":{"tensors":[{"name":"default.bob_node_test.id.1","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]}},"outputs":{"LeftJoinIndex":{"tensors":[{"name":"default.alice_node_test.id.2","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]},"RightJoinIndex":{"tensors":[{"name":"default.bob_node_test.id.3","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]}},"attributes":{"input_party_codes":{"t":{"name":".0","shape":{"dim":[{"dim_value":"2"}]},"elem_type":"STRING","ss":{"ss":["bob_test","alice_node"]}}},"join_type":{"t":{"name":".0","elem_type":"INT64","i64s":{"i64s":["0"]}}}}},"4":{"node_name":"filter_by_index.4","op_type":"FilterByIndex","inputs":{"Data":{"tensors":[{"name":"default.bob_node_test.id.1","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]},"RowsIndexFilter":{"tensors":[{"name":"default.bob_node_test.id.3","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]}},"outputs":{"Out":{"tensors":[{"name":"default.bob_node_test.id.5","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]}}},"5":{"node_name":"copy.5","op_type":"Copy","inputs":{"In":{"tensors":[{"name":"default.alice_node_test.id.4","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]}},"outputs":{"Out":{"tensors":[{"name":"default.alice_node_test.id.6","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]}},"attributes":{"input_party_codes":{"t":{"name":
".0","elem_type":"STRING","ss":{"ss":["bob_test"]}}},"output_party_codes":{"t":{"name":".0","elem_type":"STRING","ss":{"ss":["alice_node"]}}}}},"6":{"node_name":"publish.6","op_type":"Publish","inputs":{"In":{"tensors":[{"name":"default.alice_node_test.id.6","elem_type":"INT64","option":"REFERENCE","annotation":{"status":"TENSORSTATUS_PRIVATE"}}]}},"outputs":{"Out":{"tensors":[{"name":"id.7","elem_type":"INT64"}]}}}},"policy":{"worker_num":1,"subdags":[{},{"jobs":[{"node_ids":["1"]}]},{"jobs":[{"node_ids":["2"]}],"need_call_barrier_after_jobs":true},{},{"jobs":[{"node_ids":["4"]}]},{"jobs":[{"node_ids":["5"]}],"need_call_barrier_after_jobs":true},{"jobs":[{"node_ids":["6"]}]}]}}|PartyCode:alice_node|Url:http://10.1.1.1:8022/SCQLEngineService/RunExecutionPlan
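The engine addresses that scdb hands to each engine are buried in the session_params part of the log above. A small sketch for pulling them out is shown below; the function name party_hosts is hypothetical, SAMPLE is an abridged excerpt of the log line rather than the full entry, and the pattern assumes the parties array contains no nested arrays, which holds for this log.

```python
import json
import re

# Captures the JSON array that follows "parties": (non-greedy, up to the first "]").
PARTIES_RE = re.compile(r'"parties":(\[.*?\])')

# Abridged excerpt of the Executor@RunExecutionPlan log entry in this thread.
SAMPLE = (
    '|Request:{"session_params":{"party_code":"bob_test","parties":['
    '{"code":"alice_node","name":"alice_node","host":"10.1.1.1:8022"},'
    '{"code":"bob_test","name":"bob_test","host":"10.1.1.2:8022","rank":1}],'
    '"session_id":"29bdcedc-d1f8-11ee-9e1e-0242ac130002"}}'
)


def party_hosts(log_line: str) -> dict[str, str]:
    """Map each party code to the host:port that scdb sent to the engines."""
    match = PARTIES_RE.search(log_line)
    if match is None:
        return {}
    return {party["code"]: party["host"] for party in json.loads(match.group(1))}


if __name__ == "__main__":
    print(party_hosts(SAMPLE))
```

These hosts are what each engine uses to reach its peer, so they are the addresses whose reachability needs to be checked from the other machine.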

songsong124 commented 8 months ago

My topology is as follows: [image] Even if scdb is deployed on another machine, the problem would still exist, wouldn't it? My engines cannot connect to each other directly.

jingshi-ant commented 8 months ago

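You are right. With a proxy, the centralized scdb mode cannot work, because the proxy address that scdb can reach is not reachable from the other machines.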

songsong124 commented 8 months ago

OK, thanks.

songsong124 commented 8 months ago

Can p2p mode support the topology in my diagram above? Where are the intra-domain and inter-domain addresses configured?

jingshi-ant commented 8 months ago

https://www.secretflow.org.cn/docs/scql/0.5.0b2/en-US/reference/p2p-deploy-config [image] [image]

songsong124 commented 8 months ago

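A (brokerctl) can grant permissions on both A's and B's tables; granting permissions on B's tables only requires connecting to B's broker. Doesn't that mean that anyone who has brokerctl installed effectively has full data operation permissions on both A and B?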

tongke6 commented 8 months ago

@songsong124

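The broker has two ports. One is the intra port, which is accessible only within the domain (your own organization). The other is the inter port, which is open to peers.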

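Access control is achieved through port isolation. It is easy to put a gateway in front that exposes only the inter port to the outside. That way, A cannot access B's intra service, and likewise B cannot access A's intra service.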

songsong124 commented 8 months ago

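intra: default port 8080; does it need to be opened externally when granting permissions? [inter: default 8081, engine: 8003; do these need to be opened externally when running queries?] Is that the right way to understand it?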

tongke6 commented 8 months ago

What do you mean by "opened externally" here?

songsong124 commented 8 months ago

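For A to grant permissions on B's tables, A needs to be able to access B's port 8080. If you do not want A to be able to grant, block A's access to B's port 8080. This can be done with iptables, or by restricting the proxied ports.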

tongke6 commented 8 months ago

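In general, the intra port should only be accessible internally; please do not open it to external organizations. Both the inter port and the engine port can be opened to the other party.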
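The port-isolation rule described in this exchange can be summarized as a small allow-list. The sketch below is purely illustrative and is not SCQL code: the service names and the gateway_allows function are hypothetical, and the port numbers (8080, 8081, 8003) are the ones quoted in the question above, so check them against your own deployment configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    port: int
    peer_visible: bool  # True if the other organization may reach this port


# Port numbers as quoted in this thread; treat them as assumptions.
SERVICES = [
    Service("broker-intra", 8080, peer_visible=False),
    Service("broker-inter", 8081, peer_visible=True),
    Service("engine", 8003, peer_visible=True),
]


def gateway_allows(port: int, from_peer: bool) -> bool:
    """Decide whether a front gateway should forward traffic to the given port."""
    for service in SERVICES:
        if service.port == port:
            # Internal callers reach everything; peers only reach peer-visible ports.
            return service.peer_visible or not from_peer
    return False  # Unknown ports are closed by default.
```

With this rule, gateway_allows(8080, from_peer=True) is False, so a peer's brokerctl cannot reach the intra API that is used for granting permissions, while the inter and engine ports stay reachable for queries.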

songsong124 commented 8 months ago

OK, thanks.