secretflow / scql

SCQL (Secure Collaborative Query Language) is a system that allows multiple distrusting parties to run joint analysis without revealing their private data.
https://www.secretflow.org.cn/docs/scql/en/
Apache License 2.0
125 stars 45 forks source link

scql给postgresql数据源创建表报错表找不到 #372

Open oaHeZgnoS opened 8 hours ago

oaHeZgnoS commented 8 hours ago

Issue Type

Running

Have you searched for existing issues?

No

OS Platform and Distribution

Linux fatenode2 6.8.7-1.el7.elrepo.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Apr 17 15:04:09 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux

SCQL Version

0.5.0b2

What happend and What you expected to happen.

在为postgresql数据源createTable时,ref_table传“数据库.表名”或“schema.表名”,都可以建表成功,但最后到执行sql时,报错表找不到:

// public.user_credit 这个public是postgresql数据库名
{"status":{"code":300, "message":"Error: code=300, msg=\"runQuery Execute err: Error: code=320, msg=\"RunExecutionPlan run jobs(b96e2050-8a01-11ef-8e51-0242ac140017) failed, catch std::exception=[engine/datasource/odbc_adaptor.cc:56] catch unexpected Poco::Data::DataException: MySQL: [MySQL]: [Comment]: mysql_stmt_prepare error\t[mysql_stmt_error]: Table 'public.user_credit' doesn't exist\t[mysql_stmt_errno]: 1146\t[mysql_stmt_sqlstate]: 42S02\t[statemnt]: select user_credit.credit_rank from public.user_credit \"\"", "details":[]}, "affected_rows":"0", "warnings":[], "cost_time_s":0, "out_columns":[]}

// postgres.user_credit 这个postgres是postgresql的schema名
{"status":{"code":300, "message":"Error: code=300, msg=\"runQuery Execute err: Error: code=320, msg=\"RunExecutionPlan run jobs(84edb27d-8a02-11ef-8e51-0242ac140017) failed, catch std::exception=[engine/datasource/odbc_adaptor.cc:56] catch unexpected Poco::Data::DataException: MySQL: [MySQL]: [Comment]: mysql_stmt_prepare error\t[mysql_stmt_error]: Table 'postgres.user_credit' doesn't exist\t[mysql_stmt_errno]: 1146\t[mysql_stmt_sqlstate]: 42S02\t[statemnt]: select user_credit.credit_rank from postgres.user_credit \"\"", "details":[]}, "affected_rows":"0", "warnings":[], "cost_time_s":0, "out_columns":[]}

Configuration used to run SCQL.

默认配置

SCQL log output.

# alice.log

2024-10-14 08:16:29.10148 INFO session.go:235 add endpoint 192.168.1.242:8003 for party 9999
2024-10-14 08:16:29.10148 INFO query_handler.go:66 create session 9d2d79ad-8a04-11ef-8e51-0242ac140017 with query 'SELECT alice.credit_rank FROM alice;' in project erhnigoengewuiigewnierlgehggwe
2024-10-14 08:16:29.10148 INFO query_handler.go:230 create query runner for job 9d2d79ad-8a04-11ef-8e51-0242ac140017
2024-10-14 08:16:29.10148 INFO query_handler.go:235 get source tables [{dbName: tableName:alice dbType:0}] in project erhnigoengewuiigewnierlgehggwe from storage
2024-10-14 08:16:29.10148 INFO query_handler.go:243 work parties: [9999]; data parties: [9999] for job 9d2d79ad-8a04-11ef-8e51-0242ac140017
2024-10-14 08:16:29.10148 INFO checksum.go:122 save local checksum table schema: [92 84 12 175 15 115 93 190 142 1];ccl: [45 103 158 181 110 149 197 96 23 75] for party 9999
2024-10-14 08:16:29.10148 INFO query_handler.go:285 distribute query completed for job 9d2d79ad-8a04-11ef-8e51-0242ac140017
2024-10-14 08:16:29.10148 INFO query_runner.go:458 Execution Plan:
digraph G {
0 [label="runsql:{in:[],out:[Out:{t_0,},],attr:[sql:select user_credit.credit_rank from postgres.user_credit,table_refs:[postgres.user_credit],],party:[9999,]}"]
1 [label="publish:{in:[In:{t_0,},],out:[Out:{t_1,},],attr:[],party:[9999,]}"]
0 -> 1 [label = "t_0:{credit_rank:PRIVATE:INT64}"]
}

2024-10-14 08:16:29.10148 INFO executor.go:159 |RequestID:|SessionID:9d2d79ad-8a04-11ef-8e51-0242ac140017|ActionName:Executor@RunExecutionPlan|CostTime:23.439299ms|Reason:|ErrorMsg:|Request:{"session_params":{"party_code":"9999", "parties":[{"code":"9999", "name":"9999", "host":"192.168.1.242:8003", "public_key":"MCowBQYDK2VwAyEA0aHce4FDCCQQIoIWZ6g9N5NdaP7Z3/2W2oLynrPbWEc="}], "session_id":"9d2d79ad-8a04-11ef-8e51-0242ac140017", "spu_runtime_cfg":{"protocol":"SEMI2K", "field":"FM64"}}, "graph":{"nodes":{"0":{"node_name":"runsql.0", "op_type":"RunSQL", "outputs":{"Out":{"tensors":[{"name":"erhnigoengewuiigewnierlgehggwe.alice.credit_rank.0", "elem_type":"INT64", "option":"REFERENCE", "annotation":{"status":"TENSORSTATUS_PRIVATE"}, "ref_num":1}]}}, "attributes":{"sql":{"t":{"name":".0", "elem_type":"STRING", "string_data":["select user_credit.credit_rank from postgres.user_credit"]}}, "table_refs":{"t":{"name":".0", "shape":{"dim":[{"dim_value":"1"}]}, "elem_type":"STRING", "string_data":["postgres.user_credit"]}}}}, "1":{"node_name":"publish.1", "op_type":"Publish", "inputs":{"In":{"tensors":[{"name":"erhnigoengewuiigewnierlgehggwe.alice.credit_rank.0", "elem_type":"INT64", "option":"REFERENCE", "annotation":{"status":"TENSORSTATUS_PRIVATE"}, "ref_num":1}]}}, "outputs":{"Out":{"tensors":[{"name":"credit_rank.1", "elem_type":"STRING", "annotation":{"status":"TENSORSTATUS_PRIVATE"}, "string_data":["credit_rank"]}]}}}}, "policy":{"worker_num":1, "subdags":[{"jobs":[{"node_ids":["0"]}]}, {"jobs":[{"node_ids":["1"]}]}]}}, "callback_url":"http://192.168.1.242:8011/intra/cb/engine"}|PartyCode:9999|Url:http://192.168.1.242:8003/SCQLEngineService/RunExecutionPlan
2024-10-14 08:16:29.10148 ERROR executor.go:98 |RequestID:|SessionID:9d2d79ad-8a04-11ef-8e51-0242ac140017|ActionName:EngineStub@RunExecutionPlan|CostTime:23.996098ms|Reason:InvalidResponse|ErrorMsg:Error: code=320, msg="RunExecutionPlan run jobs(9d2d79ad-8a04-11ef-8e51-0242ac140017) failed, catch std::exception=[engine/datasource/odbc_adaptor.cc:56] catch unexpected Poco::Data::DataException: MySQL: [MySQL]: [Comment]: mysql_stmt_prepare error     [mysql_stmt_error]: Table 'postgres.user_credit' doesn't exist  [mysql_stmt_errno]: 1146        [mysql_stmt_sqlstate]: 42S02      [statemnt]: select user_credit.credit_rank from postgres.user_credit "|Request:
2024-10-14 08:16:29.10148 ERROR common.go:307 |RequestID:|RequestParty:|SessionID:|ActionName:Intra@DoQuery|CostTime:31.614825ms|Reason:|ErrorMsg:Error: code=300, msg="runQuery Execute err: Error: code=320, msg="RunExecutionPlan run jobs(9d2d79ad-8a04-11ef-8e51-0242ac140017) failed, catch std::exception=[engine/datasource/odbc_adaptor.cc:56] catch unexpected Poco::Data::DataException: MySQL: [MySQL]: [Comment]: mysql_stmt_prepare error     [mysql_stmt_error]: Table 'postgres.user_credit' doesn't exist  [mysql_stmt_errno]: 1146        [mysql_stmt_sqlstate]: 42S02      [statemnt]: select user_credit.credit_rank from postgres.user_credit ""|Request:project_id:"erhnigoengewuiigewnierlgehggwe"  query:"SELECT alice.credit_rank FROM alice;"
2024-10-14 08:16:29.10148 INFO server.go:124 |GIN|status=200|method=POST|path=/intra/query|ip=192.168.2.42|latency=31.694208ms|
tongke6 commented 7 hours ago

@oaHeZgnoS postgres 比较特殊,它的 full table qualifier 为 db_name.schema_name.table_name ,为了能够访问到,可以采用如下做法:

  1. postgres 的连接串里指定 dbname
  2. create table 时的 ref table 填写成 schema_name.table_name 的形式
tongke6 commented 7 hours ago

duplicate with https://github.com/secretflow/scql/issues/301

oaHeZgnoS commented 7 hours ago

一句惊醒梦中人。engine里会配数据源的地址,核对了下,只有mysql,没有postgresql: { "datasources": [ { "id": "ds001", "name": "u1000", "kind": "MYSQL", "connection_str": "db=secure_net_temporary_db;user=root;password=fatE168dev;host=192.168.1.242;auto-reconnect=true" } ], "rules": [ { "db": "*", "table": "*", "datasource_id": "ds001" } ] }

如果postgresql连接是192.168.2.195:5432,用户名postgres,密码123456,那么,所以是不是要改成: { "datasources": [ { "id": "ds001", "name": "mysql db", "kind": "MYSQL", "connection_str": "db=secure_net_temporary_db;user=root;password=fatE168dev;host=192.168.1.242;auto-reconnect=true" },{ "id": "ds002", "name": "postgres db", "kind": "POSTGRESQL", "connection_str": "db=postgres;user=postgres;password=123456;host=192.168.2.195;auto-reconnect=true" } ], "rules": [ { "db": "*", "table": "*", "datasource_id": "ds001" },{ "db": "*", "table": "*", "datasource_id": "ds002" } ] }

tongke6 commented 6 hours ago

官网配置文档里有说明,请阅读对应的文档哈。上面关联的 issue 里也有

oaHeZgnoS commented 6 hours ago

--embed_router_conf={"datasources":[{"id":"ds001","name":"u1000","kind":"MYSQL","connection_str":"db=secure_net_temporary_db;user=root;password=fatE168dev;host=192.168.1.241;auto-reconnect=true"},{"id":"ds002","name":"u1000_2","kind":"POSTGRESQL","connection_str":"db=postgres;user=postgres;password=123456;host=192.168.2.195;auto-reconnect=true"}],"rules":[{"db":"","table":"","datasource_id":"ds001"},{"db":"","table":"","datasource_id":"ds002"}]}

这样改配置以后,重启engine报错: 2024-10-14 09:29:22.949 [info] [main.cc:main:346] [scqlengine] Adding MuxReceiverService into brpc server 2024-10-14 09:29:22.951 [info] [main.cc:BuildRouter:177] [scqlengine] Building EmbedRouter from json conf 2024-10-14 09:29:22.989 [error] [main.cc:main:364] [scqlengine] Fail to build engine service, msg=[engine/datasource/embed_router.cc:125] Fail to set default route rule: exists already. 2024-10-14 09:29:23.400 [info] [main.cc:main:346] [scqlengine] Adding MuxReceiverService into brpc server 2024-10-14 09:29:23.403 [info] [main.cc:BuildRouter:177] [scqlengine] Building EmbedRouter from json conf 2024-10-14 09:29:23.440 [error] [main.cc:main:364] [scqlengine] Fail to build engine service, msg=[engine/datasource/embed_router.cc:125] Fail to set default route rule: exists already. 2024-10-14 09:29:23.972 [info] [main.cc:main:346] [scqlengine] Adding MuxReceiverService into brpc server 2024-10-14 09:29:23.975 [info] [main.cc:BuildRouter:177] [scqlengine] Building EmbedRouter from json conf 2024-10-14 09:29:24.015 [error] [main.cc:main:364] [scqlengine] Fail to build engine service, msg=[engine/datasource/embed_router.cc:125] Fail to set default route rule: exists already. 2024-10-14 09:29:24.732 [info] [main.cc:main:346] [scqlengine] Adding MuxReceiverService into brpc server 2024-10-14 09:29:24.733 [info] [main.cc:BuildRouter:177] [scqlengine] Building EmbedRouter from json conf 2024-10-14 09:29:24.771 [error] [main.cc:main:364] [scqlengine] Fail to build engine service, msg=[engine/datasource/embed_router.cc:125] Fail to set default route rule: exists already. 2024-10-14 09:29:25.891 [info] [main.cc:main:346] [scqlengine] Adding MuxReceiverService into brpc server 2024-10-14 09:29:25.894 [info] [main.cc:BuildRouter:177] [scqlengine] Building EmbedRouter from json conf 2024-10-14 09:29:25.933 [error] [main.cc:main:364] [scqlengine] Fail to build engine service, msg=[engine/datasource/embed_router.cc:125] Fail to set default route rule: exists already. 2024-10-14 09:29:27.851 [info] [main.cc:main:346] [scqlengine] Adding MuxReceiverService into brpc server 2024-10-14 09:29:27.853 [info] [main.cc:BuildRouter:177] [scqlengine] Building EmbedRouter from json conf 2024-10-14 09:29:27.890 [error] [main.cc:main:364] [scqlengine] Fail to build engine service, msg=[engine/datasource/embed_router.cc:125] Fail to set default route rule: exists already. 2024-10-14 09:29:31.423 [info] [main.cc:main:346] [scqlengine] Adding MuxReceiverService into brpc server 2024-10-14 09:29:31.426 [info] [main.cc:BuildRouter:177] [scqlengine] Building EmbedRouter from json conf 2024-10-14 09:29:31.468 [error] [main.cc:main:364] [scqlengine] Fail to build engine service, msg=[engine/datasource/embed_router.cc:125] Fail to set default route rule: exists already.

请问可能还是哪里不太对呢,是否可能因为只能有一份rule?

oaHeZgnoS commented 6 hours ago

试过了 只配置postgresql的是可以成功查询(空格分隔)。设置mysql和postgresql就失败: --embed_router_conf={"datasources":[{"id":"ds001","name":"u1000","kind":"MYSQL","connection_str":"db=secure_net_temporary_db;user=root;password=fatE168dev;host=192.168.1.242;auto-reconnect=true"},{"id":"ds002","name":"u1000_2","kind":"POSTGRESQL","connection_str":"dbname=postgres user=postgres password=123456 host=192.168.2.195 port=5432"}],"rules":[{"db":"","table":"","datasource_id":"ds001"},{"db":"","table":"","datasource_id":"ds002"}]}

推测是不支持双数据源?