secretflow / scql

SCQL (Secure Collaborative Query Language) is a system that allows multiple distrusting parties to run joint analysis without revealing their private data.
https://www.secretflow.org.cn/docs/scql/en/
Apache License 2.0
126 stars 46 forks source link

SCQL中心化部署中 如何导入外部数据库 #355

Open sabakioukenasai opened 1 month ago

sabakioukenasai commented 1 month ago

我按照官网的 基于中心化部署的快速开始 在单机上部署的SCQL系统。 注意到该手册的步骤中有一步是由alice创建了表格ta,指令如下:

CREATE TABLE demo.ta (ID string, credit_rank int, income int, age int) REF_TABLE=alice.user_credit DB_TYPE='mysql';

通过查看文件我发现 mysql/initdb/ 目录下的 alice_init.sql文件,该文件创建了一个表格 alice.user_credit 并且在该表格中插入了一些数据。上述指令在创建表格 demo.ta 时使用 REF_TABLE 将这两个表格关联到了一起。

我希望在SCQL中进行大数据集测试,而现有的 demo.tademo.tb 不能满足我的需求,因此我在想如何能够将外部数据库导入到SCQL中。 我选取的数据库是 https://github.com/datacharmer/test_db.git,在该数据库的根目录下我找到了 employees.sql 配置文件和储存数据的 .dump 文件,并将他们移动到mysql/initdb/ 目录下,如下图所示: image

其中 employees.sql 文件的部分内容如下:

DROP DATABASE IF EXISTS `employees`;
CREATE DATABASE IF NOT EXISTS `employees`;
USE `employees`;

SELECT 'CREATING DATABASE STRUCTURE' as 'INFO';

DROP TABLE IF EXISTS dept_emp,
                     dept_manager,
                     titles,
                     salaries, 
                     employees, 
                     departments;

CREATE TABLE departments (
    dept_no     CHAR(4)         NOT NULL,
    dept_name   VARCHAR(40)     NOT NULL,
    PRIMARY KEY (dept_no),
    UNIQUE  KEY (dept_name)
);

# -- snip --
SELECT 'LOADING departments' as 'INFO';
source load_departments.dump ;

参考前文创建表格 demo.ta 语法格式,我切换到用户alice创建了表格 demo.departments

CREATE TABLE demo.departments (dept_no string, dept_name string) REF_TABLE=employees.departments DB_TYPE='mysql';

并且赋予了用户 alice 响应CCL权限

GRANT SELECT PLAINTEXT(dept_no, dept_name) on demo.departments TO alice

此时可以使用 DESCRIBE demo.departments 查看到该表信息

alice> DESCRIBE demo.departments
[fetch]
2 rows in set: (5.099977ms)
+-----------+--------+
|   Field   |  Type  |
+-----------+--------+
| dept_no   | string |
| dept_name | string |
+-----------+--------+

但是最后在使用alice对该表进行查询时却无法成功,它提示我找不到数据库 employees这里我不知道应该如何解决,不知道还有哪些细节没有做好呢

alice> SELECT * FROM demo.departments
[fetch]err: Code: 320, message:RunExecutionPlan run jobs(785aac04-7051-11ef-a51f-0242ac120003) failed, catch std::exception=[engine/datasource/odbc_adaptor.cc:44] catch unexpected Poco::Data::DataException: MySQL: [MySQL]: [Comment]: mysql_stmt_prepare error    [mysql_stmt_error]: Unknown database 'employees'        [mysql_stmt_errno]: 1049        [mysql_stmt_sqlstate]: 42000    [statemnt]: select departments.dept_name,departments.dept_no from employees.departments
zimu-yuxi commented 1 month ago

mysql中是否初始化了employees. departments

sabakioukenasai commented 1 month ago

mysql中是否初始化了employees. departments

请问mysql中初始化 employees.departments的指令是什么呢? 我看alice_init.sql文件里面是先创建了alice.user_credit这个表格,然后通过insert指令插入了若干条目,这样之后,后续创建的demo.ta就是可以之间关联上alice.user_credit的。 然后employees. departments的话也是在employees.sql这个文件里面创建了,但是因为这个表格比较大,所以是采用SOURCE导入的方式从.dump文件里面导入数据,不知道这是你所说的初始化吗

zimu-yuxi commented 1 month ago

先确认下你的mysql中是否有这个db和table

sabakioukenasai commented 1 month ago

我运行DESCIRBE databases;似乎只有demo这一个数据库。请问是这样看吗?

root> show databases;
[fetch]
1 rows in set: (3.381651ms)
+----------+
| Database |
+----------+
| demo     |
+----------+

但是我同样无法看到alice_init.sql中创建的 alice数据库 和其中的表格 user_credit

CREATE DATABASE IF NOT EXISTS `alice`;
USE `alice`;

DROP TABLE IF EXISTS `user_credit`;
CREATE TABLE `user_credit` (
        `ID` varchar(64) NOT NULL,
        `credit_rank` int NOT NULL,
        `income` int NOT NULL,
        `age` int NOT NULL,
        PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

LOCK TABLES `user_credit` WRITE;
INSERT INTO `user_credit` VALUES ("id0001", 6, 100000, 20);
INSERT INTO `user_credit` VALUES ("id0002", 5, 90000, 19);
INSERT INTO `user_credit` VALUES ("id0003", 6, 89700, 32);
INSERT INTO `user_credit` VALUES ("id0005", 6, 607000, 30);
INSERT INTO `user_credit` VALUES ("id0006", 5, 30070, 25);
INSERT INTO `user_credit` VALUES ("id0007", 6, 12070, 28);
INSERT INTO `user_credit` VALUES ("id0008", 6, 200800, 50);
INSERT INTO `user_credit` VALUES ("id0009", 6, 607000, 30);
INSERT INTO `user_credit` VALUES ("id0010", 5, 30070, 25);
INSERT INTO `user_credit` VALUES ("id0011", 5, 12070, 28);
INSERT INTO `user_credit` VALUES ("id0012", 6, 200800, 50);
INSERT INTO `user_credit` VALUES ("id0013", 5, 30070, 25);
INSERT INTO `user_credit` VALUES ("id0014", 5, 12070, 28);
INSERT INTO `user_credit` VALUES ("id0015", 6, 200800, 18);
INSERT INTO `user_credit` VALUES ("id0016", 5, 30070, 26);
INSERT INTO `user_credit` VALUES ("id0017", 5, 12070, 27);
INSERT INTO `user_credit` VALUES ("id0018", 6, 200800, 16);
INSERT INTO `user_credit` VALUES ("id0019", 6, 30070, 25);
INSERT INTO `user_credit` VALUES ("id0020", 5, 12070, 28);
UNLOCK TABLES;
zimu-yuxi commented 1 month ago

mysql中看,不是在scql中。应该起的有一个mysql容器

sabakioukenasai commented 1 month ago

麻烦请教一下,怎么在MySQL中看呢?有文档可以参考吗

sabakioukenasai commented 1 month ago

查询mysql容器的id,然后进入该容器

[wwh@gbase8c-72 ~]$ sudo docker ps -a
CONTAINER ID        IMAGE                 COMMAND                   CREATED             STATUS             PORTS            NAMES
b92dfa0e78ef        mysql:latest       "docker-entrypoint.s…"     12 days ago         Up 12 days (healthy)       3306/tcp, 33060/tcp        scdb-tutorial_mysql_1

但是在该mysql容器中好像没办法用root登陆,不知道要怎么在这个MySQL容器中查看数据库内容。

[wwh@gbase8c-72 ~]$ sudo docker exec -it b92dfa0e78ef bash
bash-5.1# mysql
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)
bash-5.1# mysql -p
Enter password:
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)
github-actions[bot] commented 6 days ago

Stale issue message. Please comment to remove stale tag. Otherwise this issue will be closed soon.

sabakioukenasai commented 5 days ago

目前的想法是使用本地mysql数据库,而不是mysql镜像。修改docker-compose.yml

version: '3.3'
services:
  scdb:
    image: secretflow/scql:latest
    command:
      - /home/admin/bin/scdbserver
      - -config=/home/admin/configs/config.yml
    restart: always
    ports:
      - mode: host
        protocol: tcp
        published: ${SCDB_PORT:-8080}
        target: 8080
    volumes:
      - ./scdb/conf/:/home/admin/configs/
  engine_alice:
    cap_add:
      - NET_ADMIN
    command:
      - /home/admin/bin/scqlengine
      - --flagfile=/home/admin/engine/conf/gflags.conf
    image: secretflow/scql:latest
    volumes:
      - ./engine/alice/conf/gflags.conf:/home/admin/engine/conf/gflags.conf
      - ./engine/alice/conf/ed25519key.pem:/home/admin/engine/conf/ed25519key.pem
      - ./engine/alice/conf/authorized_profile.json:/home/admin/engine/conf/authorized_profile.json
  engine_bob:
    cap_add:
      - NET_ADMIN
    command:
      - /home/admin/bin/scqlengine
      - --flagfile=/home/admin/engine/conf/gflags.conf
    image: secretflow/scql:latest
    volumes:
      - ./engine/bob/conf/gflags.conf:/home/admin/engine/conf/gflags.conf
      - ./engine/bob/conf/ed25519key.pem:/home/admin/engine/conf/ed25519key.pem
      - ./engine/bob/conf/authorized_profile.json:/home/admin/engine/conf/authorized_profile.json
  # mysql:
  #   image: mysql:latest
  #   environment:
  #     - MYSQL_ROOT_PASSWORD=gIr4h6BtVWEZ8
  #     - TZ=Asia/Shanghai
  #   healthcheck:
  #     retries: 10
  #     test:
  #       - CMD
  #       - mysqladmin
  #       - ping
  #       - -h
  #       - mysql
  #     timeout: 20s
  #   expose:
  #     - "3306"
  #   restart: always
  #   volumes:
  #     - ./mysql/initdb:/docker-entrypoint-initdb.d

但是发现注释掉mysql的部分之后,scdb容器无法成功启动,一直在restarting。

[wwh@gbase8c-72 scdb-tutorial]$ sudo docker ps -a | more
[sudo] password for wwh: 
CONTAINER ID        IMAGE                                                        COMMAND                  CREATED             STATUS                         PORTS     
                                                                          NAMES
2d3b9c734517        secretflow/scql:latest                                       "/home/admin/bin/scd…"   12 minutes ago      Restarting (1) 8 seconds ago             
                                                                          scdb-tutorial_scdb_1
c13ea811bf7d        secretflow/scql:latest                                       "/home/admin/bin/scq…"   12 minutes ago      Up 12 minutes                            
                                                                          scdb-tutorial_engine_alice_1
3923680a3727        secretflow/scql:latest                                       "/home/admin/bin/scq…"   12 minutes ago      Up 12 minutes                            
                                                                          scdb-tutorial_engine_bob_1