secretflow / psi

The repository for Private Set Intersection (PSI) and Private Information Retrieval (PIR) from SecretFlow.
https://www.secretflow.org.cn/docs/psi
Apache License 2.0

[Bug]: spu 0.8.0b0 is not compatible with 0.3.3b2 #98

Closed xiaohei-info closed 6 months ago

xiaohei-info commented 6 months ago

Issue Type

Usability

Modules Involved

PSI

Have you reproduced the bug with SPU HEAD?

Yes

Have you searched existing issues?

Yes

SPU Version

0.8.0b0

OS Platform and Distribution

CentOS Linux release 7.9.2009 (Core)

Python Version

3.9.19

Compiler Version

4.8.5

Current Behavior?

The receiver runs spu 0.8.0b0 and the sender runs spu 0.3.3b2. Both sides run a PSI task with the KKRT_PSI_2PC protocol, but they cannot connect to each other.

Standalone code to reproduce the issue

Receiver:
simple_psi.py -rank 0 -party_ips 0.0.0.0:1010,<sender_ip>:1010 -protocol KKRT_PSI_2PC -in_path /tmp/data/y.csv -out_path /tmp/y.csv.out -field_names id --precheck_input true
Sender:
simple_psi.py -rank 1 -party_ips <receiver_ip>:1010,0.0.0.0:1010 -protocol KKRT_PSI_2PC -in_path /tmp/data/y.csv -out_path /tmp/y.csv.out -field_names id --precheck_input true

Relevant log output

Receiver log:
[2024-03-21 07:23:46.156] [info] [launch.cc:164] LEGACY PSI config: {"psi_type":"KKRT_PSI_2PC","broadcast_result":true,"input_params":{"path":"/tmp/data/y.csv","select_fields":["id"],"precheck":true},"output_params":{"path":"/tmp/y.csv.out","need_sort":true},"curve_type":"CURVE_25519","bucket_size":1048576}
[2024-03-21 07:23:46.156] [info] [bucket_psi.cc:400] bucket size set to 1048576
Fatal Python error: Aborted

Current thread 0x00007f6d2d891740 (most recent call first):
  File "/usr/local/bin/python3/lib/python3.9/site-packages/spu/psi.py", line 69 in bucket_psi
  File "/usr/local/bin/spu", line 93 in main
  File "/usr/local/bin/python3/lib/python3.9/site-packages/absl/app.py", line 254 in _run_main
  File "/usr/local/bin/python3/lib/python3.9/site-packages/absl/app.py", line 308 in run
  File "/usr/local/bin/spu", line 102 in <module>
Aborted

Sender log:
2024-03-21 07:23:46.287 [info] [bucket_psi.cc:Init:228] bucket size set to 1048576
2024-03-21 07:23:46.288 [info] [bucket_psi.cc:Run:97] Begin sanity check for input file: /tmp/data/y.csv, precheck_switch:true
2024-03-21 07:23:46.289 [info] [csv_checker.cc:CsvChecker:121] Executing duplicated scripts: LC_ALL=C sort --buffer-size=1G --temporary-directory=/tmp --stable selected-keys.1711005826288592734 | LC_ALL=C uniq -d > duplicate-keys.1711005826288592734
Traceback (most recent call last):
  File "/usr/local/bin/spu", line 119, in <module>
    app.run(main)
  File "/usr/local/bin/python3/lib/python3.8/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/usr/local/bin/python3/lib/python3.8/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/usr/local/bin/spu", line 112, in main
    report = psi.bucket_psi(setup_link(FLAGS.rank), config, FLAGS.ic_mode)
  File "/usr/local/bin/python3/lib/python3.8/site-packages/spu/psi.py", line 48, in bucket_psi
    report_str = libspu.libs.bucket_psi(link, config.SerializeToString(), ic_mode)
RuntimeError: what:
    [external/yacl/yacl/link/transport/channel.cc:117] Get data timeout, key=root:1:ALLGATHER
stacktrace:
#0 yacl::link::Context::RecvInternal()+0x7fe108a47417
#1 yacl::link::AllGatherImpl<>()+0x7fe108a41bfd
#2 yacl::link::AllGather()+0x7fe108a42193
#3 spu::psi::SyncWait<>()+0x7fe10888462a
#4 pybind11::cpp_function::initialize<>()::{lambda()#3}::_FUN()+0x7fe1077838e8
#5 pybind11::cpp_function::dispatcher()+0x7fe10775e6d6
#6 PyCFunction_Call+0x43bcda

Chrisdehe commented 6 months ago

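We discussed this in a private chat. To summarize: the current code is only an example and cannot provide compatibility across versions, but we plan to support it in next week's dev release. We will post an update on this issue once that is available.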

6fj commented 6 months ago

hi @xiaohei-info

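Because the code has been refactored and its logic has changed, we cannot guarantee compatibility between these two versions. Please run PSI with the same version on both sides; we recommend the newer version. Thanks!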

xiaohei-info commented 6 months ago

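> hi @xiaohei-info
>
> Because the code has been refactored and its logic has changed, we cannot guarantee compatibility between these two versions. Please run PSI with the same version on both sides; we recommend the newer version. Thanks!

Does this mean the dev release mentioned earlier will not be provided either?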

6fj commented 6 months ago

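Yes. As discussed above, the workflow has changed between the two versions, and forcing compatibility would be hard to maintain. We still recommend running the same version on both sides. Thank you for your understanding.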
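
For readers who hit the same timeout, the recommendation to run the same version on both sides can be checked before launching a PSI task. The following is a minimal sketch and not part of spu: `installed_version` and `require_same_version` are hypothetical helper names, and it assumes the two parties exchange their version strings out of band (for example, by agreeing on one in a shared config). It relies only on the standard-library `importlib.metadata` module.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def require_same_version(local: Optional[str], peer: str, package: str = "spu") -> None:
    """Raise before any PSI work starts if the two parties disagree on the version.

    `peer` is the version string reported by the other party; how it is
    obtained is outside this sketch (assumed to be exchanged out of band).
    """
    if local is None:
        raise RuntimeError(f"{package} is not installed on this party")
    if local != peer:
        raise RuntimeError(
            f"{package} version mismatch: local={local}, peer={peer}; "
            "install the same version on both parties before running PSI"
        )
```

A party would call `require_same_version(installed_version("spu"), peer_version)` at startup. In the setup reported here (0.8.0b0 against 0.3.3b2) this fails immediately with a clear message instead of waiting for the link timeout.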