secretflow / secretpad

SecretPad is a privacy-preserving computing web platform based on the Kuscia framework, designed to provide easy access to privacy-preserving data intelligence and machine learning functions.
https://www.secretflow.org.cn
Apache License 2.0

PSI: KKRT and RR22 protocols fail with OOM #84

Open gxcuit opened 3 months ago

gxcuit commented 3 months ago

Hi, I deployed SecretPad with Docker in P2P mode. The version information is as follows:

secretpadImage version: 0.7.1b0
secretflowServingImage version: 0.3.1b0
kusciaImage version: 0.8.0b0
secretflowImage version: 1.6.1b0

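I found that among the PSI protocols only ECDH works; KKRT and RR22 both report an OOM error. The dataset is small, about 1,000 rows, and I have raised Docker's memory to 8 GB.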
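The log attached to this report ends with `SIGILL` inside `yacl::AvxTranspose128()` and `Fatal Python error: Illegal instruction`, which may point to a CPU instruction-set problem rather than memory exhaustion. Below is a minimal sketch for checking both from inside the container, assuming a Linux host: it reads the CPU flags from `/proc/cpuinfo` and the memory limit from the standard cgroup files. The helper names are hypothetical, and which AVX level the KKRT code path actually requires is an assumption, so both `avx` and `avx2` are reported.

```python
from pathlib import Path
from typing import Optional, Set


def parse_cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # x86 kernels label the line "flags"; some ARM kernels use "Features".
        if key.strip().lower() in ("flags", "features"):
            return set(value.split())
    return set()


def parse_memory_limit(raw: str) -> Optional[int]:
    """Parse a cgroup memory limit in bytes; "max" (cgroup v2) means no limit."""
    raw = raw.strip()
    return None if raw == "max" else int(raw)


def read_memory_limit() -> Optional[int]:
    """Read the memory limit from cgroup v2 first, then fall back to cgroup v1."""
    candidates = (
        "/sys/fs/cgroup/memory.max",                    # cgroup v2
        "/sys/fs/cgroup/memory/memory.limit_in_bytes",  # cgroup v1
    )
    for candidate in candidates:
        path = Path(candidate)
        if path.exists():
            return parse_memory_limit(path.read_text())
    return None  # no cgroup file found; limit unknown


def main() -> None:
    cpuinfo = Path("/proc/cpuinfo")
    flags = parse_cpu_flags(cpuinfo.read_text()) if cpuinfo.exists() else set()
    # Assumption: the AVX level needed by the crashing routine is not confirmed,
    # so both flags are printed.
    for name in ("avx", "avx2"):
        print(f"{name}: {'present' if name in flags else 'MISSING'}")

    limit = read_memory_limit()
    if limit is None:
        print("memory limit: unlimited or unknown")
    else:
        print(f"memory limit: {limit / 2**30:.1f} GiB")


if __name__ == "__main__":
    main()
```

Run it in the task container or on the Docker host. If a flag is reported as missing, the crash may be unrelated to the 8 GB memory setting.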

image

The logs are as follows.

The stdout output is as follows:

Details
```
2024-06-11T18:30:03.624597928+08:00 stderr F WARNING:root:Since the GPL-licensed package `unidecode` is not installed, using Python's `unicodedata` package which yields worse results.
2024-06-11T18:30:05.619162898+08:00 stdout F 2024-06-11 10:30:05,618|alice|INFO|secretflow|entry.py:start_ray:59| ray_conf: RayConfig(ray_node_ip_address='mgey-jjhodnxr-node-35-0-global.alice.svc', ray_node_manager_port=21688, ray_object_manager_port=21689, ray_client_server_port=21690, ray_worker_ports=[], ray_gcs_port=21687)
2024-06-11T18:30:05.619255176+08:00 stdout F 2024-06-11 10:30:05,618|alice|INFO|secretflow|entry.py:start_ray:63| Trying to start ray head node at mgey-jjhodnxr-node-35-0-global.alice.svc, start command: RAY_BACKEND_LOG_LEVEL=debug RAY_grpc_enable_http_proxy=true OMP_NUM_THREADS=10 ray start --head --include-dashboard=false --disable-usage-stats --num-cpus=32 --node-ip-address=mgey-jjhodnxr-node-35-0-global.alice.svc --port=21687 --node-manager-port=21688 --object-manager-port=21689 --ray-client-server-port=21690
2024-06-11T18:30:11.093719452+08:00 stdout F 2024-06-11 10:30:11,093|alice|INFO|secretflow|entry.py:start_ray:80| 2024-06-11 10:30:07,455 INFO usage_lib.py:423 -- Usage stats collection is disabled.
2024-06-11T18:30:11.093772301+08:00 stdout F 2024-06-11 10:30:07,455 INFO scripts.py:744 -- Local node IP: mgey-jjhodnxr-node-35-0-global.alice.svc
2024-06-11T18:30:11.093789497+08:00 stdout F 2024-06-11 10:30:10,781 SUCC scripts.py:781 -- --------------------
2024-06-11T18:30:11.093804098+08:00 stdout F 2024-06-11 10:30:10,781 SUCC scripts.py:782 -- Ray runtime started.
2024-06-11T18:30:11.093818148+08:00 stdout F 2024-06-11 10:30:10,781 SUCC scripts.py:783 -- --------------------
2024-06-11T18:30:11.093832778+08:00 stdout F 2024-06-11 10:30:10,781 INFO scripts.py:785 -- Next steps
2024-06-11T18:30:11.093847126+08:00 stdout F 2024-06-11 10:30:10,781 INFO scripts.py:788 -- To add another node to this Ray cluster, run
2024-06-11T18:30:11.093861489+08:00 stdout F 2024-06-11 10:30:10,781 INFO scripts.py:791 -- ray start --address='mgey-jjhodnxr-node-35-0-global.alice.svc:21687'
2024-06-11T18:30:11.093875424+08:00 stdout F 2024-06-11 10:30:10,781 INFO scripts.py:800 -- To connect to this Ray cluster:
2024-06-11T18:30:11.093897427+08:00 stdout F 2024-06-11 10:30:10,782 INFO scripts.py:802 -- import ray
2024-06-11T18:30:11.093912298+08:00 stdout F 2024-06-11 10:30:10,782 INFO scripts.py:803 -- ray.init(_node_ip_address='mgey-jjhodnxr-node-35-0-global.alice.svc')
2024-06-11T18:30:11.093926778+08:00 stdout F 2024-06-11 10:30:10,782 INFO scripts.py:834 -- To terminate the Ray runtime, run
2024-06-11T18:30:11.093940778+08:00 stdout F 2024-06-11 10:30:10,782 INFO scripts.py:835 -- ray stop
2024-06-11T18:30:11.093954762+08:00 stdout F 2024-06-11 10:30:10,782 INFO scripts.py:838 -- To view the status of the cluster, use
2024-06-11T18:30:11.093968842+08:00 stdout F 2024-06-11 10:30:10,782 INFO scripts.py:839 -- ray status
2024-06-11T18:30:11.093982077+08:00 stdout F
2024-06-11T18:30:11.094037591+08:00 stdout F 2024-06-11 10:30:11,093|alice|INFO|secretflow|entry.py:start_ray:81| Succeeded to start ray head node at mgey-jjhodnxr-node-35-0-global.alice.svc.
2024-06-11T18:30:11.095507431+08:00 stdout F 2024-06-11 10:30:11,095|alice|INFO|secretflow|entry.py:main:510| datasource.access_directly True 2024-06-11T18:30:11.095543235+08:00 stdout F sf_node_eval_param { 2024-06-11T18:30:11.09555942+08:00 stdout F "domain": "data_prep", 2024-06-11T18:30:11.09557397+08:00 stdout F "name": "psi", 2024-06-11T18:30:11.095587977+08:00 stdout F "version": "0.0.5", 2024-06-11T18:30:11.095601854+08:00 stdout F "attrPaths": [ 2024-06-11T18:30:11.095616687+08:00 stdout F "input/receiver_input/key", 2024-06-11T18:30:11.095631122+08:00 stdout F "input/sender_input/key", 2024-06-11T18:30:11.095645052+08:00 stdout F "protocol", 2024-06-11T18:30:11.095658946+08:00 stdout F "sort_result", 2024-06-11T18:30:11.095672656+08:00 stdout F "allow_duplicate_keys", 2024-06-11T18:30:11.095687033+08:00 stdout F "allow_duplicate_keys/no/skip_duplicates_check", 2024-06-11T18:30:11.095740815+08:00 stdout F "fill_value_int", 2024-06-11T18:30:11.095756731+08:00 stdout F "ecdh_curve" 2024-06-11T18:30:11.095770846+08:00 stdout F ], 2024-06-11T18:30:11.095785259+08:00 stdout F "attrs": [ 2024-06-11T18:30:11.095799456+08:00 stdout F { 2024-06-11T18:30:11.095813277+08:00 stdout F "ss": [ 2024-06-11T18:30:11.095827333+08:00 stdout F "id" 2024-06-11T18:30:11.095841514+08:00 stdout F ] 2024-06-11T18:30:11.095855494+08:00 stdout F }, 2024-06-11T18:30:11.095869291+08:00 stdout F { 2024-06-11T18:30:11.095883025+08:00 stdout F "ss": [ 2024-06-11T18:30:11.095896825+08:00 stdout F "id2" 2024-06-11T18:30:11.095910689+08:00 stdout F ] 2024-06-11T18:30:11.095924956+08:00 stdout F }, 2024-06-11T18:30:11.095938756+08:00 stdout F { 2024-06-11T18:30:11.095952634+08:00 stdout F "s": "PROTOCOL_KKRT" 2024-06-11T18:30:11.095966497+08:00 stdout F }, 2024-06-11T18:30:11.095980224+08:00 stdout F { 2024-06-11T18:30:11.095993992+08:00 stdout F "b": true 2024-06-11T18:30:11.096007948+08:00 stdout F }, 2024-06-11T18:30:11.096021709+08:00 stdout F { 2024-06-11T18:30:11.096035559+08:00 stdout 
F "s": "no" 2024-06-11T18:30:11.096049353+08:00 stdout F }, 2024-06-11T18:30:11.096063047+08:00 stdout F { 2024-06-11T18:30:11.096132769+08:00 stdout F "b": true 2024-06-11T18:30:11.096153762+08:00 stdout F }, 2024-06-11T18:30:11.096168142+08:00 stdout F { 2024-06-11T18:30:11.096182567+08:00 stdout F "isNa": true 2024-06-11T18:30:11.096196557+08:00 stdout F }, 2024-06-11T18:30:11.096210434+08:00 stdout F { 2024-06-11T18:30:11.096224394+08:00 stdout F "s": "CURVE_SM2" 2024-06-11T18:30:11.096238321+08:00 stdout F } 2024-06-11T18:30:11.096252315+08:00 stdout F ], 2024-06-11T18:30:11.096266295+08:00 stdout F "inputs": [ 2024-06-11T18:30:11.096280286+08:00 stdout F { 2024-06-11T18:30:11.096295053+08:00 stdout F "type": "sf.table.individual", 2024-06-11T18:30:11.09630927+08:00 stdout F "meta": { 2024-06-11T18:30:11.096344121+08:00 stdout F "@type": "type.googleapis.com/secretflow.spec.v1.IndividualTable", 2024-06-11T18:30:11.096360585+08:00 stdout F "lineCount": "-1" 2024-06-11T18:30:11.096374712+08:00 stdout F }, 2024-06-11T18:30:11.096388825+08:00 stdout F "dataRefs": [ 2024-06-11T18:30:11.096402703+08:00 stdout F { 2024-06-11T18:30:11.096416819+08:00 stdout F "uri": "breast_new2_590923962.csv", 2024-06-11T18:30:11.09643096+08:00 stdout F "party": "alice", 2024-06-11T18:30:11.096445017+08:00 stdout F "format": "csv" 2024-06-11T18:30:11.096464618+08:00 stdout F } 2024-06-11T18:30:11.096478991+08:00 stdout F ] 2024-06-11T18:30:11.096492855+08:00 stdout F }, 2024-06-11T18:30:11.096506506+08:00 stdout F { 2024-06-11T18:30:11.096520246+08:00 stdout F "type": "sf.table.individual", 2024-06-11T18:30:11.096533942+08:00 stdout F "meta": { 2024-06-11T18:30:11.096547953+08:00 stdout F "@type": "type.googleapis.com/secretflow.spec.v1.IndividualTable", 2024-06-11T18:30:11.09656181+08:00 stdout F "lineCount": "-1" 2024-06-11T18:30:11.096575401+08:00 stdout F }, 2024-06-11T18:30:11.096589181+08:00 stdout F "dataRefs": [ 2024-06-11T18:30:11.096602788+08:00 stdout F { 
2024-06-11T18:30:11.096616505+08:00 stdout F "uri": "breast_new1_1450367590.csv", 2024-06-11T18:30:11.096630195+08:00 stdout F "party": "bob", 2024-06-11T18:30:11.096643789+08:00 stdout F "format": "csv" 2024-06-11T18:30:11.096657463+08:00 stdout F } 2024-06-11T18:30:11.09667103+08:00 stdout F ] 2024-06-11T18:30:11.096684704+08:00 stdout F } 2024-06-11T18:30:11.096698327+08:00 stdout F ], 2024-06-11T18:30:11.096712034+08:00 stdout F "checkpointUri": "ckmgey-jjhodnxr-node-35-output-0" 2024-06-11T18:30:11.096725882+08:00 stdout F } 2024-06-11T18:30:11.110466395+08:00 stdout F 2024-06-11 10:30:11,109|alice|WARNING|secretflow|meta_conversion.py:convert_domain_data_to_individual_table:29| kuscia adapter has to deduce dist data from domain data at this moment. 2024-06-11T18:30:11.110794077+08:00 stdout F 2024-06-11 10:30:11,110|alice|INFO|secretflow|entry.py:domaindata_id_to_dist_data:160| domaindata_id woerffhn to 2024-06-11T18:30:11.110823888+08:00 stdout F ........... 2024-06-11T18:30:11.110840301+08:00 stdout F name: "breast_new2" 2024-06-11T18:30:11.110855542+08:00 stdout F type: "sf.table.individual" 2024-06-11T18:30:11.110869856+08:00 stdout F meta { 2024-06-11T18:30:11.110884999+08:00 stdout F type_url: "type.googleapis.com/secretflow.spec.v1.IndividualTable" 2024-06-11T18:30:11.110900047+08:00 stdout F value: "\n\221\003\022\002id\022\021compactness-error\022\017concavity-error\022\024concave-points-error\022\016symmetry-error\022\027fractal-dimension-error\022\014worst-radius\022\rworst-texture\022\017worst-perimeter\022\nworst-area\022\020worst-smoothness\022\021worst-compactness\022\017worst-concavity\022\024worst-concave-points\022\016worst-symmetry\022\027worst-fractal-dimension\022\006target*\003int*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\003int\020\377\377\377\377\377\377\377\377\377\001" 2024-06-11T18:30:11.110915093+08:00 stdout F } 
2024-06-11T18:30:11.110929138+08:00 stdout F data_refs { 2024-06-11T18:30:11.110943558+08:00 stdout F uri: "breast_new2_590923962.csv" 2024-06-11T18:30:11.111130776+08:00 stdout F party: "alice" 2024-06-11T18:30:11.11114993+08:00 stdout F format: "csv" 2024-06-11T18:30:11.11116481+08:00 stdout F } 2024-06-11T18:30:11.1111785+08:00 stdout F 2024-06-11T18:30:11.111192368+08:00 stdout F .... 2024-06-11T18:30:11.12376327+08:00 stdout F 2024-06-11 10:30:11,121|alice|WARNING|secretflow|meta_conversion.py:convert_domain_data_to_individual_table:29| kuscia adapter has to deduce dist data from domain data at this moment. 2024-06-11T18:30:11.123801421+08:00 stdout F 2024-06-11 10:30:11,122|alice|INFO|secretflow|entry.py:domaindata_id_to_dist_data:160| domaindata_id vwgwvxul to 2024-06-11T18:30:11.123826399+08:00 stdout F ........... 2024-06-11T18:30:11.12384198+08:00 stdout F name: "breast_new1" 2024-06-11T18:30:11.12385695+08:00 stdout F type: "sf.table.individual" 2024-06-11T18:30:11.123871176+08:00 stdout F meta { 2024-06-11T18:30:11.123886257+08:00 stdout F type_url: "type.googleapis.com/secretflow.spec.v1.IndividualTable" 2024-06-11T18:30:11.123995824+08:00 stdout F value: "\n\344\002\022\003id2\022\013mean-radius\022\014mean-texture\022\016mean-perimeter\022\tmean-area\022\017mean-smoothness\022\020mean-compactness\022\016mean-concavity\022\023mean-concave-points\022\rmean-symmetry\022\026mean-fractal-dimension\022\014radius-error\022\rtexture-error\022\017perimeter-error\022\narea-error\022\020smoothness-error*\003int*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float\020\377\377\377\377\377\377\377\377\377\001" 2024-06-11T18:30:11.124024048+08:00 stdout F } 2024-06-11T18:30:11.124038755+08:00 stdout F data_refs { 2024-06-11T18:30:11.124053461+08:00 stdout F uri: "breast_new1_1450367590.csv" 2024-06-11T18:30:11.124108403+08:00 stdout F party: "bob" 
2024-06-11T18:30:11.124130447+08:00 stdout F format: "csv" 2024-06-11T18:30:11.124144911+08:00 stdout F } 2024-06-11T18:30:11.124158441+08:00 stdout F 2024-06-11T18:30:11.124172732+08:00 stdout F .... 2024-06-11T18:30:11.124187219+08:00 stdout F 2024-06-11 10:30:11,122|alice|WARNING|secretflow|entry.py:comp_eval:159| 2024-06-11T18:30:11.124201465+08:00 stdout F -- 2024-06-11T18:30:11.124215623+08:00 stdout F Secretflow 1.6.1b0 2024-06-11T18:30:11.124230503+08:00 stdout F Build time (May 27 2024, 04:48:06) with commit id: eac355f390d9d0d7276ee4cea8d1fe38417cabb6 2024-06-11T18:30:11.124256124+08:00 stdout F -- 2024-06-11T18:30:11.124271447+08:00 stdout F 2024-06-11T18:30:11.124285828+08:00 stdout F 2024-06-11 10:30:11,123|alice|WARNING|secretflow|entry.py:comp_eval:160| 2024-06-11T18:30:11.124330893+08:00 stdout F -- 2024-06-11T18:30:11.124346029+08:00 stdout F *param* 2024-06-11T18:30:11.124359197+08:00 stdout F 2024-06-11T18:30:11.124373157+08:00 stdout F domain: "data_prep" 2024-06-11T18:30:11.124386914+08:00 stdout F name: "psi" 2024-06-11T18:30:11.124400485+08:00 stdout F version: "0.0.5" 2024-06-11T18:30:11.124414201+08:00 stdout F attr_paths: "input/receiver_input/key" 2024-06-11T18:30:11.124427739+08:00 stdout F attr_paths: "input/sender_input/key" 2024-06-11T18:30:11.124441396+08:00 stdout F attr_paths: "protocol" 2024-06-11T18:30:11.124455596+08:00 stdout F attr_paths: "sort_result" 2024-06-11T18:30:11.124469283+08:00 stdout F attr_paths: "allow_duplicate_keys" 2024-06-11T18:30:11.124485853+08:00 stdout F attr_paths: "allow_duplicate_keys/no/skip_duplicates_check" 2024-06-11T18:30:11.124499537+08:00 stdout F attr_paths: "fill_value_int" 2024-06-11T18:30:11.124513034+08:00 stdout F attr_paths: "ecdh_curve" 2024-06-11T18:30:11.124526628+08:00 stdout F attrs { 2024-06-11T18:30:11.124540282+08:00 stdout F ss: "id" 2024-06-11T18:30:11.124553999+08:00 stdout F } 2024-06-11T18:30:11.124567832+08:00 stdout F attrs { 2024-06-11T18:30:11.124581626+08:00 stdout F ss: 
"id2" 2024-06-11T18:30:11.124595173+08:00 stdout F } 2024-06-11T18:30:11.124608843+08:00 stdout F attrs { 2024-06-11T18:30:11.124622481+08:00 stdout F s: "PROTOCOL_KKRT" 2024-06-11T18:30:11.124636451+08:00 stdout F } 2024-06-11T18:30:11.124650115+08:00 stdout F attrs { 2024-06-11T18:30:11.124663788+08:00 stdout F b: true 2024-06-11T18:30:11.124677328+08:00 stdout F } 2024-06-11T18:30:11.124690889+08:00 stdout F attrs { 2024-06-11T18:30:11.124704846+08:00 stdout F s: "no" 2024-06-11T18:30:11.124718446+08:00 stdout F } 2024-06-11T18:30:11.124732074+08:00 stdout F attrs { 2024-06-11T18:30:11.12474563+08:00 stdout F b: true 2024-06-11T18:30:11.124759185+08:00 stdout F } 2024-06-11T18:30:11.124772851+08:00 stdout F attrs { 2024-06-11T18:30:11.124786391+08:00 stdout F is_na: true 2024-06-11T18:30:11.124799999+08:00 stdout F } 2024-06-11T18:30:11.124813589+08:00 stdout F attrs { 2024-06-11T18:30:11.124827219+08:00 stdout F s: "CURVE_SM2" 2024-06-11T18:30:11.124840967+08:00 stdout F } 2024-06-11T18:30:11.124854613+08:00 stdout F inputs { 2024-06-11T18:30:11.12486821+08:00 stdout F name: "breast_new2" 2024-06-11T18:30:11.124881964+08:00 stdout F type: "sf.table.individual" 2024-06-11T18:30:11.124895571+08:00 stdout F meta { 2024-06-11T18:30:11.124909342+08:00 stdout F type_url: "type.googleapis.com/secretflow.spec.v1.IndividualTable" 2024-06-11T18:30:11.124924079+08:00 stdout F value: 
"\n\221\003\022\002id\022\021compactness-error\022\017concavity-error\022\024concave-points-error\022\016symmetry-error\022\027fractal-dimension-error\022\014worst-radius\022\rworst-texture\022\017worst-perimeter\022\nworst-area\022\020worst-smoothness\022\021worst-compactness\022\017worst-concavity\022\024worst-concave-points\022\016worst-symmetry\022\027worst-fractal-dimension\022\006target*\003int*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\003int\020\377\377\377\377\377\377\377\377\377\001" 2024-06-11T18:30:11.12494364+08:00 stdout F } 2024-06-11T18:30:11.124958193+08:00 stdout F data_refs { 2024-06-11T18:30:11.12497206+08:00 stdout F uri: "breast_new2_590923962.csv" 2024-06-11T18:30:11.124985714+08:00 stdout F party: "alice" 2024-06-11T18:30:11.124999384+08:00 stdout F format: "csv" 2024-06-11T18:30:11.125012921+08:00 stdout F } 2024-06-11T18:30:11.125026478+08:00 stdout F } 2024-06-11T18:30:11.125039942+08:00 stdout F inputs { 2024-06-11T18:30:11.125053586+08:00 stdout F name: "breast_new1" 2024-06-11T18:30:11.125067219+08:00 stdout F type: "sf.table.individual" 2024-06-11T18:30:11.125080656+08:00 stdout F meta { 2024-06-11T18:30:11.125094307+08:00 stdout F type_url: "type.googleapis.com/secretflow.spec.v1.IndividualTable" 2024-06-11T18:30:11.12778897+08:00 stdout F value: "\n\344\002\022\003id2\022\013mean-radius\022\014mean-texture\022\016mean-perimeter\022\tmean-area\022\017mean-smoothness\022\020mean-compactness\022\016mean-concavity\022\023mean-concave-points\022\rmean-symmetry\022\026mean-fractal-dimension\022\014radius-error\022\rtexture-error\022\017perimeter-error\022\narea-error\022\020smoothness-error*\003int*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float\020\377\377\377\377\377\377\377\377\377\001" 2024-06-11T18:30:11.127823124+08:00 stdout 
F } 2024-06-11T18:30:11.127845221+08:00 stdout F data_refs { 2024-06-11T18:30:11.127865925+08:00 stdout F uri: "breast_new1_1450367590.csv" 2024-06-11T18:30:11.127886046+08:00 stdout F party: "bob" 2024-06-11T18:30:11.12790702+08:00 stdout F format: "csv" 2024-06-11T18:30:11.127929011+08:00 stdout F } 2024-06-11T18:30:11.127944127+08:00 stdout F } 2024-06-11T18:30:11.127958404+08:00 stdout F output_uris: "mgey-jjhodnxr-node-35-output-0" 2024-06-11T18:30:11.127979035+08:00 stdout F checkpoint_uri: "ckmgey-jjhodnxr-node-35-output-0" 2024-06-11T18:30:11.127992948+08:00 stdout F 2024-06-11T18:30:11.128006906+08:00 stdout F -- 2024-06-11T18:30:11.128020099+08:00 stdout F 2024-06-11T18:30:11.128146517+08:00 stdout F 2024-06-11 10:30:11,123|alice|WARNING|secretflow|entry.py:comp_eval:161| 2024-06-11T18:30:11.128169653+08:00 stdout F -- 2024-06-11T18:30:11.128184444+08:00 stdout F *storage_config* 2024-06-11T18:30:11.128197901+08:00 stdout F 2024-06-11T18:30:11.128211708+08:00 stdout F type: "local_fs" 2024-06-11T18:30:11.128225642+08:00 stdout F local_fs { 2024-06-11T18:30:11.128240272+08:00 stdout F wd: "/home/kuscia/var/storage/data" 2024-06-11T18:30:11.12825415+08:00 stdout F } 2024-06-11T18:30:11.1282672+08:00 stdout F 2024-06-11T18:30:11.12828069+08:00 stdout F -- 2024-06-11T18:30:11.128293804+08:00 stdout F 2024-06-11T18:30:11.128321087+08:00 stdout F 2024-06-11 10:30:11,123|alice|WARNING|secretflow|entry.py:comp_eval:162| 2024-06-11T18:30:11.128335955+08:00 stdout F -- 2024-06-11T18:30:11.128349965+08:00 stdout F *cluster_config* 2024-06-11T18:30:11.128363225+08:00 stdout F 2024-06-11T18:30:11.128376816+08:00 stdout F desc { 2024-06-11T18:30:11.128390503+08:00 stdout F parties: "bob" 2024-06-11T18:30:11.12840444+08:00 stdout F parties: "alice" 2024-06-11T18:30:11.128418364+08:00 stdout F devices { 2024-06-11T18:30:11.128424824+08:00 stderr F 2024-06-11 10:30:11,127 INFO worker.py:1540 -- Connecting to existing Ray cluster at address: 
mgey-jjhodnxr-node-35-0-global.alice.svc:21687... 2024-06-11T18:30:11.128433564+08:00 stdout F name: "spu" 2024-06-11T18:30:11.128477552+08:00 stdout F type: "spu" 2024-06-11T18:30:11.128493289+08:00 stdout F parties: "bob" 2024-06-11T18:30:11.128507632+08:00 stdout F parties: "alice" 2024-06-11T18:30:11.12853234+08:00 stdout F config: "{\"runtime_config\":{\"protocol\":\"SEMI2K\",\"field\":\"FM128\"},\"link_desc\":{\"connect_retry_times\":60,\"connect_retry_interval_ms\":1000,\"brpc_channel_protocol\":\"http\",\"brpc_channel_connection_type\":\"pooled\",\"recv_timeout_ms\":1200000,\"http_timeout_ms\":1200000}}" 2024-06-11T18:30:11.128548331+08:00 stdout F } 2024-06-11T18:30:11.128562661+08:00 stdout F devices { 2024-06-11T18:30:11.128576771+08:00 stdout F name: "heu" 2024-06-11T18:30:11.128590662+08:00 stdout F type: "heu" 2024-06-11T18:30:11.128604452+08:00 stdout F parties: "bob" 2024-06-11T18:30:11.128618202+08:00 stdout F parties: "alice" 2024-06-11T18:30:11.128633+08:00 stdout F config: "{\"mode\": \"PHEU\", \"schema\": \"paillier\", \"key_size\": 2048}" 2024-06-11T18:30:11.128647093+08:00 stdout F } 2024-06-11T18:30:11.128661557+08:00 stdout F ray_fed_config { 2024-06-11T18:30:11.128675927+08:00 stdout F cross_silo_comm_backend: "brpc_link" 2024-06-11T18:30:11.128689734+08:00 stdout F } 2024-06-11T18:30:11.128703945+08:00 stdout F } 2024-06-11T18:30:11.128717882+08:00 stdout F public_config { 2024-06-11T18:30:11.128760043+08:00 stdout F ray_fed_config { 2024-06-11T18:30:11.128775164+08:00 stdout F parties: "bob" 2024-06-11T18:30:11.12878921+08:00 stdout F parties: "alice" 2024-06-11T18:30:11.128803954+08:00 stdout F addresses: "mgey-jjhodnxr-node-35-0-fed.bob.svc:80" 2024-06-11T18:30:11.128818788+08:00 stdout F addresses: "0.0.0.0:21686" 2024-06-11T18:30:11.128832585+08:00 stdout F } 2024-06-11T18:30:11.128846356+08:00 stdout F spu_configs { 2024-06-11T18:30:11.128860046+08:00 stdout F name: "spu" 2024-06-11T18:30:11.128873699+08:00 stdout F parties: "bob" 
2024-06-11T18:30:11.128887167+08:00 stdout F parties: "alice" 2024-06-11T18:30:11.12890074+08:00 stdout F addresses: "http://mgey-jjhodnxr-node-35-0-spu.bob.svc:80" 2024-06-11T18:30:11.12891451+08:00 stdout F addresses: "0.0.0.0:21691" 2024-06-11T18:30:11.128928304+08:00 stdout F } 2024-06-11T18:30:11.128941948+08:00 stdout F } 2024-06-11T18:30:11.128955885+08:00 stdout F private_config { 2024-06-11T18:30:11.128969702+08:00 stdout F self_party: "alice" 2024-06-11T18:30:11.128983429+08:00 stdout F ray_head_addr: "mgey-jjhodnxr-node-35-0-global.alice.svc:21687" 2024-06-11T18:30:11.128997166+08:00 stdout F } 2024-06-11T18:30:11.129010216+08:00 stdout F 2024-06-11T18:30:11.12902407+08:00 stdout F -- 2024-06-11T18:30:11.129037027+08:00 stdout F 2024-06-11T18:30:11.129054657+08:00 stdout F 2024-06-11 10:30:11,126|alice|WARNING|secretflow|driver.py:init:442| When connecting to an existing cluster, num_cpus must not be provided. Num_cpus is neglected at this moment. 2024-06-11T18:30:11.145467061+08:00 stdout F 2024-06-11 10:30:11,144|alice|DEBUG|secretflow|_api.py:acquire:294| Attempting to acquire lock 139795041269792 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/node_ip_address.json.lock 2024-06-11T18:30:11.146947922+08:00 stdout F 2024-06-11 10:30:11,145|alice|DEBUG|secretflow|_api.py:acquire:297| Lock 139795041269792 acquired on /tmp/ray/session_2024-06-11_10-30-07_456938_49/node_ip_address.json.lock 2024-06-11T18:30:11.146975089+08:00 stdout F 2024-06-11 10:30:11,145|alice|DEBUG|secretflow|_api.py:release:327| Attempting to release lock 139795041269792 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/node_ip_address.json.lock 2024-06-11T18:30:11.14699062+08:00 stdout F 2024-06-11 10:30:11,146|alice|DEBUG|secretflow|_api.py:release:330| Lock 139795041269792 released on /tmp/ray/session_2024-06-11_10-30-07_456938_49/node_ip_address.json.lock 2024-06-11T18:30:11.152731876+08:00 stdout F 2024-06-11 10:30:11,152|alice|DEBUG|secretflow|_api.py:acquire:294| Attempting 
to acquire lock 139795041269840 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock 2024-06-11T18:30:11.153204129+08:00 stdout F 2024-06-11 10:30:11,152|alice|DEBUG|secretflow|_api.py:acquire:297| Lock 139795041269840 acquired on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock 2024-06-11T18:30:11.153602909+08:00 stdout F 2024-06-11 10:30:11,153|alice|DEBUG|secretflow|_api.py:release:327| Attempting to release lock 139795041269840 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock 2024-06-11T18:30:11.153815218+08:00 stdout F 2024-06-11 10:30:11,153|alice|DEBUG|secretflow|_api.py:release:330| Lock 139795041269840 released on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock 2024-06-11T18:30:11.154064465+08:00 stdout F 2024-06-11 10:30:11,153|alice|DEBUG|secretflow|_api.py:acquire:294| Attempting to acquire lock 139795041269744 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock 2024-06-11T18:30:11.154569722+08:00 stdout F 2024-06-11 10:30:11,154|alice|DEBUG|secretflow|_api.py:acquire:297| Lock 139795041269744 acquired on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock 2024-06-11T18:30:11.154926231+08:00 stdout F 2024-06-11 10:30:11,154|alice|DEBUG|secretflow|_api.py:release:327| Attempting to release lock 139795041269744 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock 2024-06-11T18:30:11.155114487+08:00 stdout F 2024-06-11 10:30:11,154|alice|DEBUG|secretflow|_api.py:release:330| Lock 139795041269744 released on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock 2024-06-11T18:30:11.155461816+08:00 stdout F 2024-06-11 10:30:11,155|alice|DEBUG|secretflow|_api.py:acquire:294| Attempting to acquire lock 139795041269936 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock 2024-06-11T18:30:11.155847743+08:00 stdout F 2024-06-11 
10:30:11,155|alice|DEBUG|secretflow|_api.py:acquire:297| Lock 139795041269936 acquired on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock 2024-06-11T18:30:11.156335524+08:00 stdout F 2024-06-11 10:30:11,155|alice|DEBUG|secretflow|_api.py:release:327| Attempting to release lock 139795041269936 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock 2024-06-11T18:30:11.156393651+08:00 stdout F 2024-06-11 10:30:11,156|alice|DEBUG|secretflow|_api.py:release:330| Lock 139795041269936 released on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock 2024-06-11T18:30:11.156742928+08:00 stdout F 2024-06-11 10:30:11,156|alice|DEBUG|secretflow|_api.py:acquire:294| Attempting to acquire lock 139795041269792 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock 2024-06-11T18:30:11.158334868+08:00 stderr F 2024-06-11 10:30:11,157 INFO worker.py:1724 -- Connected to Ray cluster. 2024-06-11T18:30:11.158397393+08:00 stdout F 2024-06-11 10:30:11,156|alice|DEBUG|secretflow|_api.py:acquire:297| Lock 139795041269792 acquired on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock 2024-06-11T18:30:11.158415243+08:00 stdout F 2024-06-11 10:30:11,157|alice|DEBUG|secretflow|_api.py:release:327| Attempting to release lock 139795041269792 on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock 2024-06-11T18:30:11.158429444+08:00 stdout F 2024-06-11 10:30:11,157|alice|DEBUG|secretflow|_api.py:release:330| Lock 139795041269792 released on /tmp/ray/session_2024-06-11_10-30-07_456938_49/ports_by_node.json.lock 2024-06-11T18:30:13.009225838+08:00 stderr F 2024-06-11 10:30:13.008 INFO api.py:233 [alice] -- [Anonymous_job] Started rayfed with {'CLUSTER_ADDRESSES': {'bob': 'http://mgey-jjhodnxr-node-35-0-fed.bob.svc:80', 'alice': '0.0.0.0:21686'}, 'CURRENT_PARTY_NAME': 'alice', 'TLS_CONFIG': {}} 2024-06-11T18:30:13.855544191+08:00 stderr F (raylet) [2024-06-11 10:30:13,810 I 660 660] 
logging.cc:230: Set ray log level from environment variable RAY_BACKEND_LOG_LEVEL to -1 2024-06-11T18:30:15.193184363+08:00 stderr F (SenderReceiverProxyActor pid=660) 2024-06-11 10:30:15.166 INFO link.py:38 [alice] -- [Anonymous_job] brpc options: {'proxy_max_restarts': 3, 'timeout_in_ms': 300000, 'recv_timeout_ms': 604800000, 'connect_retry_times': 3600, 'connect_retry_interval_ms': 1000, 'brpc_channel_protocol': 'http', 'brpc_channel_connection_type': 'pooled', 'exit_on_sending_failure': True} 2024-06-11T18:30:15.193240596+08:00 stderr F (SenderReceiverProxyActor pid=660) I0611 10:30:15.189406 660 external/com_github_brpc_brpc/src/brpc/server.cpp:1181] Server[yacl::link::transport::internal::ReceiverServiceImpl] is serving on port=21686. 2024-06-11T18:30:15.193254738+08:00 stderr F (SenderReceiverProxyActor pid=660) W0611 10:30:15.189486 660 external/com_github_brpc_brpc/src/brpc/server.cpp:1187] Builtin services are disabled according to ServerOptions.has_builtin_services 2024-06-11T18:30:15.617776759+08:00 stderr F (SenderReceiverProxyActor pid=660) I0611 10:30:15.543109 726 external/com_github_brpc_brpc/src/brpc/span.cpp:506] Opened ./rpc_data/rpcz/20240611.103015.660/id.db and ./rpc_data/rpcz/20240611.103015.660/time.db 2024-06-11T18:30:16.494040748+08:00 stderr F 2024-06-11 10:30:16.493 INFO barriers.py:465 [alice] -- [Anonymous_job] Succeeded to create receiver proxy actor. 2024-06-11T18:30:16.49435982+08:00 stderr F 2024-06-11 10:30:16.493 INFO barriers.py:520 [alice] -- [Anonymous_job] Try ping ['bob'] at 0 attemp, up to 3600 attemps. 
2024-06-11T18:30:16.509139589+08:00 stderr F 2024-06-11 10:30:16.508 WARNING psi.py:358 [alice] -- [Anonymous_job] {'cluster_def': {'nodes': [{'party': 'bob', 'address': 'http://mgey-jjhodnxr-node-35-0-spu.bob.svc:80', 'listen_address': ''}, {'party': 'alice', 'address': '0.0.0.0:21691', 'listen_address': ''}], 'runtime_config': {'protocol': 2, 'field': 3}}, 'link_desc': {'connect_retry_times': 60, 'connect_retry_interval_ms': 1000, 'brpc_channel_protocol': 'http', 'brpc_channel_connection_type': 'pooled', 'recv_timeout_ms': 1200000, 'http_timeout_ms': 1200000}}
2024-06-11T18:30:21.438985122+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [354.628] perfetto.cc:45899 Configured tracing session 1, #sources:1, duration:0 ms, #buffers:1, total buffer size:1024 KB, total sessions:1, uid:0 session name: ""
2024-06-11T18:30:21.439034746+08:00 stderr F (raylet) [2024-06-11 10:30:17,901 I 728 728] logging.cc:230: Set ray log level from environment variable RAY_BACKEND_LOG_LEVEL to -1
2024-06-11T18:30:21.553161584+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) *** SIGILL received at time=1718101821 on cpu 3 ***
2024-06-11T18:30:21.553204936+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) PC: @ 0x7f2373fdfa32 (unknown) yacl::AvxTranspose128()
2024-06-11T18:30:21.553220273+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) @ 0x7f25dc012ce0 (unknown) (unknown)
2024-06-11T18:30:21.553266017+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) @ 0x7f2373f399ad 9536 yacl::crypto::IknpOtExtSend()
2024-06-11T18:30:21.666765888+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) @ 0x7f2373f34029 464 psi::kkrt::GetKkrtOtReceiverOptions()
2024-06-11T18:30:21.666827943+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) @ 0x7f2373d60ab0 1536 psi::RunPsi()
2024-06-11T18:30:21.666851651+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) @ 0x7f2373d56205 384 psi::BindLibs()::{lambda()#3}::operator()()
2024-06-11T18:30:21.666873831+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) @ 0x7f2373d56483 176 pybind11::cpp_function::initialize<>()::{lambda()#3}::_FUN()
2024-06-11T18:30:21.666925956+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) @ 0x7f2373d3843d 736 pybind11::cpp_function::dispatcher()
2024-06-11T18:30:21.66694947+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) @ 0x4fc697 (unknown) cfunction_call
2024-06-11T18:30:21.666971214+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) @ ... and at least 1 more frames
2024-06-11T18:30:21.666991882+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,644 E 728 728] logging.cc:361: *** SIGILL received at time=1718101821 on cpu 3 ***
2024-06-11T18:30:21.667012095+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,644 E 728 728] logging.cc:361: PC: @ 0x7f2373fdfa32 (unknown) yacl::AvxTranspose128()
2024-06-11T18:30:21.667052926+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,644 E 728 728] logging.cc:361: @ 0x7f25dc012ce0 (unknown) (unknown)
2024-06-11T18:30:21.667076384+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,645 E 728 728] logging.cc:361: @ 0x7f2373f399ad 9536 yacl::crypto::IknpOtExtSend()
2024-06-11T18:30:21.66709696+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,645 E 728 728] logging.cc:361: @ 0x7f2373f34029 464 psi::kkrt::GetKkrtOtReceiverOptions()
2024-06-11T18:30:21.667151046+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,645 E 728 728] logging.cc:361: @ 0x7f2373d60ab0 1536 psi::RunPsi()
2024-06-11T18:30:21.667214121+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,645 E 728 728] logging.cc:361: @ 0x7f2373d56205 384 psi::BindLibs()::{lambda()#3}::operator()()
2024-06-11T18:30:21.667341457+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,645 E 728 728] logging.cc:361: @ 0x7f2373d56483 176 pybind11::cpp_function::initialize<>()::{lambda()#3}::_FUN()
2024-06-11T18:30:21.667410986+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,645 E 728 728] logging.cc:361: @ 0x7f2373d3843d 736 pybind11::cpp_function::dispatcher()
2024-06-11T18:30:21.66743628+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,645 E 728 728] logging.cc:361: @ 0x4fc697 (unknown) cfunction_call
2024-06-11T18:30:21.667457034+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,645 E 728 728] logging.cc:361: @ ... and at least 1 more frames
2024-06-11T18:30:21.667478061+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) Fatal Python error: Illegal instruction
2024-06-11T18:30:21.667499926+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728)
2024-06-11T18:30:21.667520369+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) Stack (most recent call first):
2024-06-11T18:30:21.667573281+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) File "/usr/local/lib/python3.10/site-packages/spu/psi.py", line 118 in psi
2024-06-11T18:30:21.667596458+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) File "/usr/local/lib/python3.10/site-packages/secretflow/device/device/spu.py", line 1379 in psi
2024-06-11T18:30:21.667616909+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) File "/usr/local/lib/python3.10/site-packages/ray/util/tracing/tracing_helper.py", line 467 in _resume_span
2024-06-11T18:30:21.667636415+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) File "/usr/local/lib/python3.10/site-packages/ray/_private/function_manager.py", line 726 in actor_method_executor
2024-06-11T18:30:21.667676736+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) File "/usr/local/lib/python3.10/site-packages/ray/_private/worker.py", line 847 in main_loop
2024-06-11T18:30:21.667698507+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) File "/usr/local/lib/python3.10/site-packages/ray/_private/workers/default_worker.py", line 282 in
2024-06-11T18:30:21.667746345+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728)
2024-06-11T18:30:21.668545347+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) Extension modules: msgpack._cmsgpack, google._upb._message, psutil._psutil_linux, psutil._psutil_posix, setproctitle, yaml._yaml, charset_normalizer.md, requests.packages.charset_normalizer.md, requests.packages.chardet.md, ray._raylet, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, jaxlib.cpu_feature_guard, grpc._cython.cygrpc, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.tslib, pandas._libs.lib, pandas._libs.hashing, pyarrow.lib, pyarrow._hdfsio, pandas._libs.ops, pyarrow._compute, pandas._libs.arrays, pandas._libs.index, pandas._libs.join, pandas._libs.sparse,
pandas._libs.reduction, pandas._libs.indexing, pandas._libs.internals, pandas._libs.writers, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.tslibs.strptime, pandas._libs.groupby, pandas._libs.testing, pandas._libs.parsers, pandas._libs.json, scipy._lib._ccallback_c, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, scipy.linalg._cythonized_array_utils, scipy.linalg._solve_toeplitz, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.linalg._propack._spropack, scipy.sparse.linalg._propack._dpropack, scipy.sparse.linalg._propack._cpropack, scipy.sparse.linalg._propack._zpropack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, sklearn.__check_build._check_build, scipy.special._ufuncs_cxx, scipy.special._cdflib, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.spatial.transform._rotation, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, 
scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.optimize._direct, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.special.cython_special, scipy.stats._stats, scipy.stats.beta_ufunc, scipy.stats._boost.beta_ufunc, scipy.stats.binom_ufunc, scipy.stats._boost.binom_ufunc, scipy.stats.nbinom_ufunc, scipy.stats._boost.nbinom_ufunc, scipy.stats.hypergeom_ufunc, scipy.stats._boost.hypergeom_ufunc, scipy.stats.ncf_ufunc, scipy.stats._boost.ncf_ufunc, scipy.stats.ncx2_ufunc, scipy.stats._boost.ncx2_ufunc, scipy.stats.nct_ufunc, scipy.stats._boost.nct_ufunc, scipy.stats.skewnorm_ufunc, scipy.stats._boost.skewnorm_ufunc, scipy.stats.invgauss_ufunc, scipy.stats._boost.invgauss_ufunc, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy.stats._stats_pythran, scipy._lib._uarray._uarray, scipy.stats._ansari_swilk_statistics, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._mvn, scipy.stats._rcont.rcont, scipy.stats._unuran.unuran_wrapper, sklearn.utils._isfinite, sklearn.utils.murmurhash, sklearn.utils._openmp_helpers, sklearn.utils._logistic_sigmoid, sklearn.utils.sparsefuncs_fast, sklearn.preprocessing._csr_polynomial_expansion, sklearn.preprocessing._target_encoder_fast, pyarrow._json (total: 182) 2024-06-11T18:30:22.373640364+08:00 stderr F 2024-06-11 10:30:22.373 ERROR component.py:1129 [alice] -- [Anonymous_job] eval on domain: "data_prep" 2024-06-11T18:30:22.373696425+08:00 stderr F name: "psi" 2024-06-11T18:30:22.373709411+08:00 stderr F version: "0.0.5" 2024-06-11T18:30:22.373722246+08:00 stderr F attr_paths: "input/receiver_input/key" 
2024-06-11T18:30:22.373733551+08:00 stderr F attr_paths: "input/sender_input/key" 2024-06-11T18:30:22.373745182+08:00 stderr F attr_paths: "protocol" 2024-06-11T18:30:22.373757157+08:00 stderr F attr_paths: "sort_result" 2024-06-11T18:30:22.37376828+08:00 stderr F attr_paths: "allow_duplicate_keys" 2024-06-11T18:30:22.373780078+08:00 stderr F attr_paths: "allow_duplicate_keys/no/skip_duplicates_check" 2024-06-11T18:30:22.373791173+08:00 stderr F attr_paths: "fill_value_int" 2024-06-11T18:30:22.373802246+08:00 stderr F attr_paths: "ecdh_curve" 2024-06-11T18:30:22.373813534+08:00 stderr F attrs { 2024-06-11T18:30:22.373824934+08:00 stderr F ss: "id" 2024-06-11T18:30:22.373836034+08:00 stderr F } 2024-06-11T18:30:22.373847157+08:00 stderr F attrs { 2024-06-11T18:30:22.373858282+08:00 stderr F ss: "id2" 2024-06-11T18:30:22.37386938+08:00 stderr F } 2024-06-11T18:30:22.373880443+08:00 stderr F attrs { 2024-06-11T18:30:22.373891623+08:00 stderr F s: "PROTOCOL_KKRT" 2024-06-11T18:30:22.373902694+08:00 stderr F } 2024-06-11T18:30:22.373914471+08:00 stderr F attrs { 2024-06-11T18:30:22.373925667+08:00 stderr F b: true 2024-06-11T18:30:22.373936799+08:00 stderr F } 2024-06-11T18:30:22.373947895+08:00 stderr F attrs { 2024-06-11T18:30:22.373958935+08:00 stderr F s: "no" 2024-06-11T18:30:22.373969935+08:00 stderr F } 2024-06-11T18:30:22.373981016+08:00 stderr F attrs { 2024-06-11T18:30:22.373992153+08:00 stderr F b: true 2024-06-11T18:30:22.374003089+08:00 stderr F } 2024-06-11T18:30:22.374014054+08:00 stderr F attrs { 2024-06-11T18:30:22.374025109+08:00 stderr F is_na: true 2024-06-11T18:30:22.37405176+08:00 stderr F } 2024-06-11T18:30:22.374064425+08:00 stderr F attrs { 2024-06-11T18:30:22.374075818+08:00 stderr F s: "CURVE_SM2" 2024-06-11T18:30:22.374086949+08:00 stderr F } 2024-06-11T18:30:22.374098264+08:00 stderr F inputs { 2024-06-11T18:30:22.374109667+08:00 stderr F name: "breast_new2" 2024-06-11T18:30:22.374142928+08:00 stderr F type: "sf.table.individual" 
2024-06-11T18:30:22.374155418+08:00 stderr F meta { 2024-06-11T18:30:22.374167288+08:00 stderr F type_url: "type.googleapis.com/secretflow.spec.v1.IndividualTable" 2024-06-11T18:30:22.374179791+08:00 stderr F value: "\n\221\003\022\002id\022\021compactness-error\022\017concavity-error\022\024concave-points-error\022\016symmetry-error\022\027fractal-dimension-error\022\014worst-radius\022\rworst-texture\022\017worst-perimeter\022\nworst-area\022\020worst-smoothness\022\021worst-compactness\022\017worst-concavity\022\024worst-concave-points\022\016worst-symmetry\022\027worst-fractal-dimension\022\006target*\003int*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\003int\020\377\377\377\377\377\377\377\377\377\001" 2024-06-11T18:30:22.374191799+08:00 stderr F } 2024-06-11T18:30:22.374202999+08:00 stderr F data_refs { 2024-06-11T18:30:22.374214144+08:00 stderr F uri: "breast_new2_590923962.csv" 2024-06-11T18:30:22.374225187+08:00 stderr F party: "alice" 2024-06-11T18:30:22.374328268+08:00 stderr F format: "csv" 2024-06-11T18:30:22.374349533+08:00 stderr F } 2024-06-11T18:30:22.374361031+08:00 stderr F } 2024-06-11T18:30:22.374372016+08:00 stderr F inputs { 2024-06-11T18:30:22.374383092+08:00 stderr F name: "breast_new1" 2024-06-11T18:30:22.374394424+08:00 stderr F type: "sf.table.individual" 2024-06-11T18:30:22.374405627+08:00 stderr F meta { 2024-06-11T18:30:22.374417065+08:00 stderr F type_url: "type.googleapis.com/secretflow.spec.v1.IndividualTable" 2024-06-11T18:30:22.374478427+08:00 stderr F value: 
"\n\344\002\022\003id2\022\013mean-radius\022\014mean-texture\022\016mean-perimeter\022\tmean-area\022\017mean-smoothness\022\020mean-compactness\022\016mean-concavity\022\023mean-concave-points\022\rmean-symmetry\022\026mean-fractal-dimension\022\014radius-error\022\rtexture-error\022\017perimeter-error\022\narea-error\022\020smoothness-error*\003int*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float*\005float\020\377\377\377\377\377\377\377\377\377\001" 2024-06-11T18:30:22.374510683+08:00 stderr F } 2024-06-11T18:30:22.374522973+08:00 stderr F data_refs { 2024-06-11T18:30:22.374534471+08:00 stderr F uri: "breast_new1_1450367590.csv" 2024-06-11T18:30:22.374814428+08:00 stderr F party: "bob" 2024-06-11T18:30:22.374832389+08:00 stderr F format: "csv" 2024-06-11T18:30:22.374844124+08:00 stderr F } 2024-06-11T18:30:22.374855364+08:00 stderr F } 2024-06-11T18:30:22.37486733+08:00 stderr F output_uris: "mgey-jjhodnxr-node-35-output-0" 2024-06-11T18:30:22.374879055+08:00 stderr F checkpoint_uri: "ckmgey-jjhodnxr-node-35-output-0" 2024-06-11T18:30:22.374890605+08:00 stderr F failed, error 2024-06-11T18:30:22.375034672+08:00 stderr F 2024-06-11 10:30:22.373 INFO api.py:342 [alice] -- [Anonymous_job] Shutdowning rayfed intendedly... 2024-06-11T18:30:22.37504974+08:00 stderr F 2024-06-11 10:30:22.373 INFO api.py:356 [alice] -- [Anonymous_job] No wait for data sending. 2024-06-11T18:30:22.376752291+08:00 stderr F 2024-06-11 10:30:22.376 INFO message_queue.py:72 [alice] -- [Anonymous_job] Notify message polling thread[DataSendingQueueThread] to exit. 2024-06-11T18:30:22.376974735+08:00 stderr F 2024-06-11 10:30:22.376 INFO message_queue.py:72 [alice] -- [Anonymous_job] Notify message polling thread[ErrorSendingQueueThread] to exit. 2024-06-11T18:30:22.3770069+08:00 stderr F 2024-06-11 10:30:22.376 INFO api.py:384 [alice] -- [Anonymous_job] Shutdowned rayfed. 
2024-06-11T18:30:22.384107368+08:00 stderr F 2024-06-11 10:30:22.382 WARNING cleanup.py:154 [alice] -- [Anonymous_job] Failed to send ObjectRef(359ec6ce30d3ca2d29217c23c8b16b38f62aba790100000001000000) to bob, error: ray::SenderReceiverProxyActor.send() (pid=660, ip=mgey-jjhodnxr-node-35-0-global.alice.svc, actor_id=29217c23c8b16b38f62aba7901000000, repr=) 2024-06-11T18:30:22.384150637+08:00 stderr F At least one of the input arguments for this task could not be computed: 2024-06-11T18:30:22.384166213+08:00 stderr F ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task. 2024-06-11T18:30:22.384180557+08:00 stderr F class_name: SPURuntime 2024-06-11T18:30:22.384194198+08:00 stderr F actor_id: 6b8ff9d344e3fb1584fa53a701000000 2024-06-11T18:30:22.384207118+08:00 stderr F pid: 728 2024-06-11T18:30:22.384220075+08:00 stderr F namespace: 2d854d99-9d0a-4226-8f15-0b81472f9b80 2024-06-11T18:30:22.384232815+08:00 stderr F ip: mgey-jjhodnxr-node-35-0-global.alice.svc 2024-06-11T18:30:22.384246624+08:00 stderr F The actor is dead because its worker process has died. Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors.,upstream_seq_id: 10#0, downstream_seq_id: 12. 2024-06-11T18:30:22.384757133+08:00 stderr F 2024-06-11 10:30:22.383 INFO cleanup.py:161 [alice] -- [Anonymous_job] Sending error The actor died unexpectedly before finishing this task. 
2024-06-11T18:30:22.384779701+08:00 stderr F class_name: SPURuntime 2024-06-11T18:30:22.384794497+08:00 stderr F actor_id: 6b8ff9d344e3fb1584fa53a701000000 2024-06-11T18:30:22.384824194+08:00 stderr F pid: 728 2024-06-11T18:30:22.384838268+08:00 stderr F namespace: 2d854d99-9d0a-4226-8f15-0b81472f9b80 2024-06-11T18:30:22.384857401+08:00 stderr F ip: mgey-jjhodnxr-node-35-0-global.alice.svc 2024-06-11T18:30:22.384880026+08:00 stderr F The actor is dead because its worker process has died. Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors. to bob. 2024-06-11T18:30:22.389288694+08:00 stderr F Exception in thread DataSendingQueueThread: 2024-06-11T18:30:22.38932184+08:00 stderr F Traceback (most recent call last): 2024-06-11T18:30:22.389335686+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/fed/cleanup.py", line 152, in _process_data_sending_task_return 2024-06-11T18:30:22.389966263+08:00 stderr F res = ray.get(obj_ref) 2024-06-11T18:30:22.389990974+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper 2024-06-11T18:30:22.390478777+08:00 stderr F return fn(*args, **kwargs) 2024-06-11T18:30:22.390500668+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper 2024-06-11T18:30:22.390836932+08:00 stderr F return func(*args, **kwargs) 2024-06-11T18:30:22.390855982+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/ray/_private/worker.py", line 2624, in get 2024-06-11T18:30:22.392126042+08:00 stderr F raise value.as_instanceof_cause() 2024-06-11T18:30:22.392492683+08:00 stderr F ray.exceptions.RayTaskError(RayActorError): 
ray::SenderReceiverProxyActor.send() (pid=660, ip=mgey-jjhodnxr-node-35-0-global.alice.svc, actor_id=29217c23c8b16b38f62aba7901000000, repr=) 2024-06-11T18:30:22.392521924+08:00 stderr F At least one of the input arguments for this task could not be computed: 2024-06-11T18:30:22.39253773+08:00 stderr F ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task. 2024-06-11T18:30:22.392551768+08:00 stderr F class_name: SPURuntime 2024-06-11T18:30:22.392564953+08:00 stderr F actor_id: 6b8ff9d344e3fb1584fa53a701000000 2024-06-11T18:30:22.392578086+08:00 stderr F pid: 728 2024-06-11T18:30:22.392591034+08:00 stderr F namespace: 2d854d99-9d0a-4226-8f15-0b81472f9b80 2024-06-11T18:30:22.392603726+08:00 stderr F ip: mgey-jjhodnxr-node-35-0-global.alice.svc 2024-06-11T18:30:22.392617324+08:00 stderr F The actor is dead because its worker process has died. Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors. 
2024-06-11T18:30:22.392630099+08:00 stderr F 2024-06-11T18:30:22.392643053+08:00 stderr F During handling of the above exception, another exception occurred: 2024-06-11T18:30:22.392655356+08:00 stderr F 2024-06-11T18:30:22.392667903+08:00 stderr F Traceback (most recent call last): 2024-06-11T18:30:22.392680596+08:00 stderr F File "/usr/local/lib/python3.10/threading.py", line 1016, in _bootstrap_inner 2024-06-11T18:30:22.39297746+08:00 stderr F self.run() 2024-06-11T18:30:22.392999276+08:00 stderr F File "/usr/local/lib/python3.10/threading.py", line 953, in run 2024-06-11T18:30:22.39341552+08:00 stderr F self._target(*self._args, **self._kwargs) 2024-06-11T18:30:22.393433408+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/fed/_private/message_queue.py", line 51, in _loop 2024-06-11T18:30:22.393591242+08:00 stderr F res = self._msg_handler(message) 2024-06-11T18:30:22.39360833+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/fed/cleanup.py", line 47, in 2024-06-11T18:30:22.393729328+08:00 stderr F lambda msg: self._process_data_sending_task_return(msg), 2024-06-11T18:30:22.393745566+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/fed/cleanup.py", line 166, in _process_data_sending_task_return 2024-06-11T18:30:22.394030904+08:00 stderr F send( 2024-06-11T18:30:22.394048257+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/fed/proxy/barriers.py", line 502, in send 2024-06-11T18:30:22.394313962+08:00 stderr F get_global_context().get_cleanup_manager().push_to_sending( 2024-06-11T18:30:22.394848841+08:00 stderr F AttributeError: 'NoneType' object has no attribute 'get_cleanup_manager' 2024-06-11T18:30:23.027373223+08:00 stderr F Traceback (most recent call last): 2024-06-11T18:30:23.027412424+08:00 stderr F File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main 2024-06-11T18:30:23.028257897+08:00 stderr F return _run_code(code, main_globals, None, 2024-06-11T18:30:23.028292061+08:00 stderr F File 
"/usr/local/lib/python3.10/runpy.py", line 86, in _run_code 2024-06-11T18:30:23.028710432+08:00 stderr F exec(code, run_globals) 2024-06-11T18:30:23.028732686+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/secretflow/kuscia/entry.py", line 547, in 2024-06-11T18:30:23.029492636+08:00 stderr F main() 2024-06-11T18:30:23.029516684+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1157, in __call__ 2024-06-11T18:30:23.030401661+08:00 stderr F return self.main(*args, **kwargs) 2024-06-11T18:30:23.030426092+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1078, in main 2024-06-11T18:30:23.031112705+08:00 stderr F rv = self.invoke(ctx) 2024-06-11T18:30:23.031163446+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1434, in invoke 2024-06-11T18:30:23.03206022+08:00 stderr F return ctx.invoke(self.callback, **ctx.params) 2024-06-11T18:30:23.032131042+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/click/core.py", line 783, in invoke 2024-06-11T18:30:23.034130743+08:00 stderr F return __callback(*args, **kwargs) 2024-06-11T18:30:23.034152994+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/secretflow/kuscia/entry.py", line 527, in main 2024-06-11T18:30:23.034167818+08:00 stderr F res = comp_eval(sf_node_eval_param, storage_config, sf_cluster_config) 2024-06-11T18:30:23.034183254+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/secretflow/component/entry.py", line 166, in comp_eval 2024-06-11T18:30:23.034197909+08:00 stderr F res = comp.eval( 2024-06-11T18:30:23.034211892+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/secretflow/component/component.py", line 1131, in eval 2024-06-11T18:30:23.034225956+08:00 stderr F raise e from None 2024-06-11T18:30:23.034239893+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/secretflow/component/component.py", line 1126, in eval 2024-06-11T18:30:23.035161218+08:00 
stderr F ret = self.__eval_callback(ctx=ctx, **kwargs) 2024-06-11T18:30:23.035203472+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/secretflow/component/preprocessing/data_prep/psi.py", line 371, in two_party_balanced_psi_eval_fn 2024-06-11T18:30:23.035813893+08:00 stderr F report = spu.psi( 2024-06-11T18:30:23.035861924+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/secretflow/device/device/spu.py", line 2097, in psi 2024-06-11T18:30:23.037136402+08:00 stderr F return dispatch( 2024-06-11T18:30:23.0371656+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/secretflow/device/device/register.py", line 111, in dispatch 2024-06-11T18:30:23.037611992+08:00 stderr F return _registrar.dispatch(self.device_type, name, self, *args, **kwargs) 2024-06-11T18:30:23.037657786+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/secretflow/device/device/register.py", line 80, in dispatch 2024-06-11T18:30:23.038128176+08:00 stderr F return self._ops[device_type][name](*args, **kwargs) 2024-06-11T18:30:23.038152447+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/secretflow/device/kernels/spu.py", line 615, in psi 2024-06-11T18:30:23.03889247+08:00 stderr F return sfd.get(res) 2024-06-11T18:30:23.038916451+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/secretflow/distributed/primitive.py", line 156, in get 2024-06-11T18:30:23.03952265+08:00 stderr F return fed.get(object_refs) 2024-06-11T18:30:23.04026802+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/fed/api.py", line 621, in get 2024-06-11T18:30:23.040323982+08:00 stderr F values = ray.get(ray_refs) 2024-06-11T18:30:23.040341076+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper 2024-06-11T18:30:23.040490836+08:00 stderr F return fn(*args, **kwargs) 2024-06-11T18:30:23.040540665+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", 
line 103, in wrapper 2024-06-11T18:30:23.041472187+08:00 stderr F return func(*args, **kwargs) 2024-06-11T18:30:23.041498258+08:00 stderr F File "/usr/local/lib/python3.10/site-packages/ray/_private/worker.py", line 2626, in get 2024-06-11T18:30:23.042555552+08:00 stderr F raise value 2024-06-11T18:30:23.042608214+08:00 stderr F ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task. 2024-06-11T18:30:23.042626101+08:00 stderr F class_name: SPURuntime 2024-06-11T18:30:23.042640465+08:00 stderr F actor_id: 6b8ff9d344e3fb1584fa53a701000000 2024-06-11T18:30:23.042654505+08:00 stderr F pid: 728 2024-06-11T18:30:23.042804129+08:00 stderr F namespace: 2d854d99-9d0a-4226-8f15-0b81472f9b80 2024-06-11T18:30:23.042828124+08:00 stderr F ip: mgey-jjhodnxr-node-35-0-global.alice.svc 2024-06-11T18:30:23.042866871+08:00 stderr F The actor is dead because its worker process has died. Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors. 
2024-06-11T18:30:23.044350245+08:00 stdout F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21.391] [info] [launch.cc:119] PSI config: {"protocol_config":{"protocol":"PROTOCOL_KKRT","role":"ROLE_RECEIVER","broadcast_result":true},"input_config":{"type":"IO_TYPE_FILE_CSV","path":"/home/kuscia/var/storage/data/breast_new2_590923962.csv"},"output_config":{"type":"IO_TYPE_FILE_CSV","path":"/home/kuscia/var/storage/data/mgey-jjhodnxr-node-35-output-0"},"keys":["id"],"skip_duplicates_check":true,"left_side":"ROLE_RECEIVER"} 2024-06-11T18:30:23.044378596+08:00 stdout F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21.391] [info] [receiver.cc:37] [KkrtPsiReceiver::Init] start 2024-06-11T18:30:23.044393529+08:00 stdout F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21.391] [info] [interface.cc:78] [AbstractPsiParty::Init] start 2024-06-11T18:30:23.044407927+08:00 stdout F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21.397] [info] [interface.cc:136] [AbstractPsiParty::Init][Check csv pre-process] start 2024-06-11T18:30:23.04442234+08:00 stdout F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21.401] [info] [interface.cc:145] [AbstractPsiParty::Init][Check csv pre-process] end 2024-06-11T18:30:23.044464964+08:00 stdout F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21.407] [info] [interface.cc:183] [AbstractPsiParty::Init] end 2024-06-11T18:30:23.044480279+08:00 stdout F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21.407] [info] [receiver.cc:42] [KkrtPsiReceiver::Init] end 2024-06-11T18:30:23.044494139+08:00 stdout F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21.407] [info] [receiver.cc:47] [KkrtPsiReceiver::PreProcess] start 2024-06-11T18:30:23.044509146+08:00 stdout F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21.408] [info] [bucket_psi.cc:514] psi protocol=2, rank=0 
item_size=569 2024-06-11T18:30:23.04452343+08:00 stdout F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21.408] [info] [bucket_psi.cc:514] psi protocol=2, rank=1 item_size=569 2024-06-11T18:30:23.044538246+08:00 stdout F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21.412] [info] [arrow_csv_batch_provider.cc:75] Reach the end of csv file /home/kuscia/var/storage/data/breast_new2_590923962.csv. 2024-06-11T18:30:23.044552107+08:00 stdout F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21.412] [info] [arrow_csv_batch_provider.cc:75] Reach the end of csv file /home/kuscia/var/storage/data/breast_new2_590923962.csv. 2024-06-11T18:30:23.044567371+08:00 stdout F (raylet) A worker died or was killed while executing a task by an unexpected system error. To troubleshoot the problem, check the logs for the dead worker. RayTask ID: ffffffffffffffff6b8ff9d344e3fb1584fa53a701000000 Worker ID: b81e69dc5c23a765d1f5cc5d06e2c59a9cedc33be29ebb1466e66f27 Node ID: c62d983dc71ae67ccb66a649c2e65e1f30b1c3196521987b6ce325b6 Worker IP address: mgey-jjhodnxr-node-35-0-global.alice.svc Worker port: 10014 Worker PID: 728 Worker exit type: SYSTEM_ERROR Worker exit detail: Worker unexpectedly exits with a connection error code 2. End of file. There are some potential root causes. (1) The process is killed by SIGKILL by OOM killer due to high memory usage. (2) ray stop --force is called. (3) The worker is crashed unexpectedly due to SIGSEGV or other unexpected errors. ```
lq0404510 commented 3 months ago

Hi @gxcuit, could you desensitize the data files and post them here?

gxcuit commented 3 months ago

> Hi @gxcuit, could you desensitize the data files and post them here?

Hi @lq0404510, thanks for your reply.

breast_new1.csv breast_new2.csv

Could this be because my machine is too old?

Host: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz; hypervisor: VMware ESXi, 6.7.0, 8169922
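Since the stack trace reports SIGILL inside `yacl::AvxTranspose128()`, one thing worth checking is which SIMD extensions the CPU exposes inside the VM or container. Below is a minimal sketch, assuming a Linux guest where `/proc/cpuinfo` is readable. The helper names `parse_cpu_flags` and `missing_flags` and the list of flags checked are illustrative only; which extension the yacl build actually requires is not confirmed here.

```python
from pathlib import Path

# Flags to look for. This list is an assumption for illustration,
# not the confirmed build requirement of yacl or psi.
WANTED = ("sse4_1", "aes", "pclmulqdq", "avx", "avx2")


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # The first "flags" entry is enough; all cores normally report the same set.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(flags: set[str], wanted=WANTED) -> list[str]:
    """List the wanted flags that the CPU does not advertise."""
    return [f for f in wanted if f not in flags]


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        flags = parse_cpu_flags(path.read_text())
        print("missing:", missing_flags(flags) or "none")
    else:
        print("/proc/cpuinfo not found; this check only works on Linux")
```

Running this both on the host and inside the container may help, because a hypervisor can hide flags that the physical CPU supports.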

2024-06-11T18:30:21.439034746+08:00 stderr F (raylet) [2024-06-11 10:30:17,901 I 728 728] logging.cc:230: Set ray log level from environment variable RAY_BACKEND_LOG_LEVEL to -1
2024-06-11T18:30:21.553161584+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) *** SIGILL received at time=1718101821 on cpu 3 ***
2024-06-11T18:30:21.553204936+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) PC: @     0x7f2373fdfa32  (unknown)  yacl::AvxTranspose128()
2024-06-11T18:30:21.553220273+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728)     @     0x7f25dc012ce0  (unknown)  (unknown)
2024-06-11T18:30:21.553266017+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728)     @     0x7f2373f399ad       9536  yacl::crypto::IknpOtExtSend()
2024-06-11T18:30:21.666765888+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728)     @     0x7f2373f34029        464  psi::kkrt::GetKkrtOtReceiverOptions()
2024-06-11T18:30:21.666827943+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728)     @     0x7f2373d60ab0       1536  psi::RunPsi()
2024-06-11T18:30:21.666851651+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728)     @     0x7f2373d56205        384  psi::BindLibs()::{lambda()#3}::operator()()
2024-06-11T18:30:21.666873831+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728)     @     0x7f2373d56483        176  pybind11::cpp_function::initialize<>()::{lambda()#3}::_FUN()
2024-06-11T18:30:21.666925956+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728)     @     0x7f2373d3843d        736  pybind11::cpp_function::dispatcher()
2024-06-11T18:30:21.66694947+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728)     @           0x4fc697  (unknown)  cfunction_call
2024-06-11T18:30:21.666971214+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728)     @ ... and at least 1 more frames
2024-06-11T18:30:21.666991882+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,644 E 728 728] logging.cc:361: *** SIGILL received at time=1718101821 on cpu 3 ***
2024-06-11T18:30:21.667012095+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,644 E 728 728] logging.cc:361: PC: @     0x7f2373fdfa32  (unknown)  yacl::AvxTranspose128()
2024-06-11T18:30:21.667052926+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,644 E 728 728] logging.cc:361:     @     0x7f25dc012ce0  (unknown)  (unknown)
2024-06-11T18:30:21.667076384+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373f399ad       9536  yacl::crypto::IknpOtExtSend()
2024-06-11T18:30:21.66709696+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373f34029        464  psi::kkrt::GetKkrtOtReceiverOptions()
2024-06-11T18:30:21.667151046+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373d60ab0       1536  psi::RunPsi()
2024-06-11T18:30:21.667214121+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373d56205        384  psi::BindLibs()::{lambda()#3}::operator()()
2024-06-11T18:30:21.667341457+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373d56483        176  pybind11::cpp_function::initialize<>()::{lambda()#3}::_FUN()
2024-06-11T18:30:21.667410986+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @     0x7f2373d3843d        736  pybind11::cpp_function::dispatcher()
2024-06-11T18:30:21.66743628+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @           0x4fc697  (unknown)  cfunction_call
2024-06-11T18:30:21.667457034+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) [2024-06-11 10:30:21,645 E 728 728] logging.cc:361:     @ ... and at least 1 more frames
2024-06-11T18:30:21.667478061+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) Fatal Python error: Illegal instruction
2024-06-11T18:30:21.667499926+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728)
2024-06-11T18:30:21.667520369+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m Stack (most recent call first):
2024-06-11T18:30:21.667573281+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m   File "/usr/local/lib/python3.10/site-packages/spu/psi.py", line 118 in psi
2024-06-11T18:30:21.667596458+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m   File "/usr/local/lib/python3.10/site-packages/secretflow/device/device/spu.py", line 1379 in psi
2024-06-11T18:30:21.667616909+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m   File "/usr/local/lib/python3.10/site-packages/ray/util/tracing/tracing_helper.py", line 467 in _resume_span
2024-06-11T18:30:21.667636415+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m   File "/usr/local/lib/python3.10/site-packages/ray/_private/function_manager.py", line 726 in actor_method_executor
2024-06-11T18:30:21.667676736+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m   File "/usr/local/lib/python3.10/site-packages/ray/_private/worker.py", line 847 in main_loop
2024-06-11T18:30:21.667698507+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m   File "/usr/local/lib/python3.10/site-packages/ray/_private/workers/default_worker.py", line 282 in <module>
2024-06-11T18:30:21.667746345+08:00 stderr F �[36m(SPURuntime(device_id=None, party=alice) pid=728)�[0m 
2024-06-11T18:30:21.668545347+08:00 stderr F (SPURuntime(device_id=None, party=alice) pid=728) Extension modules: msgpack._cmsgpack, google._upb._message, psutil._psutil_linux, psutil._psutil_posix, setproctitle, yaml._yaml, charset_normalizer.md, requests.packages.charset_normalizer.md, requests.packages.chardet.md, ray._raylet, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, jaxlib.cpu_feature_guard, grpc._cython.cygrpc, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.tslib, pandas._libs.lib, pandas._libs.hashing, pyarrow.lib, pyarrow._hdfsio, pandas._libs.ops, pyarrow._compute, pandas._libs.arrays, pandas._libs.index, pandas._libs.join, pandas._libs.sparse, pandas._libs.reduction, pandas._libs.indexing, pandas._libs.internals, pandas._libs.writers, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.tslibs.strptime, pandas._libs.groupby, pandas._libs.testing, pandas._libs.parsers, pandas._libs.json, scipy._lib._ccallback_c, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, 
scipy.linalg._cythonized_array_utils, scipy.linalg._solve_toeplitz, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.linalg._propack._spropack, scipy.sparse.linalg._propack._dpropack, scipy.sparse.linalg._propack._cpropack, scipy.sparse.linalg._propack._zpropack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, sklearn.__check_build._check_build, scipy.special._ufuncs_cxx, scipy.special._cdflib, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.spatial.transform._rotation, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.optimize._direct, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.special.cython_special, scipy.stats._stats, scipy.stats.beta_ufunc, scipy.stats._boost.beta_ufunc, scipy.stats.binom_ufunc, scipy.stats._boost.binom_ufunc, 
scipy.stats.nbinom_ufunc, scipy.stats._boost.nbinom_ufunc, scipy.stats.hypergeom_ufunc, scipy.stats._boost.hypergeom_ufunc, scipy.stats.ncf_ufunc, scipy.stats._boost.ncf_ufunc, scipy.stats.ncx2_ufunc, scipy.stats._boost.ncx2_ufunc, scipy.stats.nct_ufunc, scipy.stats._boost.nct_ufunc, scipy.stats.skewnorm_ufunc, scipy.stats._boost.skewnorm_ufunc, scipy.stats.invgauss_ufunc, scipy.stats._boost.invgauss_ufunc, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy.stats._stats_pythran, scipy._lib._uarray._uarray, scipy.stats._ansari_swilk_statistics, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._mvn, scipy.stats._rcont.rcont, scipy.stats._unuran.unuran_wrapper, sklearn.utils._isfinite, sklearn.utils.murmurhash, sklearn.utils._openmp_helpers, sklearn.utils._logistic_sigmoid, sklearn.utils.sparsefuncs_fast, sklearn.preprocessing._csr_polynomial_expansion, sklearn.preprocessing._target_encoder_fast, pyarrow._json (total: 182)
2024-06-11T18:30:22.373640364+08:00 stderr F 2024-06-11 10:30:22.373 ERROR component.py:1129 [alice] -- [Anonymous_job] eval on domain: "data_prep"
```

lq0404510 commented 3 months ago

@gxcuit Could you run this command and share the CPU information: `cat /proc/cpuinfo | grep avx`
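Note that `grep avx` matches any flag containing that substring, so it does not by itself distinguish `avx` from `avx2`. As a rough sketch (the helper names `cpu_flags` and `report` are made up for illustration and are not part of SecretFlow or Kuscia), the flags can be checked individually like this:

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the union of all entries found on 'flags' lines of /proc/cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def report(cpuinfo_text, wanted=("avx", "avx2")):
    """Map each wanted flag name to True/False using exact matching."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only
    if path.exists():
        print(report(path.read_text()))
```

Running it inside the container and on the host makes it easy to compare what each environment exposes.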

gxcuit commented 3 months ago

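> @gxcuit Could you run this command and share the CPU information: `cat /proc/cpuinfo | grep avx`

Hi, @lq0404510 Thanks for your reply.

I ran it, and the CPU does report `avx`. However, when I used the FourQ curve earlier, it failed with `FourQ requires AVX2 instruction`; switching to a different curve fixed that.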

cat /proc/cpuinfo | grep avx
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx hypervisor lahf_lm pti ibrs ibpb stibp tsc_adjust arat arch_capabilities
(the identical `flags` line is printed once per CPU core)

lq0404510 commented 3 months ago

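Is the server running Docker a virtual machine created on Windows? If it is a VM, first check whether the physical host supports avx and avx2. If it does, enable CPU virtualization in the VM settings and then restart the VM.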

gxcuit commented 3 months ago

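> Is the server running Docker a virtual machine created on Windows? If it is a VM, first check whether the physical host supports avx and avx2. If it does, enable CPU virtualization in the VM settings and then restart the VM.

Hi, the Docker server is a Linux VM running on ESXi. CPU: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz. Hypervisor: VMware ESXi, 6.7.0, 8169922.

The VM has avx but not avx2.

I will try another physical machine.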

lq0404510 commented 3 months ago

Does it run normally now that you have switched to a physical machine?

gxcuit commented 3 months ago

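> Does it run normally now that you have switched to a physical machine?

Sorry, I haven't had a chance to try it yet. I will report back here once I have. Thanks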

gxcuit commented 3 months ago

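> Does it run normally now that you have switched to a physical machine?

Hi, @lq0404510

After switching to a physical machine the problem is gone. It is still a bit odd, though, because the previous machine did support avx.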
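For reference, the stderr log in the original report shows the worker being killed by `SIGILL` with the program counter in `yacl::AvxTranspose128()`, which points to an unsupported CPU instruction rather than memory exhaustion. Whether that routine actually needs AVX2 is not confirmed in this thread; it is only consistent with the VM exposing `avx` but not `avx2`. A small sketch for pulling the signal and faulting symbol out of such a log (the regexes and the `crash_summary` name are assumptions based only on the line format pasted above):

```python
import re

# Patterns derived from the log lines quoted in this issue, not from any documented format.
SIGNAL_RE = re.compile(r"\*\*\* (SIG[A-Z]+) received")
PC_RE = re.compile(r"PC: @\s+0x[0-9a-f]+\s+\S+\s+(\S.*)$")


def crash_summary(log_text):
    """Return (signal_name, faulting_symbol); either may be None if not found."""
    signal = None
    symbol = None
    for line in log_text.splitlines():
        if signal is None:
            match = SIGNAL_RE.search(line)
            if match:
                signal = match.group(1)
        if symbol is None:
            match = PC_RE.search(line)
            if match:
                symbol = match.group(1).strip()
    return signal, symbol
```

A result of `SIGILL` suggests comparing the CPU flags of the failing and working machines before tuning memory limits.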

lq0404510 commented 3 months ago

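Thanks for your reply. Could you share the CPU information of the physical machine you switched to (`cat /proc/cpuinfo | grep avx`)? We are also looking into your question and will update you here if we find anything.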