secretflow / secretpad

SecretPad is a privacy-preserving computing web platform based on the Kuscia framework, designed to provide easy access to privacy-preserving data intelligence and machine learning functions.
https://www.secretflow.org.cn
Apache License 2.0

Problems encountered while deploying and running SecretPad #130

Open · Meng-xiangkun opened this issue 3 weeks ago

Meng-xiangkun commented 3 weeks ago

Issue Type

Running

Have you searched for existing documents and issues?

Yes

OS Platform and Distribution

Linux centos7

All_in_one Version

Kuscia Version

0.10.0b0

What happened and what you expected to happen.

I built the SecretPad image from source. Kuscia 0.10.0b0 is already deployed, with a master node plus Alice and Bob nodes. I ran into some problems when starting SecretPad.

Log output.

                        _                   _
                       | |                 | |      secretpad  https://www.secretflow.org.cn/
 ___  ___  ___ _ __ ___| |_ _ __   __ _  __| |      Running in ALL-IN-ONE mode, CENTER function modules
/ __|/ _ \/ __| '__/ _ \ __| '_ \ / _` |/ _` |      Port: 8080
\__ \  __/ (__| | |  __/ |_| |_) | (_| | (_| |      Pid: 1
|___/\___|\___|_|  \___|\__| .__/ \__,_|\__,_|      Console: http://127.0.0.1:8080/
                           | |
                           |_|

secretpad  version: 0.5.0b0
kuscia     version: 0.6.0b0
secretflow version: 1.4.0b0
2024-08-30T11:02:11.736+08:00  INFO 1 --- [           main] o.s.secretpad.web.SecretPadApplication   : Starting SecretPadApplication v0.0.1-SNAPSHOT using Java 17.0.11 with PID 1 (/app/secretpad.jar started by root in /app)
2024-08-30T11:02:11.740+08:00  INFO 1 --- [           main] o.s.secretpad.web.SecretPadApplication   : The following 1 profile is active: "dev"
2024-08-30T11:02:13.143+08:00  INFO 1 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data JPA repositories in DEFAULT mode.
2024-08-30T11:02:13.516+08:00  INFO 1 --- [           main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 364 ms. Found 38 JPA repository interfaces.
2024-08-30T11:02:15.763+08:00  INFO 1 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat initialized with port(s): 443 (https) 8080 (http) 9001 (http)
2024-08-30T11:02:15.881+08:00  INFO 1 --- [           main] w.s.c.ServletWebServerApplicationContext : Root WebApplicationContext: initialization completed in 4036 ms
2024-08-30T11:02:18.844+08:00  INFO 1 --- [           main] o.s.o.j.p.SpringPersistenceUnitInfo      : No LoadTimeWeaver setup: ignoring JPA class transformer
2024-08-30T11:02:21.442+08:00  INFO 1 --- [           main] j.LocalContainerEntityManagerFactoryBean : Initialized JPA EntityManagerFactory for persistence unit 'default'
2024-08-30T11:02:21.904+08:00  INFO 1 --- [           main] o.s.d.j.r.query.QueryEnhancerFactory     : Hibernate is in classpath; If applicable, HQL parser will be used.
2024-08-30T11:02:23.288+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Init kuscia node, config=KusciaGrpcConfig(domainId=kuscia-system, host=10.233.74.6, port=8083, protocol=NOTLS, mode=MASTER, token=config/certs/token, certFile=config/certs/client.crt, keyFile=config/certs/client.pem)
2024-08-30T11:02:23.337+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Register kuscia node success, config=KusciaGrpcConfig(domainId=kuscia-system, host=10.233.74.6, port=8083, protocol=NOTLS, mode=MASTER, token=config/certs/token, certFile=config/certs/client.crt, keyFile=config/certs/client.pem)
2024-08-30T11:02:23.351+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Init kuscia node success, CHANNEL_FACTORIES={kuscia-system=org.secretflow.secretpad.kuscia.v1alpha1.factory.impl.GrpcKusciaApiChannelFactory@1b557402}
2024-08-30T11:02:23.351+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Init kuscia node, config=KusciaGrpcConfig(domainId=alice, host=10.233.74.238, port=8083, protocol=NOTLS, mode=LITE, token=config/certs/alice/token, certFile=config/certs/alice/client.crt, keyFile=config/certs/alice/client.pem)
2024-08-30T11:02:23.356+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Register kuscia node success, config=KusciaGrpcConfig(domainId=alice, host=10.233.74.238, port=8083, protocol=NOTLS, mode=LITE, token=config/certs/alice/token, certFile=config/certs/alice/client.crt, keyFile=config/certs/alice/client.pem)
2024-08-30T11:02:23.361+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Init kuscia node success, CHANNEL_FACTORIES={kuscia-system=org.secretflow.secretpad.kuscia.v1alpha1.factory.impl.GrpcKusciaApiChannelFactory@1b557402, alice=org.secretflow.secretpad.kuscia.v1alpha1.factory.impl.GrpcKusciaApiChannelFactory@41cfcbb5}
2024-08-30T11:02:23.361+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Init kuscia node, config=KusciaGrpcConfig(domainId=bob, host=10.233.74.149, port=8083, protocol=NOTLS, mode=LITE, token=config/certs/bob/token, certFile=config/certs/bob/client.crt, keyFile=config/certs/bob/client.pem)
2024-08-30T11:02:23.366+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Register kuscia node success, config=KusciaGrpcConfig(domainId=bob, host=10.233.74.149, port=8083, protocol=NOTLS, mode=LITE, token=config/certs/bob/token, certFile=config/certs/bob/client.crt, keyFile=config/certs/bob/client.pem)
2024-08-30T11:02:23.370+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : Init kuscia node success, CHANNEL_FACTORIES={bob=org.secretflow.secretpad.kuscia.v1alpha1.factory.impl.GrpcKusciaApiChannelFactory@7f9083b4, kuscia-system=org.secretflow.secretpad.kuscia.v1alpha1.factory.impl.GrpcKusciaApiChannelFactory@1b557402, alice=org.secretflow.secretpad.kuscia.v1alpha1.factory.impl.GrpcKusciaApiChannelFactory@41cfcbb5}
2024-08-30T11:02:28.802+08:00  INFO 1 --- [           main] o.s.s.p.c.PersistenceConfiguration       : making sure database is WAL mode
2024-08-30T11:02:28.853+08:00  WARN 1 --- [           main] o.s.s.s.factory.CloudLogServiceFactory   : cloud service configuration is not available,please check your configuration,like ak,sk,host
2024-08-30T11:02:29.198+08:00  INFO 1 --- [           main] o.s.b.a.w.s.WelcomePageHandlerMapping    : Adding welcome page template: index
2024-08-30T11:02:29.866+08:00  INFO 1 --- [           main] o.s.b.a.e.web.EndpointLinksResolver      : Exposing 0 endpoint(s) beneath base path '/actuator'
2024-08-30T11:02:30.117+08:00  INFO 1 --- [           main] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port(s): 443 (https) 8080 (http) 9001 (http) with context path ''
2024-08-30T11:02:30.141+08:00  INFO 1 --- [           main] o.s.secretpad.web.SecretPadApplication   : Started SecretPadApplication in 19.666 seconds (process running for 21.177)
2024-08-30T11:02:30.334+08:00  INFO 1 --- [   scheduling-2] o.s.s.k.v.DynamicKusciaChannelProvider   : session UserContextDTO(token=null, name=null, platformType=null, platformNodeId=null, ownerType=null, ownerId=kuscia-system, projectIds=null, apiResources=null, virtualUserForNode=false, deployMode=null)
2024-08-30T11:02:30.425+08:00  INFO 1 --- [           main] o.s.s.web.init.DynamicBeanRegisterInit   : all mvc mapping [{POST [/api/v1alpha1/project/datatable/get]}, {POST [/api/v1alpha1/message/pending], consumes [application/json]}, {GET [/swagger-ui.html]}, {POST [/api/v1alpha1/model/status], consumes [application/json], produces [application/json]}, {POST [/api/v1alpha1/project/datasource/list], consumes [application/json]}, {POST [/api/v1alpha1/project/tee/list]}, {POST [/api/v1alpha1/graph/node/output]}, {POST [/api/v1alpha1/model/discard], consumes [application/json]}, {POST [/api/v1alpha1/model/detail], consumes [application/json]}, {POST [/api/v1alpha1/datatable/pushToTee], consumes [application/json]}, {POST [/api/v1alpha1/project/job/get]}, {POST [/api/v1alpha1/model/serving/create], consumes [application/json]}, {POST [/api/v1alpha1/p2p/node/create], consumes [application/json]}, {POST [/api/v1alpha1/user/node/resetPassword], consumes [application/json]}, {POST [/api/v1alpha1/node/token], consumes [application/json]}, {POST [/api/v1alpha1/graph/detail]}, {GET [/edge || / || /model-submission/** || /message/** || /my-node/** || /logout/** || /record/** || /guide/** || /login/** || /edge/** || /home/** || /node/** || /dag/**]}, {POST [/api/v1alpha1/model/serving/delete], consumes [application/json]}, {POST [/api/v1alpha1/p2p/node/delete], consumes [application/json]}, {POST [/api/v1alpha1/message/reply], consumes [application/json]}, {POST [/api/v1alpha1/graph/node/update]}, {POST [/api/v1alpha1/project/job/stop]}, {POST [/api/v1alpha1/graph/node/max_index]}, {POST [/api/v1alpha1/project/datatable/delete]}, {POST [/api/v1alpha1/user/get]}, {POST [/api/v1alpha1/project/inst/add]}, {GET [/v3/api-docs/swagger-config], produces [application/json]}, {GET [/v3/api-docs], produces [application/json]}, {POST [/api/v1alpha1/datasource/list]}, {GET [/v3/api-docs.yaml], produces [application/vnd.oai.openapi]}, {POST [/api/v1alpha1/p2p/project/create], consumes [application/json]}, {POST [/api/v1alpha1/data/upload], consumes [multipart/form-data]}, {POST [/api/v1alpha1/graph/create]}, {POST [/api/v1alpha1/project/update/tableConfig], consumes [application/json]}, {GET [/sync], produces [text/event-stream]}, {POST [/api/v1alpha1/vote_sync/create], consumes [application/json]}, {POST [/api/v1alpha1/model/delete], consumes [application/json]}, {POST [/api/v1alpha1/project/job/list]}, {POST [/api/v1alpha1/project/job/task/logs]}, {POST [/api/v1alpha1/p2p/project/update], consumes [application/json]}, {POST [/api/v1alpha1/graph/update]}, {POST [/api/v1alpha1/nodeRoute/get], consumes [application/json]}, {POST [/api/v1alpha1/graph/meta/update]}, {POST [/api/v1alpha1/graph/delete]}, {POST [/api/v1alpha1/datasource/detail]}, {POST [/api/v1alpha1/node/refresh], consumes [application/json]}, {POST [/api/v1alpha1/model/pack], consumes [application/json], produces [application/json]}, {POST [/api/v1alpha1/component/i18n]}, {POST [/api/v1alpha1/message/list], consumes [application/json]}, {POST [/api/v1alpha1/nodeRoute/refresh], consumes [application/json]}, {POST [/api/v1alpha1/p2p/project/list], consumes [application/json]}, {POST [/api/v1alpha1/project/job/task/output]}, {POST [/api/login], consumes [application/json]}, {POST [/api/v1alpha1/project/node/add]}, {POST [/api/v1alpha1/p2p/project/archive], consumes [application/json]}, {POST [/api/v1alpha1/user/updatePwd]}, {POST [/api/v1alpha1/graph/stop]}, {POST [/api/logout], consumes [application/json]}, {POST [/api/v1alpha1/message/detail], consumes 
[application/json]}, {POST [/api/v1alpha1/node/get], consumes [application/json]}, {POST [/api/v1alpha1/graph/node/status]}, {POST [/api/v1alpha1/cloud_log/sls]}, {POST [/api/v1alpha1/datasource/create]}, {POST [/api/v1alpha1/model/modelPartyPath], consumes [application/json], produces [application/json]}, {POST [/api/v1alpha1/node/result/detail], consumes [application/json]}, {POST [/api/v1alpha1/feature_datasource/auth/list], consumes [application/json]}, { [/error], produces [text/html]}, {POST [/api/v1alpha1/version/list]}, {POST [/api/v1alpha1/datasource/delete]}, {POST [/api/v1alpha1/datatable/create], consumes [application/json]}, {POST [/api/v1alpha1/project/getOutTable], consumes [application/json]}, {POST [/api/v1alpha1/data/create], consumes [application/json]}, {POST [/api/v1alpha1/node/result/list], consumes [application/json]}, {POST [/api/v1alpha1/data/sync]}, {POST [/api/v1alpha1/datatable/get], consumes [application/json]}, {POST [/api/v1alpha1/graph/node/logs]}, {POST [/api/v1alpha1/graph/list]}, {POST [/api/v1alpha1/model/page], consumes [application/json]}, {POST [/api/v1alpha1/model/info], consumes [application/json]}, {POST [/api/v1alpha1/project/get], consumes [application/json]}, {POST [/api/v1alpha1/datatable/delete], consumes [application/json]}, {POST [/api/v1alpha1/node/create], consumes [application/json]}, {POST [/api/v1alpha1/node/update], consumes [application/json]}, {POST [/api/v1alpha1/datatable/list], consumes [application/json]}, {POST [/api/v1alpha1/project/datatable/add]}, {POST [/api/v1alpha1/nodeRoute/update], consumes [application/json]}, {POST [/api/v1alpha1/component/batch]}, {POST [/api/v1alpha1/approval/create], consumes [application/json]}, {POST [/api/v1alpha1/nodeRoute/delete], consumes [application/json]}, {POST [/api/v1alpha1/node/delete], consumes [application/json]}, {POST [/api/v1alpha1/project/create], consumes [application/json]}, {POST [/api/v1alpha1/graph/start]}, {POST [/api/v1alpha1/project/update], consumes [application/json]}, {POST [/api/v1alpha1/node/page], consumes [application/json]}, {POST [/api/v1alpha1/user/remote/resetPassword], consumes [application/json]}, {POST [/api/v1alpha1/model/serving/detail], consumes [application/json]}, {POST [/api/v1alpha1/data/download]}, {POST [/api/v1alpha1/node/newToken], consumes [application/json]}, {POST [/api/v1alpha1/project/delete], consumes [application/json]}, {POST [/api/v1alpha1/nodeRoute/page], consumes [application/json]}, {POST [/api/v1alpha1/nodeRoute/listNode]}, {POST [/api/v1alpha1/approval/pull/status], consumes [application/json]}, {POST [/api/v1alpha1/project/list], consumes [application/json]}, {POST [/api/v1alpha1/node/list]}, { [/error]}, {POST [/api/v1alpha1/feature_datasource/create], consumes [application/json]}, {POST [/api/v1alpha1/component/list]}]
2024-08-30T11:02:30.427+08:00  INFO 1 --- [           main] o.s.s.web.init.DynamicBeanRegisterInit   : after unregister all mvc mapping [{POST [/api/v1alpha1/project/datatable/get]}, {POST [/api/v1alpha1/message/pending], consumes [application/json]}, {GET [/swagger-ui.html]}, {POST [/api/v1alpha1/model/status], consumes [application/json], produces [application/json]}, {POST [/api/v1alpha1/project/datasource/list], consumes [application/json]}, {POST [/api/v1alpha1/project/tee/list]}, {POST [/api/v1alpha1/graph/node/output]}, {POST [/api/v1alpha1/model/discard], consumes [application/json]}, {POST [/api/v1alpha1/model/detail], consumes [application/json]}, {POST [/api/v1alpha1/datatable/pushToTee], consumes [application/json]}, {POST [/api/v1alpha1/project/job/get]}, {POST [/api/v1alpha1/model/serving/create], consumes [application/json]}, {POST [/api/v1alpha1/p2p/node/create], consumes [application/json]}, {POST [/api/v1alpha1/user/node/resetPassword], consumes [application/json]}, {POST [/api/v1alpha1/node/token], consumes [application/json]}, {POST [/api/v1alpha1/graph/detail]}, {GET [/edge || / || /model-submission/** || /message/** || /my-node/** || /logout/** || /record/** || /guide/** || /login/** || /edge/** || /home/** || /node/** || /dag/**]}, {POST [/api/v1alpha1/model/serving/delete], consumes [application/json]}, {POST [/api/v1alpha1/p2p/node/delete], consumes [application/json]}, {POST [/api/v1alpha1/message/reply], consumes [application/json]}, {POST [/api/v1alpha1/graph/node/update]}, {POST [/api/v1alpha1/project/job/stop]}, {POST [/api/v1alpha1/graph/node/max_index]}, {POST [/api/v1alpha1/project/datatable/delete]}, {POST [/api/v1alpha1/user/get]}, {POST [/api/v1alpha1/project/inst/add]}, {GET [/v3/api-docs/swagger-config], produces [application/json]}, {GET [/v3/api-docs], produces [application/json]}, {POST [/api/v1alpha1/datasource/list]}, {GET [/v3/api-docs.yaml], produces [application/vnd.oai.openapi]}, {POST [/api/v1alpha1/p2p/project/create], consumes [application/json]}, {POST [/api/v1alpha1/data/upload], consumes [multipart/form-data]}, {POST [/api/v1alpha1/graph/create]}, {POST [/api/v1alpha1/project/update/tableConfig], consumes [application/json]}, {GET [/sync], produces [text/event-stream]}, {POST [/api/v1alpha1/vote_sync/create], consumes [application/json]}, {POST [/api/v1alpha1/model/delete], consumes [application/json]}, {POST [/api/v1alpha1/project/job/list]}, {POST [/api/v1alpha1/project/job/task/logs]}, {POST [/api/v1alpha1/p2p/project/update], consumes [application/json]}, {POST [/api/v1alpha1/graph/update]}, {POST [/api/v1alpha1/nodeRoute/get], consumes [application/json]}, {POST [/api/v1alpha1/graph/meta/update]}, {POST [/api/v1alpha1/graph/delete]}, {POST [/api/v1alpha1/datasource/detail]}, {POST [/api/v1alpha1/node/refresh], consumes [application/json]}, {POST [/api/v1alpha1/model/pack], consumes [application/json], produces [application/json]}, {POST [/api/v1alpha1/component/i18n]}, {POST [/api/v1alpha1/message/list], consumes [application/json]}, {POST [/api/v1alpha1/nodeRoute/refresh], consumes [application/json]}, {POST [/api/v1alpha1/p2p/project/list], consumes [application/json]}, {POST [/api/v1alpha1/project/job/task/output]}, {POST [/api/login], consumes [application/json]}, {POST [/api/v1alpha1/project/node/add]}, {POST [/api/v1alpha1/p2p/project/archive], consumes [application/json]}, {POST [/api/v1alpha1/user/updatePwd]}, {POST [/api/v1alpha1/graph/stop]}, {POST [/api/logout], consumes [application/json]}, {POST 
[/api/v1alpha1/message/detail], consumes [application/json]}, {POST [/api/v1alpha1/node/get], consumes [application/json]}, {POST [/api/v1alpha1/graph/node/status]}, {POST [/api/v1alpha1/cloud_log/sls]}, {POST [/api/v1alpha1/datasource/create]}, {POST [/api/v1alpha1/model/modelPartyPath], consumes [application/json], produces [application/json]}, {POST [/api/v1alpha1/node/result/detail], consumes [application/json]}, {POST [/api/v1alpha1/feature_datasource/auth/list], consumes [application/json]}, { [/error], produces [text/html]}, {POST [/api/v1alpha1/version/list]}, {POST [/api/v1alpha1/datasource/delete]}, {POST [/api/v1alpha1/datatable/create], consumes [application/json]}, {POST [/api/v1alpha1/project/getOutTable], consumes [application/json]}, {POST [/api/v1alpha1/data/create], consumes [application/json]}, {POST [/api/v1alpha1/node/result/list], consumes [application/json]}, {POST [/api/v1alpha1/data/sync]}, {POST [/api/v1alpha1/datatable/get], consumes [application/json]}, {POST [/api/v1alpha1/graph/node/logs]}, {POST [/api/v1alpha1/graph/list]}, {POST [/api/v1alpha1/model/page], consumes [application/json]}, {POST [/api/v1alpha1/model/info], consumes [application/json]}, {POST [/api/v1alpha1/project/get], consumes [application/json]}, {POST [/api/v1alpha1/datatable/delete], consumes [application/json]}, {POST [/api/v1alpha1/node/create], consumes [application/json]}, {POST [/api/v1alpha1/node/update], consumes [application/json]}, {POST [/api/v1alpha1/datatable/list], consumes [application/json]}, {POST [/api/v1alpha1/project/datatable/add]}, {POST [/api/v1alpha1/nodeRoute/update], consumes [application/json]}, {POST [/api/v1alpha1/component/batch]}, {POST [/api/v1alpha1/approval/create], consumes [application/json]}, {POST [/api/v1alpha1/nodeRoute/delete], consumes [application/json]}, {POST [/api/v1alpha1/node/delete], consumes [application/json]}, {POST [/api/v1alpha1/project/create], consumes [application/json]}, {POST [/api/v1alpha1/graph/start]}, {POST [/api/v1alpha1/project/update], consumes [application/json]}, {POST [/api/v1alpha1/node/page], consumes [application/json]}, {POST [/api/v1alpha1/user/remote/resetPassword], consumes [application/json]}, {POST [/api/v1alpha1/model/serving/detail], consumes [application/json]}, {POST [/api/v1alpha1/data/download]}, {POST [/api/v1alpha1/node/newToken], consumes [application/json]}, {POST [/api/v1alpha1/project/delete], consumes [application/json]}, {POST [/api/v1alpha1/nodeRoute/page], consumes [application/json]}, {POST [/api/v1alpha1/nodeRoute/listNode]}, {POST [/api/v1alpha1/approval/pull/status], consumes [application/json]}, {POST [/api/v1alpha1/project/list], consumes [application/json]}, {POST [/api/v1alpha1/node/list]}, { [/error]}, {POST [/api/v1alpha1/feature_datasource/create], consumes [application/json]}, {POST [/api/v1alpha1/component/list]}]
2024-08-30T11:02:30.463+08:00  INFO 1 --- [           main] o.s.secretpad.web.init.MasterRouteInit   : kuscia protocol: https://
2024-08-30T11:02:30.478+08:00  INFO 1 --- [           main] o.s.secretpad.web.init.MasterRouteInit   : update node router id: 1, srcNetAddress is:https://127.0.0.1:28080, dstNetAddress is:https://127.0.0.1:38080
2024-08-30T11:02:30.487+08:00  INFO 1 --- [           main] o.s.secretpad.web.init.MasterRouteInit   : update node router id: 2, srcNetAddress is:https://127.0.0.1:38080, dstNetAddress is:https://127.0.0.1:28080
2024-08-30T11:02:30.487+08:00  INFO 1 --- [           main] o.s.secretpad.web.init.TeeResourceInit   : init tee node ALL-IN-ONE CENTER
2024-08-30T11:02:30.503+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : session UserContextDTO(token=null, name=null, platformType=null, platformNodeId=null, ownerType=null, ownerId=kuscia-system, projectIds=null, apiResources=null, virtualUserForNode=false, deployMode=null)
2024-08-30T11:02:30.716+08:00  INFO 1 --- [   scheduling-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system  Calling method: kuscia.proto.api.v1alpha1.kusciaapi.DomainRouteService/BatchQueryDomainRouteStatus
2024-08-30T11:02:30.717+08:00  INFO 1 --- [           main] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system  Calling method: kuscia.proto.api.v1alpha1.kusciaapi.DomainService/QueryDomain
2024-08-30T11:02:30.738+08:00  INFO 1 --- [ault-executor-2] o.s.s.k.v.l.ManagedChannelStateListener  : [kuscia] kuscia-system Channel state changed from IDLE to CONNECTING
2024-08-30T11:02:30.738+08:00  INFO 1 --- [ault-executor-3] o.s.s.k.v.l.ManagedChannelStateListener  : [kuscia] kuscia-system Channel state changed from IDLE to CONNECTING
2024-08-30T11:02:30.748+08:00  INFO 1 --- [   scheduling-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Request: 
2024-08-30T11:02:30.748+08:00  INFO 1 --- [           main] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Request: domain_id: "tee"

2024-08-30T11:02:31.005+08:00  INFO 1 --- [ault-executor-0] o.s.s.k.v.l.ManagedChannelStateListener  : [kuscia] kuscia-system Channel state changed from CONNECTING to READY
2024-08-30T11:02:31.005+08:00  INFO 1 --- [ault-executor-1] o.s.s.k.v.l.ManagedChannelStateListener  : [kuscia] kuscia-system Channel state changed from CONNECTING to READY
2024-08-30T11:02:31.097+08:00  INFO 1 --- [   scheduling-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: status {
  code: 11100
  message: "DomainRoute keys can not be empty"
}

2024-08-30T11:02:31.115+08:00  INFO 1 --- [           main] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: status {
  message: "success"
}
data {
  domain_id: "tee"
  deploy_token_statuses {
    token: "kYdZaMHb8FYkCKJNgAetz9KQhvZbFzyq"
    state: "unused"
    last_transition_time: "2024-08-30T02:58:45Z"
  }
  auth_center {
    authentication_type: "Token"
    token_gen_method: "UID-RSA-GEN"
  }
}

2024-08-30T11:02:31.139+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** before EntityChangeListener.DbChangeEvent(dstNode=null, action=update, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[alice, bob], source=NodeRouteDO(srcNodeId=alice, dstNodeId=bob, routeId=1, srcNetAddress=https://127.0.0.1:28080, dstNetAddress=https://127.0.0.1:38080)) will be send to [alice, bob]
2024-08-30T11:02:31.139+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , filter [alice, bob] will be send
2024-08-30T11:02:31.140+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , start to send *** EntityChangeListener.DbChangeEvent(dstNode=null, action=update, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[alice, bob], source=NodeRouteDO(srcNodeId=alice, dstNodeId=bob, routeId=1, srcNetAddress=https://127.0.0.1:28080, dstNetAddress=https://127.0.0.1:38080)) , 4 wait to sync
2024-08-30T11:02:31.141+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** before EntityChangeListener.DbChangeEvent(dstNode=null, action=update, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[bob, alice], source=NodeRouteDO(srcNodeId=bob, dstNodeId=alice, routeId=2, srcNetAddress=https://127.0.0.1:38080, dstNetAddress=https://127.0.0.1:28080)) will be send to [bob, alice]
2024-08-30T11:02:31.141+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , filter [bob, alice] will be send
2024-08-30T11:02:31.141+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , start to send *** EntityChangeListener.DbChangeEvent(dstNode=null, action=update, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[bob, alice], source=NodeRouteDO(srcNodeId=bob, dstNodeId=alice, routeId=2, srcNetAddress=https://127.0.0.1:38080, dstNetAddress=https://127.0.0.1:28080)) , 3 wait to sync
2024-08-30T11:02:31.141+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , filter [] will be send
2024-08-30T11:02:31.142+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , start to send *** EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeDO, projectId=null, nodeIds=[tee], source=NodeDO(nodeId=tee, name=tee, auth=tee, description=tee, masterNodeId=null, controlNodeId=tee, netAddress=127.0.0.1:48080, token=null, type=embedded, mode=0)) , 2 wait to sync
2024-08-30T11:02:31.142+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** before EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[alice, tee], source=NodeRouteDO(srcNodeId=alice, dstNodeId=tee, routeId=3, srcNetAddress=127.0.0.1:28080, dstNetAddress=127.0.0.1:48080)) will be send to [alice, tee]
2024-08-30T11:02:31.142+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , filter [alice, tee] will be send
2024-08-30T11:02:31.143+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , start to send *** EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[alice, tee], source=NodeRouteDO(srcNodeId=alice, dstNodeId=tee, routeId=3, srcNetAddress=127.0.0.1:28080, dstNetAddress=127.0.0.1:48080)) , 1 wait to sync
2024-08-30T11:02:31.143+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** before EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[tee, alice], source=NodeRouteDO(srcNodeId=tee, dstNodeId=alice, routeId=4, srcNetAddress=127.0.0.1:48080, dstNetAddress=127.0.0.1:28080)) will be send to [tee, alice]
2024-08-30T11:02:31.143+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , filter [tee, alice] will be send
2024-08-30T11:02:31.143+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , start to send *** EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[tee, alice], source=NodeRouteDO(srcNodeId=tee, dstNodeId=alice, routeId=4, srcNetAddress=127.0.0.1:48080, dstNetAddress=127.0.0.1:28080)) , 0 wait to sync
2024-08-30T11:02:31.146+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** before EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[bob, tee], source=NodeRouteDO(srcNodeId=bob, dstNodeId=tee, routeId=5, srcNetAddress=127.0.0.1:38080, dstNetAddress=127.0.0.1:48080)) will be send to [bob, tee]
2024-08-30T11:02:31.147+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , filter [bob, tee] will be send
2024-08-30T11:02:31.148+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , start to send *** EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[bob, tee], source=NodeRouteDO(srcNodeId=bob, dstNodeId=tee, routeId=5, srcNetAddress=127.0.0.1:38080, dstNetAddress=127.0.0.1:48080)) , 0 wait to sync
2024-08-30T11:02:31.152+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** before EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[tee, bob], source=NodeRouteDO(srcNodeId=tee, dstNodeId=bob, routeId=6, srcNetAddress=127.0.0.1:48080, dstNetAddress=127.0.0.1:38080)) will be send to [tee, bob]
2024-08-30T11:02:31.152+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , filter [tee, bob] will be send
2024-08-30T11:02:31.152+08:00  INFO 1 --- [   scheduling-3] o.s.s.s.listener.DbChangeEventListener   : *** get data sync , start to send *** EntityChangeListener.DbChangeEvent(dstNode=null, action=create, dType=org.secretflow.secretpad.persistence.entity.NodeRouteDO, projectId=null, nodeIds=[tee, bob], source=NodeRouteDO(srcNodeId=tee, dstNodeId=bob, routeId=6, srcNetAddress=127.0.0.1:48080, dstNetAddress=127.0.0.1:38080)) , 0 wait to sync
2024-08-30T11:02:31.155+08:00  INFO 1 --- [           main] o.s.secretpad.web.init.TeeResourceInit   : push alice-table datatable to tee node
2024-08-30T11:02:31.156+08:00  INFO 1 --- [           main] o.s.s.service.impl.DatatableServiceImpl  : Push datatable to teeNode with node id = alice, datatable id = alice-table
2024-08-30T11:02:31.196+08:00  INFO 1 --- [           main] o.s.s.k.v.DynamicKusciaChannelProvider   : session UserContextDTO(token=null, name=null, platformType=null, platformNodeId=null, ownerType=null, ownerId=kuscia-system, projectIds=null, apiResources=null, virtualUserForNode=false, deployMode=null)
2024-08-30T11:02:31.201+08:00  INFO 1 --- [           main] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system  Calling method: kuscia.proto.api.v1alpha1.kusciaapi.DomainDataGrantService/CreateDomainDataGrant
2024-08-30T11:02:31.202+08:00  INFO 1 --- [           main] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Request: domaindata_id: "alice-table"
grant_domain: "tee"
domain_id: "alice"

2024-08-30T11:02:31.226+08:00  INFO 1 --- [           main] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: status {
  code: 11100
  message: "domaindata [alice-table] not exists"
}

2024-08-30T11:02:31.229+08:00 ERROR 1 --- [           main] o.s.s.m.i.d.DatatableGrantManager        : create domain grant from kusciaapi failed: code=11100, message=domaindata [alice-table] not exists, nodeId=alice, grantNodeId=tee, domainDataId=alice-table
2024-08-30T11:02:31.236+08:00  INFO 1 --- [           main] .s.b.a.l.ConditionEvaluationReportLogger : 

Error starting ApplicationContext. To display the condition evaluation report re-run your application with 'debug' enabled.
2024-08-30T11:02:31.257+08:00 ERROR 1 --- [           main] o.s.boot.SpringApplication               : Application run failed

org.secretflow.secretpad.common.exception.SecretpadException: 
    at org.secretflow.secretpad.common.exception.SecretpadException.of(SecretpadException.java:58)
    at org.secretflow.secretpad.manager.integration.datatablegrant.DatatableGrantManager.createDomainGrant(DatatableGrantManager.java:96)
    at org.secretflow.secretpad.service.impl.DatatableServiceImpl.pushDatatableToTeeNode(DatatableServiceImpl.java:359)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:196)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
    at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:751)
    at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:123)
    at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:391)
    at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:119)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:184)
    at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:751)
    at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:703)
    at org.secretflow.secretpad.service.impl.DatatableServiceImpl$$SpringCGLIB$$0.pushDatatableToTeeNode(<generated>)
    at org.secretflow.secretpad.web.init.TeeResourceInit.initAliceBobDatableToTee(TeeResourceInit.java:148)
    at org.secretflow.secretpad.web.init.TeeResourceInit.run(TeeResourceInit.java:85)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:343)
    at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:699)
    at org.secretflow.secretpad.web.init.TeeResourceInit$$SpringCGLIB$$0.run(<generated>)
    at org.springframework.boot.SpringApplication.lambda$callRunner$5(SpringApplication.java:774)
    at org.springframework.util.function.ThrowingConsumer$1.acceptWithException(ThrowingConsumer.java:83)
    at org.springframework.util.function.ThrowingConsumer.accept(ThrowingConsumer.java:60)
    at org.springframework.util.function.ThrowingConsumer$1.accept(ThrowingConsumer.java:88)
    at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:782)
    at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:773)
    at org.springframework.boot.SpringApplication.lambda$callRunners$3(SpringApplication.java:758)
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
    at java.base/java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:357)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:510)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
    at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
    at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:758)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:331)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1317)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1306)
    at org.secretflow.secretpad.web.SecretPadApplication.main(SecretPadApplication.java:61)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:49)
    at org.springframework.boot.loader.Launcher.launch(Launcher.java:95)
    at org.springframework.boot.loader.Launcher.launch(Launcher.java:58)
    at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:65)

2024-08-30T11:02:31.311+08:00 ERROR 1 --- [   scheduling-3] o.s.s.s.TaskUtils$LoggingErrorHandler    : Unexpected error occurred in scheduled task

java.lang.InterruptedException: null
    at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1640)
    at java.base/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:435)
    at org.secretflow.secretpad.persistence.datasync.buffer.center.CenterDataSyncDataBufferTemplate.peek(CenterDataSyncDataBufferTemplate.java:48)
    at org.secretflow.secretpad.service.listener.DbChangeEventListener.sync(DbChangeEventListener.java:110)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)
    at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
    at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)

2024-08-30T11:02:31.322+08:00  INFO 1 --- [ault-executor-0] o.s.s.k.v.l.ManagedChannelStateListener  : [kuscia] kuscia-system Channel state changed from READY to SHUTDOWN
2024-08-30T11:02:31.327+08:00  INFO 1 --- [           main] j.LocalContainerEntityManagerFactoryBean : Closing JPA EntityManagerFactory for persistence unit 'default'
zimu-yuxi commented 2 weeks ago

Check in the kuscia container whether there is a domaindata named alice-table; if not, you can create one first. Also, which script did you use to deploy kuscia?
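
For reference, a minimal sketch of that check and creation, assuming a K8s deployment where kubectl works inside the kuscia master container; the KusciaAPI endpoint, port, token path, datasource id, and column list below follow the examples in the Kuscia docs and are assumptions to adapt to your environment:

# DomainData is a Kuscia CRD, so first look for alice-table in the alice domain
# (run inside the kuscia master container):
kubectl get domaindata -n alice

# If it is missing, create it through the KusciaAPI DomainDataService/CreateDomainData
# HTTP interface (default HTTP port 8082; with protocol NOTLS, as in this deployment,
# the TLS cert flags can be dropped):
curl -k -X POST 'https://localhost:8082/api/v1alpha1/domaindata/create' \
  --header "Token: $(cat /home/kuscia/var/certs/token)" \
  --header 'Content-Type: application/json' \
  -d '{
        "domain_id": "alice",
        "domaindata_id": "alice-table",
        "name": "alice-table",
        "type": "table",
        "datasource_id": "default-data-source",
        "relative_uri": "alice.csv",
        "columns": [{"name": "id1", "type": "str"}]
      }'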

Meng-xiangkun commented 2 weeks ago

> Check in the kuscia container whether there is a domaindata named alice-table; if not, you can create one first. Also, which script did you use to deploy kuscia?

I deployed it on K8s in RunP mode, following this guide: https://www.secretflow.org.cn/zh-CN/docs/kuscia/v0.10.0b0/deployment/K8s_deployment_kuscia/K8s_master_lite_cn

zimu-yuxi commented 2 weeks ago

Please confirm whether the domaindata alice-table exists inside the kuscia container; if not, create one. If the same error persists after creating it, set a breakpoint in org.secretflow.secretpad.manager.integration.datatablegrant.DatatableGrantManager#createDomainGrant and check how the builder is constructed.
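
If the service runs in a container, one way to reach that breakpoint is standard JDWP remote debugging; how the extra JVM flag reaches the java process depends on the image entrypoint, so the JAVA_OPTS variable here is an assumption:

# Expose a debug port and pass the JDWP agent to the JVM (entrypoint-specific; adjust as needed):
docker run -p 5005:5005 \
  -e JAVA_OPTS='-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005' \
  <your-secretpad-image>
# Then attach an IDE remote debugger to port 5005 and break in
# org.secretflow.secretpad.manager.integration.datatablegrant.DatatableGrantManager#createDomainGrant
# to inspect how the grant request builder is populated.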

Meng-xiangkun commented 2 weeks ago

> Please confirm whether the domaindata alice-table exists inside the kuscia container; if not, create one. If the same error persists after creating it, set a breakpoint in org.secretflow.secretpad.manager.integration.datatablegrant.DatatableGrantManager#createDomainGrant and check how the builder is constructed.

The SecretPad service is up now, but the web page on port 8080 opens blank. What could be the reason? Please take a look. (screenshots attached)

zimu-yuxi commented 1 week ago

1. Press F12 and check the frontend requests.
2. Give the secretpad container more memory, e.g. docker update <container-id> --memory=8g --memory-swap=8g (see the sketch after this list).
3. If neither helps, try running the frontend code locally; see the reference here.
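
A concrete sketch of step 2; the container name secretpad is a placeholder for your actual container ID:

# Raise the memory limits of the running container:
docker update secretpad --memory=8g --memory-swap=8g
# Verify the new limits took effect:
docker inspect secretpad --format '{{.HostConfig.Memory}} {{.HostConfig.MemorySwap}}'
# Restart so the JVM sizes its heap against the new limit:
docker restart secretpad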

Meng-xiangkun commented 1 week ago

> 1. Press F12 and check the frontend requests.
> 2. Give the secretpad container more memory, e.g. docker update <container-id> --memory=8g --memory-swap=8g.
> 3. If neither helps, try running the frontend code locally; see the reference here.

(screenshot attached)

wangzul commented 1 week ago

Could you provide the pad configuration file? Also, the error reported in the WeChat group is a missing nodeId; check the parameters being passed in the F12 developer tools.

Meng-xiangkun commented 1 week ago

> Could you provide the pad configuration file? Also, the error reported in the WeChat group is a missing nodeId; check the parameters being passed in the F12 developer tools.

The pad configuration file:

server:
  tomcat:
    accesslog:
      enabled: true
      directory: /var/log/secretpad
  servlet:
    session:
      timeout: 30m
  http-port: 8080
  http-port-inner: 9001
  port: 443
  ssl:
    enabled: true
    key-store: "file:./config/server.jks"
    key-store-password: ${KEY_PASSWORD:secretpad}
    key-alias: secretpad-server
    key-password: ${KEY_PASSWORD:secretpad}
    key-store-type: JKS
  compression:
    enabled: true
    mime-types:
      - application/javascript
      - text/css
    min-response-size: 1024
spring:
  task:
    scheduling:
      pool:
        size: 10
  application:
    name: secretpad
  jpa:
    database-platform: org.hibernate.community.dialect.SQLiteDialect
    show-sql: false
    properties:
      hibernate:
        format_sql: false
    open-in-view: false
  datasource:
    driver-class-name: org.sqlite.JDBC
    url: jdbc:sqlite:./db/secretpad.sqlite
    hikari:
      idle-timeout: 60000
      maximum-pool-size: 1
      connection-timeout: 6000
  flyway:
    baseline-on-migrate: true
    locations:
      - filesystem:./config/schema/center

  #datasource used for mysql
  #spring:
  #  task:
  #    scheduling:
  #      pool:
  #        size: 10
  #  application:
  #    name: secretpad
  #  jpa:
  #    database-platform: org.hibernate.dialect.MySQLDialect
  #    show-sql: false
  #    properties:
  #      hibernate:
  #        format_sql: false
  #  datasource:
  #    driver-class-name: com.mysql.cj.jdbc.Driver
  #    url: your mysql url
  #    username:
  #    password:
  #    hikari:
  #      idle-timeout: 60000
  #      maximum-pool-size: 10
  #      connection-timeout: 5000
  jackson:
    deserialization:
      fail-on-missing-external-type-id-property: false
      fail-on-ignored-properties: false
      fail-on-unknown-properties: false
    serialization:
      fail-on-empty-beans: false
  web:
    locale: zh_CN # default locale, overridden by request "Accept-Language" header.
  cache:
    jcache:
      config:
        classpath:ehcache.xml
springdoc:
  api-docs:
    enabled: true
management:
  endpoints:
    web:
      exposure:
        include: health,info,readiness,prometheus
    enabled-by-default: false
kusciaapi:
  protocol: ${KUSCIA_PROTOCOL:notls}

kuscia:
  nodes:
    - domainId: kuscia-system
      mode: master
      host: ${KUSCIA_API_ADDRESS:kuscia-master.data-develop-operate-dev.svc.cluster.local}
      port: ${KUSCIA_API_PORT:8083}
      protocol: ${KUSCIA_PROTOCOL:notls}
      cert-file: config/certs/client.crt
      key-file: config/certs/client.pem
      token: config/certs/token

    - domainId: alice
      mode: lite
      host: ${KUSCIA_API_LITE_ALICE_ADDRESS:kuscia-lite-alice.data-develop-operate-dev.svc.cluster.local}
      port: ${KUSCIA_API_PORT:8083}
      protocol: ${KUSCIA_PROTOCOL:notls}
      cert-file: config/certs/alice/client.crt
      key-file: config/certs/alice/client.pem
      token: config/certs/alice/token

    - domainId: bob
      mode: lite
      host: ${KUSCIA_API_LITE_BOB_ADDRESS:kuscia-lite-bob.data-develop-operate-dev.svc.cluster.local}
      port: ${KUSCIA_API_PORT:8083}
      protocol: ${KUSCIA_PROTOCOL:notls}
      cert-file: config/certs/bob/client.crt
      key-file: config/certs/bob/client.pem
      token: config/certs/bob/token

job:
  max-parallelism: 1

secretpad:
  logs:
    path: ${SECRETPAD_LOG_PATH:../log}
  deploy-mode: ${DEPLOY_MODE:ALL-IN-ONE} # MPC TEE ALL-IN-ONE
  platform-type: CENTER
  node-id: kuscia-system
  center-platform-service: secretpad.master.svc
  gateway: ${KUSCIA_GW_ADDRESS:127.0.0.1:80}
  auth:
    enabled: true
    pad_name: ${SECRETPAD_USER_NAME}
    pad_pwd: ${SECRETPAD_PASSWORD}
  response:
    extra-headers:
      Content-Security-Policy: "base-uri 'self';frame-src 'self';worker-src blob: 'self' data:;object-src 'self';"
  upload-file:
    max-file-size: -1    # -1 means not limit, e.g.  200MB, 1GB
    max-request-size: -1 # -1 means not limit, e.g.  200MB, 1GB
  data:
    dir-path: /app/data/
  datasync:
    center: true
    p2p: false
  version:
    secretpad-image: ${SECRETPAD_IMAGE:0.5.0b0}
    kuscia-image: ${KUSCIA_IMAGE:0.6.0b0}
    secretflow-image: ${SECRETFLOW_IMAGE:1.4.0b0}
    secretflow-serving-image: ${SECRETFLOW_SERVING_IMAGE:0.2.0b0}
    tee-app-image: ${TEE_APP_IMAGE:0.1.0b0}
    tee-dm-image: ${TEE_DM_IMAGE:0.1.0b0}
    capsule-manager-sim-image: ${CAPSULE_MANAGER_SIM_IMAGE:0.1.2b0}

  component:
    hide:
      - secretflow/io/read_data:0.0.1
      - secretflow/io/write_data:0.0.1
      - secretflow/io/identity:0.0.1
      - secretflow/model/model_export:0.0.1
      - secretflow/ml.train/slnn_train:0.0.1
      - secretflow/ml.predict/slnn_predict:0.0.2

sfclusterDesc:
  deviceConfig:
    spu: "{\"runtime_config\":{\"protocol\":\"SEMI2K\",\"field\":\"FM128\"},\"link_desc\":{\"connect_retry_times\":60,\"connect_retry_interval_ms\":1000,\"brpc_channel_protocol\":\"http\",\"brpc_channel_connection_type\":\"pooled\",\"recv_timeout_ms\":1200000,\"http_timeout_ms\":1200000}}"
    heu: "{\"mode\": \"PHEU\", \"schema\": \"paillier\", \"key_size\": 2048}"
  rayFedConfig:
    crossSiloCommBackend: "brpc_link"

tee:
  capsule-manager: capsule-manager.#.svc

data:
  sync:
    - org.secretflow.secretpad.persistence.entity.ProjectDO
    - org.secretflow.secretpad.persistence.entity.ProjectNodeDO
    - org.secretflow.secretpad.persistence.entity.NodeDO
    - org.secretflow.secretpad.persistence.entity.NodeRouteDO
    - org.secretflow.secretpad.persistence.entity.ProjectJobDO
    - org.secretflow.secretpad.persistence.entity.ProjectTaskDO
    - org.secretflow.secretpad.persistence.entity.ProjectDatatableDO
    - org.secretflow.secretpad.persistence.entity.VoteRequestDO
    - org.secretflow.secretpad.persistence.entity.VoteInviteDO
    - org.secretflow.secretpad.persistence.entity.TeeDownLoadAuditConfigDO
    - org.secretflow.secretpad.persistence.entity.NodeRouteApprovalConfigDO
    - org.secretflow.secretpad.persistence.entity.TeeNodeDatatableManagementDO
    - org.secretflow.secretpad.persistence.entity.ProjectModelServingDO
    - org.secretflow.secretpad.persistence.entity.ProjectGraphNodeKusciaParamsDO
    - org.secretflow.secretpad.persistence.entity.ProjectModelPackDO
    - org.secretflow.secretpad.persistence.entity.FeatureTableDO
    - org.secretflow.secretpad.persistence.entity.ProjectFeatureTableDO
    - org.secretflow.secretpad.persistence.entity.ProjectGraphDomainDatasourceDO

inner-port:
  path:
    - /api/v1alpha1/vote_sync/create
    - /api/v1alpha1/user/node/resetPassword
    - /sync
    - /api/v1alpha1/data/sync
# ip block config (None of them are allowed in the configured IP list)
ip:
  block:
    enable: true
    list:
      - 0.0.0.0/32
      - 127.0.0.1/8
      - 10.0.0.0/8
      - 11.0.0.0/8
      - 30.0.0.0/8
      - 100.64.0.0/10
      - 172.16.0.0/12
      - 192.168.0.0/16
      - 33.0.0.0/8

Meng-xiangkun commented 1 week ago

> Could you provide the pad configuration file? Also, the error reported in the WeChat group is a missing nodeId; check the parameters being passed in the F12 developer tools.

(screenshots attached)

wangzul commented 1 week ago

SecretPad image built from source: which branch or version did you use?

Meng-xiangkun commented 1 week ago

> SecretPad image built from source: which branch or version did you use?

I used tag v0.9.0b0.

wangzul commented 1 week ago

> > SecretPad image built from source: which branch or version did you use?
>
> I used tag v0.9.0b0.

(screenshot attached) The request parameter you are passing is initiatorId. Currently only the 0.10 and main branches use this parameter; 0.9 and below use nodeId. Please check your image.
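
Two quick ways to confirm which SecretPad build the container is actually running (the container name is a placeholder; the version banner is the one shown in the startup log at the top of this issue):

# Which image the container was created from:
docker inspect <secretpad-container> --format '{{.Config.Image}}'
# The startup banner prints the packaged versions, so grep them out of the container log:
docker logs <secretpad-container> 2>&1 | grep -m 3 'version:'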

Meng-xiangkun commented 1 week ago

(screenshots attached) I changed the communication address, but it does not take effect and the node is unavailable.

wangzul commented 1 week ago

Try modifying the data inside the container: /app/db/secretpad.sqlite
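
A sketch of that edit, assuming the sqlite3 CLI is available in the container; the table and column names are guesses derived from the NodeRouteDO entity visible in the startup log, so confirm the real schema with .tables/.schema before changing anything:

# Open the SecretPad database inside the container:
docker exec -it <secretpad-container> sqlite3 /app/db/secretpad.sqlite

-- Inside the sqlite3 shell: confirm the actual schema first.
.tables
.schema node_route
-- Then adjust the route addresses (illustrative values; route id 1 is alice -> bob per the log):
UPDATE node_route SET dst_net_address = 'https://<new-host>:<port>' WHERE route_id = 1;
SELECT * FROM node_route;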

Meng-xiangkun commented 1 week ago

> Try modifying the data inside the container: /app/db/secretpad.sqlite

(screenshot attached) I modified the data, trying both the IP and the service name, but the node is still unavailable. How can I troubleshoot what makes it unavailable?

wangzul commented 1 week ago

> > Try modifying the data inside the container: /app/db/secretpad.sqlite
>
> (screenshot attached) I modified the data, trying both the IP and the service name, but the node is still unavailable. How can I troubleshoot what makes it unavailable?

1. Did you restart the docker container after the modification?
2. Please provide the pad logs (docker logs).

Meng-xiangkun commented 1 week ago

> > > Try modifying the data inside the container: /app/db/secretpad.sqlite
> >
> > (screenshot attached) I modified the data, trying both the IP and the service name, but the node is still unavailable. How can I troubleshoot what makes it unavailable?
>
> 1. Did you restart the docker container after the modification?
> 2. Please provide the pad logs (docker logs).

I restarted it, but it still does not work. Pad logs:

2024-09-11T14:45:06.681+08:00  INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: status {
  code: 11404
  message: "clusterdomainroutes.kuscia.secretflow \"tee-alice\" not found"
}

2024-09-11T14:45:06.681+08:00  INFO 1 --- [nio-8080-exec-2] o.s.s.m.i.noderoute.NodeRouteManager     : DomainRoute.RouteStatus response status {
  code: 11404
  message: "clusterdomainroutes.kuscia.secretflow \"tee-alice\" not found"
}

2024-09-11T14:45:06.686+08:00  INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.DynamicKusciaChannelProvider   : session UserContextDTO(token=880bbbcbd83b485fb79cd581a9594a99, name=zdsc, platformType=CENTER, platformNodeId=kuscia-system, ownerType=CENTER, ownerId=kuscia-system, projectIds=null, apiResources=null, virtualUserForNode=false, deployMode=ALL-IN-ONE)
2024-09-11T14:45:06.686+08:00  INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system  Calling method: kuscia.proto.api.v1alpha1.kusciaapi.DomainService/QueryDomain
2024-09-11T14:45:06.686+08:00  INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Request: domain_id: "bob"

2024-09-11T14:45:06.697+08:00  INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: status {
  message: "success"
}
data {
  domain_id: "bob"
  cert: "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURBVENDQWVtZ0F3SUJBZ0lCQVRBTkJna3Foa2lHOXcwQkFRc0ZBREFZTVJZd0ZBWURWUVFERXcxcmRYTmoKYVdFdGMzbHpkR1Z0TUI0WERUY3dNREV3TVRBd01EQXdNRm9YRFRnd01ERXdNVEF3TURBd01Gb3dEakVNTUFvRwpBMVVFQXhNRFltOWlNSUlCSWpBTkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QU1JSUJDZ0tDQVFFQXlWKzAyT052Ck9SKy8xVE9IYjl3N0hRRlNiRmxUNUtkeHhLN3ZwU3MwWjdXcnRjeld0ZXBjcmsrVUhTWHREdUhpV0tBcTJpQksKK3drWGhBUzA0WDNySWxHQjhtRDVwbEMrMWlaaFg4NnV4eUFFZzB5MkdicCtrajVRamhBWC9LbDBsL1liSTQyaQpOWmV0SENvdDJQbXhFV2k5SHdabmNNTkEzNDFsQVl0RjVDOUswVkFaTkh2SHRHSzN2S1dTQjZ6Mk83ekY3NXJ0CkY2YlkwNms3c05vNm84bzBScWxrdjhnQmlybnpqa0RIeHlwY0VjZ3ZXTDBoTVkxUTVualN5OW5uV1JpMmFnc0kKLzJVUUlIMWJxSVo5Z1V1VE5KNFhmZnVhQ0sxWktLRmN3UUorZkxnTGFWMG5zekFrSEgxRkxmdWFZbHA0MjV4ZAp5eThNUU1pUGtTZGFXUUlEQVFBQm8yQXdYakFPQmdOVkhROEJBZjhFQkFNQ0FvUXdIUVlEVlIwbEJCWXdGQVlJCkt3WUJCUVVIQXdJR0NDc0dBUVVGQndNQk1Bd0dBMVVkRXdFQi93UUNNQUF3SHdZRFZSMGpCQmd3Rm9BVXNneU8KOHRqeThaREpLVU5uYjE3dU00U3c4THd3RFFZSktvWklodmNOQVFFTEJRQURnZ0VCQUFwalRtMS82MDlrYml6MAp6c0NvSDZmK3FLNmdLaldYWFpsdFZPM1Z6aFNnL2RSMVpnL1RuczJqdVpvMWpMVzhyMGtLZ3RYZFF5SnRRT2xSCkdlUlRKQ0x1Um1UYTd0ems2QW5ZUkcrSnhSM05tWUJ5NEg5UTJMM0JTZU90TTl5cFVjUlpjcHhiR0NLL1phdlQKZlpTWHJ6NEFnRW9SN1lwb3lUNFZaYlhXR3gzdmlucUF6dWZsekk0Y0JQOHA3YmQrbTNOZERXUlBmNlJ6UmhSSQpncklPK1M4UGZad2ZWUmJTZXVFYkRHSUppNlV0Mzlid3dOTXFuVllxV1czN3k3ZnVRQXVJVCtIK3ZXMXQwd0lyClFORnBPQnB2WTFyRjRuYmx0YkVaYnJrNk1Zc01Rc0ltQlkrcVRXOURFZ1ZkM2thOGgzUmx0TnM1QXlJa21ZenIKUVdKMzdFRT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo="
  node_statuses {
    name: "kuscia-lite-bob-545c476bd7-rbkbr"
    status: "Ready"
    version: "v0.10.0b0"
    last_heartbeat_time: "2024-09-11T06:44:55Z"
    last_transition_time: "2024-09-06T06:34:32Z"
  }
  deploy_token_statuses {
    token: "XEzJjnQqFmQB2zSZlTaRAsZFjpvGkqVF"
    state: "used"
    last_transition_time: "2024-09-06T06:33:27Z"
  }
  deploy_token_statuses {
    token: "Hz3UmnfNp2uAEYlPW2mt2E3EvFZlvuDD"
    state: "unused"
    last_transition_time: "2024-09-06T06:34:27Z"
  }
  annotations {
    key: "domain/bob"
    value: "kuscia.secretflow/domain-type=embedded"
  }
  annotations {
    key: "kubectl.kubernetes.io/last-applied-configuration"
    value: "{\"apiVersion\":\"kuscia.secretflow/v1alpha1\",\"kind\":\"Domain\",\"metadata\":{\"annotations\":{\"domain/bob\":\"kuscia.secretflow/domain-type=embedded\"},\"name\":\"bob\"},\"spec\":{\"authCenter\":{\"authenticationType\":\"Token\",\"tokenGenMethod\":\"UID-RSA-GEN\"},\"cert\":null,\"master\":null,\"role\":null}}\n"
  }
  auth_center {
    authentication_type: "Token"
    token_gen_method: "UID-RSA-GEN"
  }
}

2024-09-11T14:45:06.702+08:00  INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.DynamicKusciaChannelProvider   : session UserContextDTO(token=880bbbcbd83b485fb79cd581a9594a99, name=zdsc, platformType=CENTER, platformNodeId=kuscia-system, ownerType=CENTER, ownerId=kuscia-system, projectIds=null, apiResources=null, virtualUserForNode=false, deployMode=ALL-IN-ONE)
2024-09-11T14:45:06.702+08:00  INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system  Calling method: kuscia.proto.api.v1alpha1.kusciaapi.DomainService/QueryDomain
2024-09-11T14:45:06.703+08:00  INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Request: domain_id: "alice"

2024-09-11T14:45:06.714+08:00  INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: status {
  message: "success"
}
data {
  domain_id: "alice"
  cert: "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURBekNDQWV1Z0F3SUJBZ0lCQVRBTkJna3Foa2lHOXcwQkFRc0ZBREFZTVJZd0ZBWURWUVFERXcxcmRYTmoKYVdFdGMzbHpkR1Z0TUI0WERUY3dNREV3TVRBd01EQXdNRm9YRFRnd01ERXdNVEF3TURBd01Gb3dFREVPTUF3RwpBMVVFQXhNRllXeHBZMlV3Z2dFaU1BMEdDU3FHU0liM0RRRUJBUVVBQTRJQkR3QXdnZ0VLQW9JQkFRREpYN1RZCjQyODVINy9WTTRkdjNEc2RBVkpzV1ZQa3AzSEVydStsS3pSbnRhdTF6TmExNmx5dVQ1UWRKZTBPNGVKWW9DcmEKSUVyN0NSZUVCTFRoZmVzaVVZSHlZUG1tVUw3V0ptRmZ6cTdISUFTRFRMWVp1bjZTUGxDT0VCZjhxWFNYOWhzagpqYUkxbDYwY0tpM1krYkVSYUwwZkJtZHd3MERmaldVQmkwWGtMMHJSVUJrMGU4ZTBZcmU4cFpJSHJQWTd2TVh2Cm11MFhwdGpUcVR1dzJqcWp5alJHcVdTL3lBR0t1Zk9PUU1mSEtsd1J5QzlZdlNFeGpWRG1lTkxMMmVkWkdMWnEKQ3dqL1pSQWdmVnVvaG4yQlM1TTBuaGQ5KzVvSXJWa29vVnpCQW41OHVBdHBYU2V6TUNRY2ZVVXQrNXBpV25qYgpuRjNMTHd4QXlJK1JKMXBaQWdNQkFBR2pZREJlTUE0R0ExVWREd0VCL3dRRUF3SUNoREFkQmdOVkhTVUVGakFVCkJnZ3JCZ0VGQlFjREFnWUlLd1lCQlFVSEF3RXdEQVlEVlIwVEFRSC9CQUl3QURBZkJnTlZIU01FR0RBV2dCU3kKREk3eTJQTHhrTWtwUTJkdlh1NHpoTER3dkRBTkJna3Foa2lHOXcwQkFRc0ZBQU9DQVFFQVJxMW1DNm5lZEV1Zgp5cVd5L0J5STgwbDhiMU8vOFg3T3BUdDJ5SXZwUG9WaFdMV3RnSi9BM2JCa2R3L3VmNFczMkJoWlkweVg0ZE9sCjVBVXkvRGtGY3VIeHhpcm9UeEFMc1lNYWpMd0pBdmVUbFlSb080Rm16Z2FXVHVSN1lZUUVQUXVQNWhZRFZEMXcKaTJKYWJ5T2kyMTJMdUJvMVlzcmNhcy9pV0FhTi9jYWNWS010eThCSnV6a0t5dy9WZ1RjVXRIcERPTWdiY3o0MwpQZ21KbDY1bENlRTNjQWhoQ2pTYTV0M1JmWHBxN2VSNjQzT2Y5SzJCT3pRenVvc0ZoS0h2azdTWWV0dldnMTBFCldCc28yYnFZS2luRHlzak1wbkVHQ0RyMC9YaWtnSUFvS3gyeFhJZXRScG50MDIzc3Q4b01KUFd3Uk9Id0J5aGMKRE92aUZvcFVUUT09Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K"
  node_statuses {
    name: "kuscia-lite-alice-6dd464f48-b5rmm"
    status: "Ready"
    version: "v0.10.0b0"
    last_heartbeat_time: "2024-09-11T06:44:41Z"
    last_transition_time: "2024-09-06T06:31:18Z"
  }
  deploy_token_statuses {
    token: "dFMdqgbbpPiAwnuqKwuRZMAA5VJ6hfcv"
    state: "used"
    last_transition_time: "2024-09-06T06:29:35Z"
  }
  deploy_token_statuses {
    token: "zIUGEgeayul3Shz9rv6pGXcPMIekm9Dr"
    state: "unused"
    last_transition_time: "2024-09-06T06:31:13Z"
  }
  annotations {
    key: "domain/alice"
    value: "kuscia.secretflow/domain-type=embedded"
  }
  annotations {
    key: "kubectl.kubernetes.io/last-applied-configuration"
    value: "{\"apiVersion\":\"kuscia.secretflow/v1alpha1\",\"kind\":\"Domain\",\"metadata\":{\"annotations\":{\"domain/alice\":\"kuscia.secretflow/domain-type=embedded\"},\"name\":\"alice\"},\"spec\":{\"authCenter\":{\"authenticationType\":\"Token\",\"tokenGenMethod\":\"UID-RSA-GEN\"},\"cert\":null,\"master\":null,\"role\":null}}\n"
  }
  auth_center {
    authentication_type: "Token"
    token_gen_method: "UID-RSA-GEN"
  }
}

2024-09-11T14:45:06.715+08:00  INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.DynamicKusciaChannelProvider   : session UserContextDTO(token=880bbbcbd83b485fb79cd581a9594a99, name=zdsc, platformType=CENTER, platformNodeId=kuscia-system, ownerType=CENTER, ownerId=kuscia-system, projectIds=null, apiResources=null, virtualUserForNode=false, deployMode=ALL-IN-ONE)
2024-09-11T14:45:06.715+08:00  INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system  Calling method: kuscia.proto.api.v1alpha1.kusciaapi.DomainRouteService/QueryDomainRoute
2024-09-11T14:45:06.715+08:00  INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Request: destination: "alice"
source: "bob"

2024-09-11T14:45:06.726+08:00  INFO 1 --- [nio-8080-exec-2] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: status {
  message: "success"
}
data {
  name: "bob-alice"
  authentication_type: "Token"
  destination: "alice"
  endpoint {
    host: "10.233.74.148"
    ports {
      name: "http"
      port: 1080
      protocol: "HTTP"
      5: "/"
    }
  }
  source: "bob"
  token_config {
    destination_public_key: "LS0tLS1CRUdJTiBSU0EgUFVCTElDIEtFWS0tLS0tCk1JSUJDZ0tDQVFFQXlWKzAyT052T1IrLzFUT0hiOXc3SFFGU2JGbFQ1S2R4eEs3dnBTczBaN1dydGN6V3RlcGMKcmsrVUhTWHREdUhpV0tBcTJpQksrd2tYaEFTMDRYM3JJbEdCOG1ENXBsQysxaVpoWDg2dXh5QUVnMHkyR2JwKwprajVRamhBWC9LbDBsL1liSTQyaU5aZXRIQ290MlBteEVXaTlId1puY01OQTM0MWxBWXRGNUM5SzBWQVpOSHZICnRHSzN2S1dTQjZ6Mk83ekY3NXJ0RjZiWTA2azdzTm82bzhvMFJxbGt2OGdCaXJuemprREh4eXBjRWNndldMMGgKTVkxUTVualN5OW5uV1JpMmFnc0kvMlVRSUgxYnFJWjlnVXVUTko0WGZmdWFDSzFaS0tGY3dRSitmTGdMYVYwbgpzekFrSEgxRkxmdWFZbHA0MjV4ZHl5OE1RTWlQa1NkYVdRSURBUUFCCi0tLS0tRU5EIFJTQSBQVUJMSUMgS0VZLS0tLS0K"
    rolling_update_period: 86400
    source_public_key: "LS0tLS1CRUdJTiBSU0EgUFVCTElDIEtFWS0tLS0tCk1JSUJDZ0tDQVFFQXlWKzAyT052T1IrLzFUT0hiOXc3SFFGU2JGbFQ1S2R4eEs3dnBTczBaN1dydGN6V3RlcGMKcmsrVUhTWHREdUhpV0tBcTJpQksrd2tYaEFTMDRYM3JJbEdCOG1ENXBsQysxaVpoWDg2dXh5QUVnMHkyR2JwKwprajVRamhBWC9LbDBsL1liSTQyaU5aZXRIQ290MlBteEVXaTlId1puY01OQTM0MWxBWXRGNUM5SzBWQVpOSHZICnRHSzN2S1dTQjZ6Mk83ekY3NXJ0RjZiWTA2azdzTm82bzhvMFJxbGt2OGdCaXJuemprREh4eXBjRWNndldMMGgKTVkxUTVualN5OW5uV1JpMmFnc0kvMlVRSUgxYnFJWjlnVXVUTko0WGZmdWFDSzFaS0tGY3dRSitmTGdMYVYwbgpzekFrSEgxRkxmdWFZbHA0MjV4ZHl5OE1RTWlQa1NkYVdRSURBUUFCCi0tLS0tRU5EIFJTQSBQVUJMSUMgS0VZLS0tLS0K"
    token_gen_method: "RSA-GEN"
  }
  status {
    status: "Failed"
  }
}

2024-09-11T14:45:06.727+08:00  INFO 1 --- [nio-8080-exec-2] o.s.s.m.i.noderoute.NodeRouteManager     : DomainRoute.RouteStatus response status {
  message: "success"
}
data {
  name: "bob-alice"
  authentication_type: "Token"
  destination: "alice"
  endpoint {
    host: "10.233.74.148"
    ports {
      name: "http"
      port: 1080
      protocol: "HTTP"
      5: "/"
    }
  }
  source: "bob"
  token_config {
    destination_public_key: "LS0tLS1CRUdJTiBSU0EgUFVCTElDIEtFWS0tLS0tCk1JSUJDZ0tDQVFFQXlWKzAyT052T1IrLzFUT0hiOXc3SFFGU2JGbFQ1S2R4eEs3dnBTczBaN1dydGN6V3RlcGMKcmsrVUhTWHREdUhpV0tBcTJpQksrd2tYaEFTMDRYM3JJbEdCOG1ENXBsQysxaVpoWDg2dXh5QUVnMHkyR2JwKwprajVRamhBWC9LbDBsL1liSTQyaU5aZXRIQ290MlBteEVXaTlId1puY01OQTM0MWxBWXRGNUM5SzBWQVpOSHZICnRHSzN2S1dTQjZ6Mk83ekY3NXJ0RjZiWTA2azdzTm82bzhvMFJxbGt2OGdCaXJuemprREh4eXBjRWNndldMMGgKTVkxUTVualN5OW5uV1JpMmFnc0kvMlVRSUgxYnFJWjlnVXVUTko0WGZmdWFDSzFaS0tGY3dRSitmTGdMYVYwbgpzekFrSEgxRkxmdWFZbHA0MjV4ZHl5OE1RTWlQa1NkYVdRSURBUUFCCi0tLS0tRU5EIFJTQSBQVUJMSUMgS0VZLS0tLS0K"
    rolling_update_period: 86400
    source_public_key: "LS0tLS1CRUdJTiBSU0EgUFVCTElDIEtFWS0tLS0tCk1JSUJDZ0tDQVFFQXlWKzAyT052T1IrLzFUT0hiOXc3SFFGU2JGbFQ1S2R4eEs3dnBTczBaN1dydGN6V3RlcGMKcmsrVUhTWHREdUhpV0tBcTJpQksrd2tYaEFTMDRYM3JJbEdCOG1ENXBsQysxaVpoWDg2dXh5QUVnMHkyR2JwKwprajVRamhBWC9LbDBsL1liSTQyaU5aZXRIQ290MlBteEVXaTlId1puY01OQTM0MWxBWXRGNUM5SzBWQVpOSHZICnRHSzN2S1dTQjZ6Mk83ekY3NXJ0RjZiWTA2azdzTm82bzhvMFJxbGt2OGdCaXJuemprREh4eXBjRWNndldMMGgKTVkxUTVualN5OW5uV1JpMmFnc0kvMlVRSUgxYnFJWjlnVXVUTko0WGZmdWFDSzFaS0tGY3dRSitmTGdMYVYwbgpzekFrSEgxRkxmdWFZbHA0MjV4ZHl5OE1RTWlQa1NkYVdRSURBUUFCCi0tLS0tRU5EIFJTQSBQVUJMSUMgS0VZLS0tLS0K"
    token_gen_method: "RSA-GEN"
  }
  status {
    status: "Failed"
  }
}

2024-09-11T14:45:07.369+08:00  INFO 1 --- [   scheduling-1] o.s.s.k.v.DynamicKusciaChannelProvider   : session UserContextDTO(token=null, name=null, platformType=null, platformNodeId=null, ownerType=null, ownerId=kuscia-system, projectIds=null, apiResources=null, virtualUserForNode=false, deployMode=null)
2024-09-11T14:45:07.370+08:00  INFO 1 --- [   scheduling-1] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system  Calling method: kuscia.proto.api.v1alpha1.kusciaapi.DomainRouteService/BatchQueryDomainRouteStatus

2024-09-11T14:45:07.370+08:00 INFO 1 --- [ scheduling-1] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Request:

2024-09-11T14:45:07.373+08:00 INFO 1 --- [ scheduling-1] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: status {

code: 11100

message: "DomainRoute keys can not be empty"

}`


wangzul commented 1 week ago

Try modifying the data inside the container: /app/db/secretpad.sqlite

I modified the data and tried both the IP and the service name, but it is still unavailable. How should I troubleshoot what is causing the unavailability?

Go into the master node and check the route configuration with kubectl get cdr, then inspect a specific route in the following form: kubectl get cdr alice-bob -oyaml
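
Concretely, those two suggestions look like this (a sketch; the secretpad container name is a placeholder, and it assumes the sqlite3 CLI is available in the image):

# Open SecretPad's embedded database inside the container to inspect the stored node route
docker exec -it <secretpad-container> sqlite3 /app/db/secretpad.sqlite

# Inside the master container: list all ClusterDomainRoutes, then dump one in full
kubectl get cdr
kubectl get cdr alice-bob -oyaml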

Meng-xiangkun commented 1 week ago

sh-5.2# kubectl get cdr
NAME                  SOURCE   DESTINATION     HOST            AUTHENTICATION   READY
tee-kuscia-system     tee      kuscia-system                   Token            False
bob-alice             bob      alice           10.233.74.148   Token            False
alice-bob             alice    bob             10.233.37.70    Token            False
bob-kuscia-system     bob      kuscia-system                   Token            True
alice-kuscia-system   alice    kuscia-system                   Token            True
sh-5.2# kubectl get cdr alice-bob -oyaml
apiVersion: kuscia.secretflow/v1alpha1
kind: ClusterDomainRoute
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"kuscia.secretflow/v1alpha1","kind":"ClusterDomainRoute","metadata":{"annotations":{},"name":"alice-bob"},"spec":{"authenticationType":"Token","destination":"bob","endpoint":{"host":"10.233.37.70","ports":[{"isTLS":false,"name":"http","pathPrefix":"/","port":1080,"protocol":"HTTP"}]},"interConnProtocol":"kuscia","requestHeadersToAdd":{"Authorization":"Bearer {{.TOKEN}}"},"source":"alice","tokenConfig":{"rollingUpdatePeriod":86400,"tokenGenMethod":"RSA-GEN"}}}
  creationTimestamp: "2024-09-06T06:36:12Z"
  generation: 4
  labels:
    kuscia.secretflow/clusterdomainroute-destination: bob
    kuscia.secretflow/clusterdomainroute-source: alice
  name: alice-bob
  resourceVersion: "943787"
  uid: 4d17f638-d7d3-44b7-83da-c99998e87b90
spec:
  authenticationType: Token
  destination: bob
  endpoint:
    host: 10.233.37.70
    ports:
    - isTLS: false
      name: http
      pathPrefix: /
      port: 1080
      protocol: HTTP
  interConnProtocol: kuscia
  requestHeadersToAdd:
    Authorization: Bearer {{.TOKEN}}
  source: alice
  tokenConfig:
    destinationPublicKey: LS0tLS1CRUdJTiBSU0EgUFVCTElDIEtFWS0tLS0tCk1JSUJDZ0tDQVFFQXlWKzAyT052T1IrLzFUT0hiOXc3SFFGU2JGbFQ1S2R4eEs3dnBTczBaN1dydGN6V3RlcGMKcmsrVUhTWHREdUhpV0tBcTJpQksrd2tYaEFTMDRYM3JJbEdCOG1ENXBsQysxaVpoWDg2dXh5QUVnMHkyR2JwKwprajVRamhBWC9LbDBsL1liSTQyaU5aZXRIQ290MlBteEVXaTlId1puY01OQTM0MWxBWXRGNUM5SzBWQVpOSHZICnRHSzN2S1dTQjZ6Mk83ekY3NXJ0RjZiWTA2azdzTm82bzhvMFJxbGt2OGdCaXJuemprREh4eXBjRWNndldMMGgKTVkxUTVualN5OW5uV1JpMmFnc0kvMlVRSUgxYnFJWjlnVXVUTko0WGZmdWFDSzFaS0tGY3dRSitmTGdMYVYwbgpzekFrSEgxRkxmdWFZbHA0MjV4ZHl5OE1RTWlQa1NkYVdRSURBUUFCCi0tLS0tRU5EIFJTQSBQVUJMSUMgS0VZLS0tLS0K
    rollingUpdatePeriod: 86400
    sourcePublicKey: LS0tLS1CRUdJTiBSU0EgUFVCTElDIEtFWS0tLS0tCk1JSUJDZ0tDQVFFQXlWKzAyT052T1IrLzFUT0hiOXc3SFFGU2JGbFQ1S2R4eEs3dnBTczBaN1dydGN6V3RlcGMKcmsrVUhTWHREdUhpV0tBcTJpQksrd2tYaEFTMDRYM3JJbEdCOG1ENXBsQysxaVpoWDg2dXh5QUVnMHkyR2JwKwprajVRamhBWC9LbDBsL1liSTQyaU5aZXRIQ290MlBteEVXaTlId1puY01OQTM0MWxBWXRGNUM5SzBWQVpOSHZICnRHSzN2S1dTQjZ6Mk83ekY3NXJ0RjZiWTA2azdzTm82bzhvMFJxbGt2OGdCaXJuemprREh4eXBjRWNndldMMGgKTVkxUTVualN5OW5uV1JpMmFnc0kvMlVRSUgxYnFJWjlnVXVUTko0WGZmdWFDSzFaS0tGY3dRSitmTGdMYVYwbgpzekFrSEgxRkxmdWFZbHA0MjV4ZHl5OE1RTWlQa1NkYVdRSURBUUFCCi0tLS0tRU5EIFJTQSBQVUJMSUMgS0VZLS0tLS0K
    tokenGenMethod: RSA-GEN
status:
  conditions:
  - lastTransitionTime: "2024-09-11T08:46:45Z"
    lastUpdateTime: "2024-09-11T08:46:45Z"
    message: TokenNotGenerate
    reason: DestinationIsNotAuthrized
    status: "False"
    type: Ready
  tokenStatus: {}
wangzul commented 1 week ago
  1. Check the ConfigMap configuration files for alice and bob.
  2. Reconfigure the route according to the documentation: https://www.secretflow.org.cn/zh-CN/docs/kuscia/v0.10.0b0/deployment/K8s_deployment_kuscia/K8s_master_lite_cn#lite-alicelite-bob

For the IP, curl -kvvv http://xxxx:1080/ returning 401 is expected. From the master node you can ping the lite node's DNS name (the xxx in the route above), and have alice ping bob, to test whether the nodes can communicate normally.
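
With the addresses from the cdr output above, those checks look like this (a sketch; the 401 is the gateway's expected answer to an unauthenticated request):

# From the master (or alice) container: the kuscia gateway should respond, typically with HTTP 401
curl -kvvv http://10.233.74.148:1080/

# Plain reachability between the two lite nodes (run from inside the alice container)
ping 10.233.37.70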

Meng-xiangkun commented 1 week ago

alice's ConfigMap:

# Startup mode
mode: lite
# Node ID
# Example: domainID: alice
domainID: alice
# Node private key, used for inter-node communication authentication (the two parties' certificates are used to generate the identity token for communication) and for issuing certificates to node applications (to harden communication security, kuscia assigns an MTLS certificate to every task engine; whether the engine accesses other modules, including external ones, or other modules access the engine, all traffic goes over MTLS, so a compromised engine cannot be exploited from inside)
# Note: currently the node private key only supports the pkcs#1 format: "BEGIN RSA PRIVATE KEY/END RSA PRIVATE KEY"
# Run "docker run -it --rm secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/kuscia scripts/deploy/generate_rsa_key.sh" to generate a private key
domainKeyData: LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0tLS0tCk1JSUV2UUlCQURBTkJna3Foa2lHOXcwQkFRRUZBQVNDQktjd2dnU2pBZ0VBQW9JQkFRREpYN1RZNDI4NUg3L1YKTTRkdjNEc2RBVkpzV1ZQa3AzSEVydStsS3pSbnRhdTF6TmExNmx5dVQ1UWRKZTBPNGVKWW9DcmFJRXI3Q1JlRQpCTFRoZmVzaVVZSHlZUG1tVUw3V0ptRmZ6cTdISUFTRFRMWVp1bjZTUGxDT0VCZjhxWFNYOWhzamphSTFsNjBjCktpM1krYkVSYUwwZkJtZHd3MERmaldVQmkwWGtMMHJSVUJrMGU4ZTBZcmU4cFpJSHJQWTd2TVh2bXUwWHB0alQKcVR1dzJqcWp5alJHcVdTL3lBR0t1Zk9PUU1mSEtsd1J5QzlZdlNFeGpWRG1lTkxMMmVkWkdMWnFDd2ovWlJBZwpmVnVvaG4yQlM1TTBuaGQ5KzVvSXJWa29vVnpCQW41OHVBdHBYU2V6TUNRY2ZVVXQrNXBpV25qYm5GM0xMd3hBCnlJK1JKMXBaQWdNQkFBRUNnZ0VBSDkwVy9xS3VQTG03WHY3eVZVN2h3NnNyNFowWTJ6dHJreFdqTWQxdVEyTEoKc3RDZ3dOUStxZzVKZjNzNjBYb0ltTUZ2Um1pSnRNTXhoMkEvUnRibjE5eFIxWXBtdGx4Y2RnSklzaUpBSVozOQpXTkZRbHkyZFRZS3l1R2Z2ZzdsRWk2OFRpRUtuQWhmbittYnFMa1VFTVo4REhkK2ppb0k2eDZUVjhMS2E4b29KCkx2QWNDWkY5dlEvVHlQYlFBRUF0MGNBOXJFNmxTRExQc3hWTWR5VUtzN2FhYk5mS29RUzdKSEJ1eFVZSkZJcWsKcGUwdGJUK3pOaHBzT2I0LzJYS2VxY0RSdzdudFNBaFV0ck5RZ1diRzV5SG1YQ1JWS1pCQ3NrckMvQjdtME9tQwpsTVRHSUxiU1U2Z2xRY2NUSkZrQVFBV3JkU2FWUjNOK09QTjhXOVZ4YVFLQmdRRHhGMkZCQVN0dHhDa2Q2Q1ArCmgvMzZvNEpWc3h3V3RLU1Z0WFNqYTZ5Zk1WNS9MYXVZdmRsaTZoMVE5QjAwVVdhU0tQYjhNeGgybE94dFNCNTIKbG0vcVBqdGJyY1hHaWJxaVpXcFJ1b0d3a3c5V2JVZDdPQkdvb2pyV29BS2hKVzM4TlFCUlFNYWVaSEFCdzNvUwoyTjVLd0IvbVJXVVB4Nm83SnBPb3JoNlZod0tCZ1FEVjA1TTdzZ1JpRWtEOGFLa05CNEUyVFJTdW9XZ0poRHdVCnFSRk4ycGYxK285TlZDODdoWWNIM0xXak02dHhPdXMxWVgxVXFUSHBhMXp4aWFka2RpRjA3S29FcWh2Y0tNMGUKbkFTWGtGTitiZkdscFhPQ3pKR2JvQlJHT2lzNXoybjJNNWJmTTNuZnpESTJpeEdYUS9wOCszOWN2KzkweFZiQwplaGk2RXFLSkh3S0JnRUw5UGhhejNuOVhmQjFGUFlzaCtsNUVSSmpQZGNTUldSSUlJMnF0Sm4vdFZkWjh1Q3R1CnhSS0kvckJaeEN1ZldxTE9JeUtjaC9XYkY3NmR4V2txRDlyRWcvWExhU0xyYmlKbGo0ODZCWU1zdVp4SUxRNTkKMjlwQmladk5SaTNFbXJUemZTMFdsSm02U3EwU3hiNnE1OGxaYlFPczBKSDc1cjhjenZhVnV3WE5Bb0dBWHVBawo2UXpnNHY4RWRMcWZuOWRmbnM5dXlObDNSeG0wYXRwbGdpem0xazdadk04SXNobGFROFBMbUdGNXhhRUY4a2FTCmpMa1NHMmIyODNsSG04ektwWTNKRm83QUU5ekt2clV0V0c3Q2pVdU5PQm1FZWxuNGxadmV3eFpXVGExWmI5T08KTXZVdE0zN3dITUZ5Q2JNdzlybkUxa3VYblRGZWdLWWFTSjJ5SHJNQ2dZRUF1U2wyeWZ0UWwxUStESjRBV0JIOQpmSElvMGJ6SzFwZkt6Rzl5RHluRkFtS1c5aTNvYVBHZjlYQW5NVFhhaW9iem1sdy9zWWozTmpoeUlVT3p6VDVJCmVmT1d5NWMvRmNERDZweXFGRFhnSUNkSjg2TmwyajFmU0RaaXpvNCtMVXJXNnBMSHNrTVk0L0dJeGwyRWpGYjAKVFhscHZMYlBSOFExUHdvOWR1elRvWFU9Ci0tLS0tRU5EIFBSSVZBVEUgS0VZLS0tLS0K

# Log level: INFO, DEBUG, WARN
logLevel: INFO

# master
# Deploy token the node uses to register its certificate with the master; only valid the first time the node registers
liteDeployToken: dFMdqgbbpPiAwnuqKwuRZMAA5VJ6hfcv
# Address the node uses to connect to the master
# Example: http://kuscia-master.kuscia-master.svc.cluster.local:1080
masterEndpoint: http://kuscia-master.data-develop-operate-dev.svc.cluster.local:1080

# runc or runk
runtime: runp

# Capacity the node can schedule for applications; if unset under runc it is detected from the container's resources, under runk it must be set manually
capacity:
  cpu: 4
  memory: 4Gi
  pods: 500
  storage: 100Gi

# Communication protocol used by the KusciaAPI and the node's external gateway: NOTLS/TLS/MTLS
protocol: NOTLS

# agent image configuration, used when images are stored in a private registry (no configuration needed by default)
image:
  pullPolicy: # use an image registry | use local images
  defaultRegistry: ""
  registries:
    - name: ""
      endpoint: ""
      username: ""
      password: ""

bob's ConfigMap:

# Startup mode
mode: lite
# Node ID
# Example: domainID: bob
domainID: bob
# Node private key, used for inter-node communication authentication (the two parties' certificates are used to generate the identity token for communication) and for issuing certificates to node applications (to harden communication security, kuscia assigns an MTLS certificate to every task engine; whether the engine accesses other modules, including external ones, or other modules access the engine, all traffic goes over MTLS, so a compromised engine cannot be exploited from inside)
# Note: currently the node private key only supports the pkcs#1 format: "BEGIN RSA PRIVATE KEY/END RSA PRIVATE KEY"
# Run "docker run -it --rm secretflow-registry.cn-hangzhou.cr.aliyuncs.com/secretflow/kuscia scripts/deploy/generate_rsa_key.sh" to generate a private key
domainKeyData: LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0tLS0tCk1JSUV2UUlCQURBTkJna3Foa2lHOXcwQkFRRUZBQVNDQktjd2dnU2pBZ0VBQW9JQkFRREpYN1RZNDI4NUg3L1YKTTRkdjNEc2RBVkpzV1ZQa3AzSEVydStsS3pSbnRhdTF6TmExNmx5dVQ1UWRKZTBPNGVKWW9DcmFJRXI3Q1JlRQpCTFRoZmVzaVVZSHlZUG1tVUw3V0ptRmZ6cTdISUFTRFRMWVp1bjZTUGxDT0VCZjhxWFNYOWhzamphSTFsNjBjCktpM1krYkVSYUwwZkJtZHd3MERmaldVQmkwWGtMMHJSVUJrMGU4ZTBZcmU4cFpJSHJQWTd2TVh2bXUwWHB0alQKcVR1dzJqcWp5alJHcVdTL3lBR0t1Zk9PUU1mSEtsd1J5QzlZdlNFeGpWRG1lTkxMMmVkWkdMWnFDd2ovWlJBZwpmVnVvaG4yQlM1TTBuaGQ5KzVvSXJWa29vVnpCQW41OHVBdHBYU2V6TUNRY2ZVVXQrNXBpV25qYm5GM0xMd3hBCnlJK1JKMXBaQWdNQkFBRUNnZ0VBSDkwVy9xS3VQTG03WHY3eVZVN2h3NnNyNFowWTJ6dHJreFdqTWQxdVEyTEoKc3RDZ3dOUStxZzVKZjNzNjBYb0ltTUZ2Um1pSnRNTXhoMkEvUnRibjE5eFIxWXBtdGx4Y2RnSklzaUpBSVozOQpXTkZRbHkyZFRZS3l1R2Z2ZzdsRWk2OFRpRUtuQWhmbittYnFMa1VFTVo4REhkK2ppb0k2eDZUVjhMS2E4b29KCkx2QWNDWkY5dlEvVHlQYlFBRUF0MGNBOXJFNmxTRExQc3hWTWR5VUtzN2FhYk5mS29RUzdKSEJ1eFVZSkZJcWsKcGUwdGJUK3pOaHBzT2I0LzJYS2VxY0RSdzdudFNBaFV0ck5RZ1diRzV5SG1YQ1JWS1pCQ3NrckMvQjdtME9tQwpsTVRHSUxiU1U2Z2xRY2NUSkZrQVFBV3JkU2FWUjNOK09QTjhXOVZ4YVFLQmdRRHhGMkZCQVN0dHhDa2Q2Q1ArCmgvMzZvNEpWc3h3V3RLU1Z0WFNqYTZ5Zk1WNS9MYXVZdmRsaTZoMVE5QjAwVVdhU0tQYjhNeGgybE94dFNCNTIKbG0vcVBqdGJyY1hHaWJxaVpXcFJ1b0d3a3c5V2JVZDdPQkdvb2pyV29BS2hKVzM4TlFCUlFNYWVaSEFCdzNvUwoyTjVLd0IvbVJXVVB4Nm83SnBPb3JoNlZod0tCZ1FEVjA1TTdzZ1JpRWtEOGFLa05CNEUyVFJTdW9XZ0poRHdVCnFSRk4ycGYxK285TlZDODdoWWNIM0xXak02dHhPdXMxWVgxVXFUSHBhMXp4aWFka2RpRjA3S29FcWh2Y0tNMGUKbkFTWGtGTitiZkdscFhPQ3pKR2JvQlJHT2lzNXoybjJNNWJmTTNuZnpESTJpeEdYUS9wOCszOWN2KzkweFZiQwplaGk2RXFLSkh3S0JnRUw5UGhhejNuOVhmQjFGUFlzaCtsNUVSSmpQZGNTUldSSUlJMnF0Sm4vdFZkWjh1Q3R1CnhSS0kvckJaeEN1ZldxTE9JeUtjaC9XYkY3NmR4V2txRDlyRWcvWExhU0xyYmlKbGo0ODZCWU1zdVp4SUxRNTkKMjlwQmladk5SaTNFbXJUemZTMFdsSm02U3EwU3hiNnE1OGxaYlFPczBKSDc1cjhjenZhVnV3WE5Bb0dBWHVBawo2UXpnNHY4RWRMcWZuOWRmbnM5dXlObDNSeG0wYXRwbGdpem0xazdadk04SXNobGFROFBMbUdGNXhhRUY4a2FTCmpMa1NHMmIyODNsSG04ektwWTNKRm83QUU5ekt2clV0V0c3Q2pVdU5PQm1FZWxuNGxadmV3eFpXVGExWmI5T08KTXZVdE0zN3dITUZ5Q2JNdzlybkUxa3VYblRGZWdLWWFTSjJ5SHJNQ2dZRUF1U2wyeWZ0UWwxUStESjRBV0JIOQpmSElvMGJ6SzFwZkt6Rzl5RHluRkFtS1c5aTNvYVBHZjlYQW5NVFhhaW9iem1sdy9zWWozTmpoeUlVT3p6VDVJCmVmT1d5NWMvRmNERDZweXFGRFhnSUNkSjg2TmwyajFmU0RaaXpvNCtMVXJXNnBMSHNrTVk0L0dJeGwyRWpGYjAKVFhscHZMYlBSOFExUHdvOWR1elRvWFU9Ci0tLS0tRU5EIFBSSVZBVEUgS0VZLS0tLS0K

# Log level: INFO, DEBUG, WARN
logLevel: INFO

# master
# Deploy token the node uses to register its certificate with the master; only valid the first time the node registers
liteDeployToken: XEzJjnQqFmQB2zSZlTaRAsZFjpvGkqVF
# Address the node uses to connect to the master
# Example: http://kuscia-master.kuscia-master.svc.cluster.local:1080
masterEndpoint: http://kuscia-master.data-develop-operate-dev.svc.cluster.local:1080

# runc or runk
runtime: runp

# Capacity the node can schedule for applications; if unset under runc it is detected from the container's resources, under runk it must be set manually
capacity:
  cpu: 4
  memory: 4Gi
  pods: 500
  storage: 100Gi

# Communication protocol used by the KusciaAPI and the node's external gateway: NOTLS/TLS/MTLS
protocol: NOTLS

# agent image configuration, used when images are stored in a private registry (no configuration needed by default)
image:
  pullPolicy: # use an image registry | use local images
  defaultRegistry: ""
  registries:
    - name: ""
      endpoint: ""
      username: ""
      password: ""
Meng-xiangkun commented 1 week ago

Communication is also working normally.

wangzul commented 1 week ago

I see your route configuration uses a literal IP, 10.233.74.148. Verify it with curl -kvvv http://10.233.74.148:1080. That said, I recommend reconfiguring the route authorization according to the documentation.
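
For reference, reconfiguring the route per the docs amounts to recreating the ClusterDomainRoute with a stable address rather than a pod IP. A minimal sketch on the master node, where the service name kuscia-lite-bob.data-develop-operate-dev.svc.cluster.local is only an assumption inferred from the masterEndpoint naming in the ConfigMaps above:

kubectl delete cdr alice-bob
kubectl apply -f - <<'EOF'
apiVersion: kuscia.secretflow/v1alpha1
kind: ClusterDomainRoute
metadata:
  name: alice-bob
spec:
  authenticationType: Token
  source: alice
  destination: bob
  endpoint:
    host: kuscia-lite-bob.data-develop-operate-dev.svc.cluster.local   # assumed lite service name
    ports:
    - name: http
      port: 1080
      protocol: HTTP
      isTLS: false
      pathPrefix: /
  tokenConfig:
    rollingUpdatePeriod: 86400
    tokenGenMethod: RSA-GEN
EOF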

Meng-xiangkun commented 1 week ago

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: each job status
2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status task_id: "emzj-ubryppxk-node-3"
state: "Pending"
create_time: "2024-09-12T06:34:28Z"
alias: "emzj-ubryppxk-node-3"
2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status emzj-ubryppxk-node-3 INITIALIZED task_id: "emzj-ubryppxk-node-3"
state: "Pending"
create_time: "2024-09-12T06:34:28Z"
alias: "emzj-ubryppxk-node-3"
2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: sync result ProjectTaskDO(upk=ProjectTaskDO.UPK(projectId=irnyogit, jobId=emzj, taskId=emzj-ubryppxk-node-3), parties=[bob, alice], status=INITIALIZED, errMsg=, graphNodeId=ubryppxk-node-3, graphNode=ProjectGraphNodeDO(upk=ProjectGraphNodeDO.UPK(projectId=irnyogit, graphId=ubryppxk, graphNodeId=ubryppxk-node-3), codeName=data_prep/psi, label=隐私求交, x=-260, y=-100, inputs=[ubryppxk-node-1-output-0, ubryppxk-node-2-output-0], outputs=[ubryppxk-node-3-output-0], nodeDef={attrPaths=[input/receiver_input/key, input/sender_input/key, protocol, sort_result, allow_duplicate_keys, allow_duplicate_keys/no/skip_duplicates_check, fill_value_int, ecdh_curve], attrs=[{is_na=false, ss=[id1]}, {is_na=false, ss=[id2]}, {is_na=false, s=PROTOCOL_RR22}, {b=true, is_na=false}, {is_na=false, s=no}, {is_na=true}, {is_na=true}, {is_na=false, s=CURVE_FOURQ}], domain=data_prep, name=psi, version=0.0.5}))
2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status task_id: "emzj-ubryppxk-node-4"
state: "Pending"
alias: "emzj-ubryppxk-node-4"
2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status emzj-ubryppxk-node-4 INITIALIZED task_id: "emzj-ubryppxk-node-4"
state: "Pending"
alias: "emzj-ubryppxk-node-4"
2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: sync result ProjectTaskDO(upk=ProjectTaskDO.UPK(projectId=irnyogit, jobId=emzj, taskId=emzj-ubryppxk-node-4), parties=[bob, alice], status=INITIALIZED, errMsg=, graphNodeId=ubryppxk-node-4, graphNode=ProjectGraphNodeDO(upk=ProjectGraphNodeDO.UPK(projectId=irnyogit, graphId=ubryppxk, graphNodeId=ubryppxk-node-4), codeName=stats/table_statistics, label=全表统计, x=-260, y=20, inputs=[ubryppxk-node-3-output-0], outputs=[ubryppxk-node-4-output-0], nodeDef={attrPaths=[input/input_data/features], attrs=[{is_na=false, ss=[contact_cellular]}], domain=stats, name=table_statistics, version=0.0.2}))
2024-09-12T14:34:28.170+08:00 INFO 1 --- [lt-executor-190] o.s.s.s.l.JobTaskLogEventListener : *** JobTaskLogEventListener emzj-ubryppxk-node-3 INITIALIZED INITIALIZED
2024-09-12T14:34:28.170+08:00 INFO 1 --- [lt-executor-190] o.s.s.s.l.JobTaskLogEventListener : *** JobTaskLogEventListener emzj-ubryppxk-node-4 INITIALIZED INITIALIZED
2024-09-12T14:34:28.270+08:00 INFO 1 --- [lt-executor-190] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: type: MODIFIED
object {
  job_id: "emzj"
  status {
    state: "Failed"
    create_time: "2024-09-12T06:34:27Z"
    start_time: "2024-09-12T06:34:28Z"
    tasks {
      task_id: "emzj-ubryppxk-node-3"
      state: "Failed"
      err_msg: "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found"
      create_time: "2024-09-12T06:34:28Z"
      start_time: "2024-09-12T06:34:28Z"
      end_time: "2024-09-12T06:34:28Z"
      alias: "emzj-ubryppxk-node-3"
    }
    tasks {
      task_id: "emzj-ubryppxk-node-4"
      state: "Pending"
      alias: "emzj-ubryppxk-node-4"
    }
    stage_status_list {
      domain_id: "alice"
      state: "JobCreateStageSucceeded"
    }
    stage_status_list {
      domain_id: "bob"
      state: "JobCreateStageSucceeded"
    }
    approve_status_list {
      domain_id: "alice"
      state: "JobAccepted"
    }
    approve_status_list {
      domain_id: "bob"
      state: "JobAccepted"
    }
  }
}
2024-09-12T14:34:28.271+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : starter jobEvent ... type: MODIFIED
object {
  job_id: "emzj"
  status {
    state: "Failed"
    create_time: "2024-09-12T06:34:27Z"
    start_time: "2024-09-12T06:34:28Z"
    tasks {
      task_id: "emzj-ubryppxk-node-3"
      state: "Failed"
      err_msg: "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found"
      create_time: "2024-09-12T06:34:28Z"
      start_time: "2024-09-12T06:34:28Z"
      end_time: "2024-09-12T06:34:28Z"
      alias: "emzj-ubryppxk-node-3"
    }
    tasks {
      task_id: "emzj-ubryppxk-node-4"
      state: "Pending"
      alias: "emzj-ubryppxk-node-4"
    }
    stage_status_list {
      domain_id: "alice"
      state: "JobCreateStageSucceeded"
    }
    stage_status_list {
      domain_id: "bob"
      state: "JobCreateStageSucceeded"
    }
    approve_status_list {
      domain_id: "alice"
      state: "JobAccepted"
    }
    approve_status_list {
      domain_id: "bob"
      state: "JobAccepted"
    }
  }
}
2024-09-12T14:34:28.271+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: jobId=emzj, jobState=Failed, task=[taskId=emzj-ubryppxk-node-3,alias=emzj-ubryppxk-node-3,state=Failed|taskId=emzj-ubryppxk-node-4,alias=emzj-ubryppxk-node-4,state=Pending], endTime=
2024-09-12T14:34:28.282+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: update job: it={
  "type": "MODIFIED",
  "object": {
    "job_id": "emzj",
    "status": {
      "state": "Failed",
      "create_time": "2024-09-12T06:34:27Z",
      "start_time": "2024-09-12T06:34:28Z",
      "tasks": [{
        "task_id": "emzj-ubryppxk-node-3",
        "state": "Failed",
        "err_msg": "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found",
        "create_time": "2024-09-12T06:34:28Z",
        "start_time": "2024-09-12T06:34:28Z",
        "end_time": "2024-09-12T06:34:28Z",
        "alias": "emzj-ubryppxk-node-3"
      }, {
        "task_id": "emzj-ubryppxk-node-4",
        "state": "Pending",
        "alias": "emzj-ubryppxk-node-4"
      }],
      "stage_status_list": [{
        "domain_id": "alice",
        "state": "JobCreateStageSucceeded"
      }, {
        "domain_id": "bob",
        "state": "JobCreateStageSucceeded"
      }],
      "approve_status_list": [{
        "domain_id": "alice",
        "state": "JobAccepted"
      }, {
        "domain_id": "bob",
        "state": "JobAccepted"
      }]
    }
  }
}

How do I integrate secretflow into this setup?

wangzul commented 1 week ago
2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: each job status

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status task_id: "emzj-ubryppxk-node-3"

state: "Pending"

create_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status emzj-ubryppxk-node-3 INITIALIZED task_id: "emzj-ubryppxk-node-3"

state: "Pending"

create_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: sync result ProjectTaskDO(upk=ProjectTaskDO.UPK(projectId=irnyogit, jobId=emzj, taskId=emzj-ubryppxk-node-3), parties=[bob, alice], status=INITIALIZED, errMsg=, graphNodeId=ubryppxk-node-3, graphNode=ProjectGraphNodeDO(upk=ProjectGraphNodeDO.UPK(projectId=irnyogit, graphId=ubryppxk, graphNodeId=ubryppxk-node-3), codeName=data_prep/psi, label=隐私求交, x=-260, y=-100, inputs=[ubryppxk-node-1-output-0, ubryppxk-node-2-output-0], outputs=[ubryppxk-node-3-output-0], nodeDef={attrPaths=[input/receiver_input/key, input/sender_input/key, protocol, sort_result, allow_duplicate_keys, allow_duplicate_keys/no/skip_duplicates_check, fill_value_int, ecdh_curve], attrs=[{is_na=false, ss=[id1]}, {is_na=false, ss=[id2]}, {is_na=false, s=PROTOCOL_RR22}, {b=true, is_na=false}, {is_na=false, s=no}, {is_na=true}, {is_na=true}, {is_na=false, s=CURVE_FOURQ}], domain=data_prep, name=psi, version=0.0.5}))

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status emzj-ubryppxk-node-4 INITIALIZED task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: sync result ProjectTaskDO(upk=ProjectTaskDO.UPK(projectId=irnyogit, jobId=emzj, taskId=emzj-ubryppxk-node-4), parties=[bob, alice], status=INITIALIZED, errMsg=, graphNodeId=ubryppxk-node-4, graphNode=ProjectGraphNodeDO(upk=ProjectGraphNodeDO.UPK(projectId=irnyogit, graphId=ubryppxk, graphNodeId=ubryppxk-node-4), codeName=stats/table_statistics, label=全表统计, x=-260, y=20, inputs=[ubryppxk-node-3-output-0], outputs=[ubryppxk-node-4-output-0], nodeDef={attrPaths=[input/input_data/features], attrs=[{is_na=false, ss=[contact_cellular]}], domain=stats, name=table_statistics, version=0.0.2}))

2024-09-12T14:34:28.170+08:00 INFO 1 --- [lt-executor-190] o.s.s.s.l.JobTaskLogEventListener : *** JobTaskLogEventListener emzj-ubryppxk-node-3 INITIALIZED INITIALIZED

2024-09-12T14:34:28.170+08:00 INFO 1 --- [lt-executor-190] o.s.s.s.l.JobTaskLogEventListener : *** JobTaskLogEventListener emzj-ubryppxk-node-4 INITIALIZED INITIALIZED

2024-09-12T14:34:28.270+08:00 INFO 1 --- [lt-executor-190] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: type: MODIFIED

object {

job_id: "emzj"

status {

state: "Failed"

create_time: "2024-09-12T06:34:27Z"

start_time: "2024-09-12T06:34:28Z"

tasks {

task_id: "emzj-ubryppxk-node-3"

state: "Failed"

err_msg: "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found"

create_time: "2024-09-12T06:34:28Z"

start_time: "2024-09-12T06:34:28Z"

end_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

}

tasks {

task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

}

stage_status_list {

domain_id: "alice"

state: "JobCreateStageSucceeded"

}

stage_status_list {

domain_id: "bob"

state: "JobCreateStageSucceeded"

}

approve_status_list {

domain_id: "alice"

state: "JobAccepted"

}

approve_status_list {

domain_id: "bob"

state: "JobAccepted"

}

}

}

2024-09-12T14:34:28.271+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : starter jobEvent ... type: MODIFIED

object {

job_id: "emzj"

status {

state: "Failed"

create_time: "2024-09-12T06:34:27Z"

start_time: "2024-09-12T06:34:28Z"

tasks {

task_id: "emzj-ubryppxk-node-3"

state: "Failed"

err_msg: "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found"

create_time: "2024-09-12T06:34:28Z"

start_time: "2024-09-12T06:34:28Z"

end_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

}

tasks {

task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

}

stage_status_list {

domain_id: "alice"

state: "JobCreateStageSucceeded"

}

stage_status_list {

domain_id: "bob"

state: "JobCreateStageSucceeded"

}

approve_status_list {

domain_id: "alice"

state: "JobAccepted"

}

approve_status_list {

domain_id: "bob"

state: "JobAccepted"

}

}

}

2024-09-12T14:34:28.271+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: jobId=emzj, jobState=Failed, task=[taskId=emzj-ubryppxk-node-3,alias=emzj-ubryppxk-node-3,state=Failed|taskId=emzj-ubryppxk-node-4,alias=emzj-ubryppxk-node-4,state=Pending], endTime=

2024-09-12T14:34:28.282+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: update job: it={

"type": "MODIFIED",

"object": {

"job_id": "emzj",

"status": {

"state": "Failed",

"create_time": "2024-09-12T06:34:27Z",

"start_time": "2024-09-12T06:34:28Z",

"tasks": [{

"task_id": "emzj-ubryppxk-node-3",

"state": "Failed",

"err_msg": "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found",

"create_time": "2024-09-12T06:34:28Z",

"start_time": "2024-09-12T06:34:28Z",

"end_time": "2024-09-12T06:34:28Z",

"alias": "emzj-ubryppxk-node-3"

}, {

"task_id": "emzj-ubryppxk-node-4",

"state": "Pending",

"alias": "emzj-ubryppxk-node-4"

}],

"stage_status_list": [{

"domain_id": "alice",

"state": "JobCreateStageSucceeded"

}, {

"domain_id": "bob",

"state": "JobCreateStageSucceeded"

}],

"approve_status_list": [{

"domain_id": "alice",

"state": "JobAccepted"

}, {

"domain_id": "bob",

"state": "JobAccepted"

}]

}

}

}

请问怎么把secretflow集成进来啊

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: each job status

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status task_id: "emzj-ubryppxk-node-3"

state: "Pending"

create_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status emzj-ubryppxk-node-3 INITIALIZED task_id: "emzj-ubryppxk-node-3"

state: "Pending"

create_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: sync result ProjectTaskDO(upk=ProjectTaskDO.UPK(projectId=irnyogit, jobId=emzj, taskId=emzj-ubryppxk-node-3), parties=[bob, alice], status=INITIALIZED, errMsg=, graphNodeId=ubryppxk-node-3, graphNode=ProjectGraphNodeDO(upk=ProjectGraphNodeDO.UPK(projectId=irnyogit, graphId=ubryppxk, graphNodeId=ubryppxk-node-3), codeName=data_prep/psi, label=隐私求交, x=-260, y=-100, inputs=[ubryppxk-node-1-output-0, ubryppxk-node-2-output-0], outputs=[ubryppxk-node-3-output-0], nodeDef={attrPaths=[input/receiver_input/key, input/sender_input/key, protocol, sort_result, allow_duplicate_keys, allow_duplicate_keys/no/skip_duplicates_check, fill_value_int, ecdh_curve], attrs=[{is_na=false, ss=[id1]}, {is_na=false, ss=[id2]}, {is_na=false, s=PROTOCOL_RR22}, {b=true, is_na=false}, {is_na=false, s=no}, {is_na=true}, {is_na=true}, {is_na=false, s=CURVE_FOURQ}], domain=data_prep, name=psi, version=0.0.5}))

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: kuscia status emzj-ubryppxk-node-4 INITIALIZED task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

2024-09-12T14:34:28.166+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: sync result ProjectTaskDO(upk=ProjectTaskDO.UPK(projectId=irnyogit, jobId=emzj, taskId=emzj-ubryppxk-node-4), parties=[bob, alice], status=INITIALIZED, errMsg=, graphNodeId=ubryppxk-node-4, graphNode=ProjectGraphNodeDO(upk=ProjectGraphNodeDO.UPK(projectId=irnyogit, graphId=ubryppxk, graphNodeId=ubryppxk-node-4), codeName=stats/table_statistics, label=全表统计, x=-260, y=20, inputs=[ubryppxk-node-3-output-0], outputs=[ubryppxk-node-4-output-0], nodeDef={attrPaths=[input/input_data/features], attrs=[{is_na=false, ss=[contact_cellular]}], domain=stats, name=table_statistics, version=0.0.2}))

2024-09-12T14:34:28.170+08:00 INFO 1 --- [lt-executor-190] o.s.s.s.l.JobTaskLogEventListener : *** JobTaskLogEventListener emzj-ubryppxk-node-3 INITIALIZED INITIALIZED

2024-09-12T14:34:28.170+08:00 INFO 1 --- [lt-executor-190] o.s.s.s.l.JobTaskLogEventListener : *** JobTaskLogEventListener emzj-ubryppxk-node-4 INITIALIZED INITIALIZED

2024-09-12T14:34:28.270+08:00 INFO 1 --- [lt-executor-190] o.s.s.k.v.i.KusciaGrpcLoggingInterceptor : [kuscia] kuscia-system Response: type: MODIFIED

object {

job_id: "emzj"

status {

state: "Failed"

create_time: "2024-09-12T06:34:27Z"

start_time: "2024-09-12T06:34:28Z"

tasks {

task_id: "emzj-ubryppxk-node-3"

state: "Failed"

err_msg: "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found"

create_time: "2024-09-12T06:34:28Z"

start_time: "2024-09-12T06:34:28Z"

end_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

}

tasks {

task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

}

stage_status_list {

domain_id: "alice"

state: "JobCreateStageSucceeded"

}

stage_status_list {

domain_id: "bob"

state: "JobCreateStageSucceeded"

}

approve_status_list {

domain_id: "alice"

state: "JobAccepted"

}

approve_status_list {

domain_id: "bob"

state: "JobAccepted"

}

}

}

2024-09-12T14:34:28.271+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : starter jobEvent ... type: MODIFIED

object {

job_id: "emzj"

status {

state: "Failed"

create_time: "2024-09-12T06:34:27Z"

start_time: "2024-09-12T06:34:28Z"

tasks {

task_id: "emzj-ubryppxk-node-3"

state: "Failed"

err_msg: "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found"

create_time: "2024-09-12T06:34:28Z"

start_time: "2024-09-12T06:34:28Z"

end_time: "2024-09-12T06:34:28Z"

alias: "emzj-ubryppxk-node-3"

}

tasks {

task_id: "emzj-ubryppxk-node-4"

state: "Pending"

alias: "emzj-ubryppxk-node-4"

}

stage_status_list {

domain_id: "alice"

state: "JobCreateStageSucceeded"

}

stage_status_list {

domain_id: "bob"

state: "JobCreateStageSucceeded"

}

approve_status_list {

domain_id: "alice"

state: "JobAccepted"

}

approve_status_list {

domain_id: "bob"

state: "JobAccepted"

}

}

}

2024-09-12T14:34:28.271+08:00 INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager : watched jobEvent: jobId=emzj, jobState=Failed, task=[taskId=emzj-ubryppxk-node-3,alias=emzj-ubryppxk-node-3,state=Failed|taskId=emzj-ubryppxk-node-4,alias=emzj-ubryppxk-node-4,state=Pending], endTime=

2024-09-12T14:34:28.282+08:00  INFO 1 --- [lt-executor-190] o.s.s.m.integration.job.JobManager       : watched jobEvent: update job: it={
  "type": "MODIFIED",
  "object": {
    "job_id": "emzj",
    "status": {
      "state": "Failed",
      "create_time": "2024-09-12T06:34:27Z",
      "start_time": "2024-09-12T06:34:28Z",
      "tasks": [{
        "task_id": "emzj-ubryppxk-node-3",
        "state": "Failed",
        "err_msg": "KusciaTask failed after 3x retry, last error: failed to build domain bob kit info, failed to get appImage \"secretflow-image\" from cache, appimage.kuscia.secretflow \"secretflow-image\" not found",
        "create_time": "2024-09-12T06:34:28Z",
        "start_time": "2024-09-12T06:34:28Z",
        "end_time": "2024-09-12T06:34:28Z",
        "alias": "emzj-ubryppxk-node-3"
      }, {
        "task_id": "emzj-ubryppxk-node-4",
        "state": "Pending",
        "alias": "emzj-ubryppxk-node-4"
      }],
      "stage_status_list": [{
        "domain_id": "alice",
        "state": "JobCreateStageSucceeded"
      }, {
        "domain_id": "bob",
        "state": "JobCreateStageSucceeded"
      }],
      "approve_status_list": [{
        "domain_id": "alice",
        "state": "JobAccepted"
      }, {
        "domain_id": "bob",
        "state": "JobAccepted"
      }]
    }
  }
}

How do I integrate secretflow into this deployment?
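For reference, the repeated err_msg above (appimage.kuscia.secretflow "secretflow-image" not found) means the kuscia master has no AppImage custom resource named secretflow-image registered, so kuscia has no template describing which secretflow engine container to launch for the task. A minimal check, assuming an all-in-one deployment whose master container is named `${USER}-kuscia-master` (adjust to your actual container name):

```shell
# List the AppImage custom resources known to the kuscia master.
# AppImage is a kuscia CRD, so plain kubectl inside the master container works.
docker exec -it ${USER}-kuscia-master kubectl get appimage

# If "secretflow-image" is missing from the output, every task that
# references it will fail exactly as in the log above.
```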

When deploying with runp, did you follow the instructions below? [screenshot]

Meng-xiangkun commented 1 week ago

When running a private set intersection (PSI) job with the kuscia-secretflow image, I hit this error:

Failed to process object: error handling "dppm-qvxgwzap-node-35", failed to process kusciaTask "dppm-qvxgwzap-node-35", failed to build domain bob kit info, failed to get appImage "secretflow-image" from cache, appimage.kuscia.secretflow "secretflow-image" not found, retry

followed by:

Failed to update kuscia job "dppm" status, Operation cannot be fulfilled on kusciajobs.kuscia.secretflow "dppm": the object has been modified; please apply your changes to the latest version and try again

[screenshot]

2024-09-12 18:30:34.303 INFO resources/kusciajob.go:116 Start updating kuscia job "dppm" status
2024-09-12 18:30:34.317 INFO resources/kusciajob.go:118 Finish updating kuscia job "dppm" status
2024-09-12 18:30:34.317 INFO kusciajob/controller.go:298 Finished syncing KusciaJob "dppm" (13.420693ms)
2024-09-12 18:30:34.317 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (13.470899ms)
2024-09-12 18:30:34.317 INFO resources/kusciajob.go:82 update kuscia job dppm
2024-09-12 18:30:34.329 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (12.672843ms)
2024-09-12 18:30:34.330 INFO resources/kusciajob.go:116 Start updating kuscia job "dppm" status
2024-09-12 18:30:34.343 INFO resources/kusciajob.go:118 Finish updating kuscia job "dppm" status
2024-09-12 18:30:34.343 INFO kusciajob/controller.go:298 Finished syncing KusciaJob "dppm" (13.248207ms)
2024-09-12 18:30:34.343 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (13.29884ms)
2024-09-12 18:30:34.345 INFO handler/job_scheduler.go:323 Create kuscia tasks: dppm-qvxgwzap-node-35
2024-09-12 18:30:34.357 INFO resources/kusciajob.go:116 Start updating kuscia job "dppm" status
2024-09-12 18:30:34.369 WARN kusciatask/controller.go:424 Error handling "dppm-qvxgwzap-node-35", re-queuing
2024-09-12 18:30:34.369 ERROR kusciatask/controller.go:435 Failed to process object: error handling "dppm-qvxgwzap-node-35", failed to process kusciaTask "dppm-qvxgwzap-node-35", failed to build domain bob kit info, failed to get appImage "secretflow-image" from cache, appimage.kuscia.secretflow "secretflow-image" not found, retry
2024-09-12 18:30:34.370 INFO resources/kusciajob.go:118 Finish updating kuscia job "dppm" status
2024-09-12 18:30:34.370 INFO kusciajob/controller.go:298 Finished syncing KusciaJob "dppm" (25.113735ms)
2024-09-12 18:30:34.370 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (25.15742ms)
2024-09-12 18:30:34.370 INFO handler/job_scheduler.go:661 jobStatusPhaseFrom readyTasks={}, tasks={{taskId=dppm-qvxgwzap-node-35, dependencies=[], tolerable=false, phase=}}, kusciaJobId=dppm
2024-09-12 18:30:34.370 INFO resources/kusciajob.go:116 Start updating kuscia job "dppm" status
2024-09-12 18:30:34.383 WARN kusciatask/controller.go:424 Error handling "dppm-qvxgwzap-node-35", re-queuing
2024-09-12 18:30:34.383 ERROR kusciatask/controller.go:435 Failed to process object: error handling "dppm-qvxgwzap-node-35", failed to process kusciaTask "dppm-qvxgwzap-node-35", failed to build domain bob kit info, failed to get appImage "secretflow-image" from cache, appimage.kuscia.secretflow "secretflow-image" not found, retry
2024-09-12 18:30:34.385 INFO resources/kusciajob.go:118 Finish updating kuscia job "dppm" status
2024-09-12 18:30:34.386 INFO kusciajob/controller.go:298 Finished syncing KusciaJob "dppm" (15.795756ms)
2024-09-12 18:30:34.386 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (15.879731ms)
2024-09-12 18:30:34.388 INFO handler/job_scheduler.go:661 jobStatusPhaseFrom readyTasks={}, tasks={{taskId=dppm-qvxgwzap-node-35, dependencies=[], tolerable=false, phase=}}, kusciaJobId=dppm
2024-09-12 18:30:34.388 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (488.279µs)
2024-09-12 18:30:34.399 WARN kusciatask/controller.go:424 Error handling "dppm-qvxgwzap-node-35", re-queuing
2024-09-12 18:30:34.399 ERROR kusciatask/controller.go:435 Failed to process object: error handling "dppm-qvxgwzap-node-35", failed to process kusciaTask "dppm-qvxgwzap-node-35", failed to build domain bob kit info, failed to get appImage "secretflow-image" from cache, appimage.kuscia.secretflow "secretflow-image" not found, retry
2024-09-12 18:30:34.423 WARN kusciatask/controller.go:424 Error handling "dppm-qvxgwzap-node-35", re-queuing
2024-09-12 18:30:34.424 ERROR kusciatask/controller.go:435 Failed to process object: error handling "dppm-qvxgwzap-node-35", failed to process kusciaTask "dppm-qvxgwzap-node-35", failed to build domain bob kit info, failed to get appImage "secretflow-image" from cache, appimage.kuscia.secretflow "secretflow-image" not found, retry
2024-09-12 18:30:34.472 INFO resources/kusciatask.go:69 Start updating kuscia task "dppm-qvxgwzap-node-35" status
2024-09-12 18:30:34.488 INFO resources/kusciatask.go:71 Finish updating kuscia task "dppm-qvxgwzap-node-35" status
2024-09-12 18:30:34.488 INFO kusciatask/controller.go:521 Finished syncing kusciatask "dppm-qvxgwzap-node-35" (24.193535ms)
2024-09-12 18:30:34.490 INFO handler/job_scheduler.go:661 jobStatusPhaseFrom readyTasks={}, tasks={{taskId=dppm-qvxgwzap-node-35, dependencies=[], tolerable=false, phase=Failed}}, kusciaJobId=dppm
2024-09-12 18:30:34.490 INFO handler/job_scheduler.go:679 jobStatusPhaseFrom failed readyTasks={}, tasks={{taskId=dppm-qvxgwzap-node-35, dependencies=[], tolerable=false, phase=Failed}}, kusciaJobId=dppm
2024-09-12 18:30:34.491 WARN handler/failed_handler.go:62 Get task resource group dppm-qvxgwzap-node-35 failed, skip setting its status to failed, taskresourcegroup.kuscia.secretflow "dppm-qvxgwzap-node-35" not found
2024-09-12 18:30:34.491 INFO resources/kusciajob.go:116 Start updating kuscia job "dppm" status
2024-09-12 18:30:34.491 INFO resources/kusciatask.go:69 Start updating kuscia task "dppm-qvxgwzap-node-35" status
2024-09-12 18:30:34.505 INFO resources/kusciajob.go:118 Finish updating kuscia job "dppm" status
2024-09-12 18:30:34.505 INFO kusciajob/controller.go:298 Finished syncing KusciaJob "dppm" (14.950352ms)
2024-09-12 18:30:34.505 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (14.972553ms)
2024-09-12 18:30:34.510 INFO resources/kusciajob.go:116 Start updating kuscia job "dppm" status
2024-09-12 18:30:34.510 INFO resources/kusciatask.go:71 Finish updating kuscia task "dppm-qvxgwzap-node-35" status
2024-09-12 18:30:34.510 INFO kusciatask/controller.go:521 Finished syncing kusciatask "dppm-qvxgwzap-node-35" (19.491329ms)
2024-09-12 18:30:34.510 INFO kusciatask/controller.go:489 KusciaTask "dppm-qvxgwzap-node-35" was finished, skipping
2024-09-12 18:30:34.523 INFO resources/kusciajob.go:118 Finish updating kuscia job "dppm" status
2024-09-12 18:30:34.523 INFO kusciajob/controller.go:298 Finished syncing KusciaJob "dppm" (13.33302ms)
2024-09-12 18:30:34.523 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (13.376915ms)
2024-09-12 18:30:34.523 INFO resources/kusciajob.go:116 Start updating kuscia job "dppm" status
2024-09-12 18:30:34.534 WARN resources/kusciajob.go:122 Failed to update kuscia job "dppm" status, Operation cannot be fulfilled on kusciajobs.kuscia.secretflow "dppm": the object has been modified; please apply your changes to the latest version and try again
2024-09-12 18:30:34.542 INFO resources/kusciajob.go:116 Start updating kuscia job "dppm" status
2024-09-12 18:30:34.554 INFO resources/kusciajob.go:118 Finish updating kuscia job "dppm" status
2024-09-12 18:30:34.555 INFO kusciajob/controller.go:298 Finished syncing KusciaJob "dppm" (31.853225ms)
2024-09-12 18:30:34.555 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (31.901265ms)
2024-09-12 18:30:34.555 INFO handler/job_scheduler.go:700 KusciaJob dppm was finished, skipping
2024-09-12 18:30:34.555 INFO kusciajob/controller.go:266 KusciaJob "dppm" should not reconcile again, skipping
2024-09-12 18:30:34.555 INFO queue/queue.go:124 Finish processing item: queue id[kuscia-job-controller], key[dppm] (111.519µs)
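
Both errors point at the same root cause as the earlier job: no AppImage named secretflow-image is registered on the kuscia master, so tasks scheduled to bob (and alice) cannot resolve which engine container to run. The "Operation cannot be fulfilled ... the object has been modified" warning is just a Kubernetes optimistic-concurrency conflict on retry, not a separate failure. Below is a minimal sketch of registering such an AppImage from inside the master container. The apiVersion and kind follow the kuscia AppImage CRD, but the image name, tag, entry command, and templates here are assumptions for illustration; take the real spec from the secretflow-image template shipped with the kuscia/secretpad release you deployed.

```shell
# Sketch only: register a hypothetical AppImage named "secretflow-image".
# Run inside the kuscia master container; every spec value below is a
# placeholder to replace with the official secretflow-image template.
kubectl apply -f - <<'EOF'
apiVersion: kuscia.secretflow/v1alpha1
kind: AppImage
metadata:
  name: secretflow-image                 # must match the name tasks reference
spec:
  configTemplates:
    task-config.conf: |
      {
        "task_id": "{{.TASK_ID}}",
        "task_input_config": "{{.TASK_INPUT_CONFIG}}",
        "task_input_cluster_def": "{{.TASK_CLUSTER_DEFINE}}",
        "allocated_ports": "{{.ALLOCATED_PORTS}}"
      }
  deployTemplates:
    - name: secretflow
      replicas: 1
      spec:
        containers:
          - name: secretflow
            command:
              - sh
              - -c
              - python -m secretflow.kuscia.entry ./kuscia/task-config.conf   # assumed entrypoint
            configVolumeMounts:
              - mountPath: /work/kuscia/task-config.conf
                subPath: task-config.conf
            workingDir: /work
        restartPolicy: Never
  image:
    name: secretflow/secretflow-lite-anolis8   # assumed engine image
    tag: 1.4.0b0                               # match your secretflow version
EOF
```

After applying, `kubectl get appimage` should list secretflow-image, and re-running the job should get past the "not found" error. Note that with the runp runtime the engine runs as a process inside the kuscia container itself, so that container must actually contain the secretflow runtime (hence the kuscia-secretflow image); registering the AppImage alone is not sufficient.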