secretflow / kuscia

Kuscia (Kubernetes-based Secure Collaborative InfrA) is a K8s-based privacy-preserving computing task orchestration framework.
https://www.secretflow.org.cn/docs/kuscia/latest/zh-Hans
Apache License 2.0

Domain created, but domainroute creation failed #446

Closed ruhengChen closed 2 weeks ago

ruhengChen commented 1 month ago

Issue Type

Running

Search for existing issues similar to yours

Yes

OS Platform and Distribution

centos

Kuscia Version

0.9.0b0

Deployment

docker

deployment Version

18.06.1-ce

App Running type

secretflow

App Running version

1.7.0b0

Configuration file used to run kuscia.

mode: autonomy
domainID: alice
domainKeyData: ...
logLevel: INFO
runtime: runc
runk:
  namespace: ""
  dnsServers: []
  kubeconfigFile: ""
capacity:
  cpu: ""
  memory: ""
  pods: ""
  storage: ""
reservedResources:
  cpu: ""
  memory: ""
image:
  pullPolicy: ""
  defaultRegistry: ""
  registries: []
datastoreEndpoint: ""

What happened and what you expected to happen.

The domains were created, but the domainroutes were not.

scripts/deploy/add_domain.sh bob p2p
scripts/deploy/add_domain.sh clion p2p

[root@root-kuscia-autonomy-alice-alice kuscia]# kubectl get domain
NAME    AGE
bob     28m
clion   24m
alice   6d18h

[root@root-kuscia-autonomy-alice-alice kuscia]# kubectl get cdr
NAME          SOURCE   DESTINATION   HOST   AUTHENTICATION   READY
bob-alice     bob      alice                Token
clion-alice   clion    alice                Token

[root@root-kuscia-autonomy-alice-alice kuscia]# kubectl get domainroute -A
No resources found
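
The mismatch above (cdr entries present, no domainroute) can be checked mechanically. The following is a minimal sketch, not part of Kuscia: `missing_domainroutes` is a hypothetical helper that takes the text of `kubectl get cdr` and `kubectl get domainroute -A` and prints every cdr name that has no domainroute of the same name. It assumes that a DomainRoute carries the same name as its ClusterDomainRoute, which matches the listings in this thread but is not confirmed by the source.

```shell
#!/bin/sh
# Hypothetical helper: print cdr names that have no same-named domainroute.
# $1: text output of `kubectl get cdr`
# $2: text output of `kubectl get domainroute -A` (may be empty)
missing_domainroutes() {
  # NAME is the first column of the cdr listing; skip the header row.
  cdr_names=$(printf '%s\n' "$1" | awk 'NR > 1 && NF > 0 { print $1 }')
  # With -A, NAME is the second column. Rows have at least 5 fields even
  # when HOST is blank, which also filters out "No resources found".
  dr_names=$(printf '%s\n' "$2" | awk '$1 != "NAMESPACE" && NF >= 5 { print $2 }')
  for name in $cdr_names; do
    printf '%s\n' "$dr_names" | grep -qxF -- "$name" || printf '%s\n' "$name"
  done
}
```

Possible usage inside the container: `missing_domainroutes "$(kubectl get cdr)" "$(kubectl get domainroute -A 2>/dev/null)"`. For the alice listing above it would print `bob-alice` and `clion-alice`.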

### Kuscia log output.

```shell
2024-10-24 09:44:10.096 INFO domain/authorization_resource.go:197 Create clusterDomainRoute [bob-alice] success
2024-10-24 09:44:10.097 INFO domain/authorization_resource.go:116 Domain [bob] auth init completed
2024-10-24 09:44:10.101 ERROR domain/authorization_resource.go:126 Update domain [bob] auth label error: Operation cannot be fulfilled on domains.kuscia.secretflow "bob": the object has been modified; please apply your changes to the latest version and try again
2024-10-24 09:44:10.101 WARN domain/controller.go:374 update domain bob auth failed: Operation cannot be fulfilled on domains.kuscia.secretflow "bob": the object has been modified; please apply your changes to the latest version and try again
2024-10-24 09:44:10.101 INFO queue/queue.go:109 Re-syncing: queue id[domain-controller], retry:[0] key[bob]: "Operation cannot be fulfilled on domains.kuscia.secretflow \"bob\": the object has been modified; please apply your changes to the latest version and try again", re-queuing (32.976613ms)
2024-10-24 09:48:24.406 ERROR domain/authorization_resource.go:126 Update domain [clion] auth label error: Operation cannot be fulfilled on domains.kuscia.secretflow "clion": the object has been modified; please apply your changes to the latest version and try again
2024-10-24 09:48:24.406 WARN domain/controller.go:374 update domain clion auth failed: Operation cannot be fulfilled on domains.kuscia.secretflow "clion": the object has been modified; please apply your changes to the latest version and try again
2024-10-24 09:48:24.406 INFO queue/queue.go:109 Re-syncing: queue id[domain-controller], retry:[0] key[clion]: "Operation cannot be fulfilled on domains.kuscia.secretflow \"clion\": the object has been modified; please apply your changes to the latest version and try again", re-queuing (33.621903ms)
```

zimu-yuxi commented 1 month ago

How did you set up the routes?

ruhengChen commented 1 month ago

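With these scripts:

scripts/deploy/add_domain.sh bob p2p
scripts/deploy/add_domain.sh clion p2p

On the other nodes, creating a domain automatically creates a one-way route. For example, creating alice on the bob node creates the alice-bob route.

But now, on the alice node, creating the domains did not produce the bob-alice and clion-alice routes.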

ruhengChen commented 1 month ago

The certificates match.

[root@root-kuscia-autonomy-alice-alice kuscia]# cat var/certs/bob.domain.crt
-----BEGIN CERTIFICATE-----
MIIC+jCCAeKgAwIBAgIBATANBgkqhkiG9w0BAQsFADAOMQwwCgYDVQQDEwNib2Iw
IBcNMjQxMDE3MDQxOTUzWhgPMjA3NDEwMTcwNDE5NTNaMA4xDDAKBgNVBAMTA2Jv
YjCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAKWqccpyhO81hDtxGVpO
n1G6htqXaOSlAv8a+CLFdOqRjnvsG9jVQWF4w1zP3naB6rjzM3uEE3p8pcO7lDzj
7lV/Y2Pj7iuRhZegx6kEV2FYbCgoA0IZ+i8f/6Zur38odOKftsxxK4wdrA4x5+24
Jr1Z0YRqpTS65IF+pBsvseSlxnaiJNgLekObnMB509TSnL1MOnObAijlV1kOsg1v
BgQPJtELbFa4fbrnBBL9TUUZEC9Y/h1MJSxz6hthdP5qi/ZQ8RkBuWq540DFEBLY
ugBYx9gu7QUtM9SJxilua/QLmaL6TdXG4f+kwIJNrNRgCVDQ6nJdCZUZdDjIcUMT
QPcCAwEAAaNhMF8wDgYDVR0PAQH/BAQDAgKEMB0GA1UdJQQWMBQGCCsGAQUFBwMC
BggrBgEFBQcDATAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBRy/zv3IzBixGKa
a6xvPwQMHwlsezANBgkqhkiG9w0BAQsFAAOCAQEAZ8+1pqP3rOgl7jznQuqDMhAe
vqN/A2qcNBe/g8+sfEO0CyjEtpUcZN8APrJJVpFo2JM+BcmQeOPKyqS3tYWA1KOX
yX4ouNagYzdelxN5Ut6YBjC1O6XjOj3i69rwsb9oK7uZlawW+8SXZXkpUuL14l96
wedDzjciC/nUkqlphGnHkm7n3s0UHUo7XtvZrXhtSbhS5UqFAGRM7ScXATjVl5P9
zrau2300EVvv0u72hHrphst4afDUqpUtpjdX9fJQG8CBaY/aLpANcbz6Cm8FpwjF
Ch4QtfPIwvbkGDlJDuKZzJEuB0dkSpvpO0eIjLPrPP+1tgVw0JOObWCaTGW8Dw==
-----END CERTIFICATE-----

[root@root-kuscia-autonomy-bob-bob kuscia]# cat var/certs/domain.crt
-----BEGIN CERTIFICATE-----
MIIC+jCCAeKgAwIBAgIBATANBgkqhkiG9w0BAQsFADAOMQwwCgYDVQQDEwNib2Iw
IBcNMjQxMDE3MDQxOTUzWhgPMjA3NDEwMTcwNDE5NTNaMA4xDDAKBgNVBAMTA2Jv
YjCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAKWqccpyhO81hDtxGVpO
n1G6htqXaOSlAv8a+CLFdOqRjnvsG9jVQWF4w1zP3naB6rjzM3uEE3p8pcO7lDzj
7lV/Y2Pj7iuRhZegx6kEV2FYbCgoA0IZ+i8f/6Zur38odOKftsxxK4wdrA4x5+24
Jr1Z0YRqpTS65IF+pBsvseSlxnaiJNgLekObnMB509TSnL1MOnObAijlV1kOsg1v
BgQPJtELbFa4fbrnBBL9TUUZEC9Y/h1MJSxz6hthdP5qi/ZQ8RkBuWq540DFEBLY
ugBYx9gu7QUtM9SJxilua/QLmaL6TdXG4f+kwIJNrNRgCVDQ6nJdCZUZdDjIcUMT
QPcCAwEAAaNhMF8wDgYDVR0PAQH/BAQDAgKEMB0GA1UdJQQWMBQGCCsGAQUFBwMC
BggrBgEFBQcDATAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBRy/zv3IzBixGKa
a6xvPwQMHwlsezANBgkqhkiG9w0BAQsFAAOCAQEAZ8+1pqP3rOgl7jznQuqDMhAe
vqN/A2qcNBe/g8+sfEO0CyjEtpUcZN8APrJJVpFo2JM+BcmQeOPKyqS3tYWA1KOX
yX4ouNagYzdelxN5Ut6YBjC1O6XjOj3i69rwsb9oK7uZlawW+8SXZXkpUuL14l96
wedDzjciC/nUkqlphGnHkm7n3s0UHUo7XtvZrXhtSbhS5UqFAGRM7ScXATjVl5P9
zrau2300EVvv0u72hHrphst4afDUqpUtpjdX9fJQG8CBaY/aLpANcbz6Cm8FpwjF
Ch4QtfPIwvbkGDlJDuKZzJEuB0dkSpvpO0eIjLPrPP+1tgVw0JOObWCaTGW8Dw==
-----END CERTIFICATE-----
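
Comparing two long PEM dumps by eye is error-prone, and the two files live in different containers. A minimal sketch of one way to compare them: `pem_digest` is a hypothetical helper that strips all whitespace from a file and prints a SHA-256 digest, so the same certificate gives the same digest however its lines are wrapped. It assumes GNU `sha256sum` is available.

```shell
#!/bin/sh
# Hypothetical helper: whitespace-insensitive digest of a PEM file.
# Two copies of the same certificate give the same digest even when
# one copy has lost its line breaks.
pem_digest() {
  tr -d ' \t\r\n' < "$1" | sha256sum | cut -d ' ' -f 1
}
```

Possible usage: run `pem_digest var/certs/bob.domain.crt` on alice and `pem_digest var/certs/domain.crt` on bob, then compare the two hex strings. Where OpenSSL is installed, `openssl x509 -noout -fingerprint -sha256 -in <file>` is a more conventional alternative.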

zimu-yuxi commented 1 month ago

See the linked documentation: run the scripts/deploy/join_to_host.sh script, then check again.

ruhengChen commented 1 month ago

Yes, I had tried that before as well; I deleted everything and recreated it.

After running it again, on the bob node:

[root@root-kuscia-autonomy-bob-bob kuscia]# kubectl get cdr
NAME        SOURCE   DESTINATION   HOST           AUTHENTICATION   READY
clion-bob   clion    bob                          Token            True
alice-bob   alice    bob                          Token
bob-clion   bob      clion         172.16.8.174   Token            True
bob-alice   bob      alice         172.16.8.166   Token            False

[root@root-kuscia-autonomy-bob-bob kuscia]# kubectl get domainroute -A
NAMESPACE   NAME        SOURCE   DESTINATION   HOST           AUTHENTICATION
bob         clion-bob   clion    bob                          Token
bob         alice-bob   alice    bob                          Token
bob         bob-clion   bob      clion         172.16.8.174   Token
bob         bob-alice   bob      alice         172.16.8.166   Token

On the alice node:


[root@root-kuscia-autonomy-alice-alice kuscia]# kubectl get cdr
NAME          SOURCE   DESTINATION   HOST           AUTHENTICATION   READY
bob-alice     bob      alice                        Token
clion-alice   clion    alice                        Token
alice-bob     alice    bob           172.16.8.173   Token

[root@root-kuscia-autonomy-alice-alice kuscia]# kubectl get domainroute -A
No resources found

Error details:

[root@root-kuscia-autonomy-bob-bob kuscia]# kubectl get cdr bob-alice -o yaml
apiVersion: kuscia.secretflow/v1alpha1
kind: ClusterDomainRoute
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"kuscia.secretflow/v1alpha1","kind":"ClusterDomainRoute","metadata":{"annotations":{},"name":"bob-alice"},"spec":{"authenticationType":"Token","destination":"alice","endpoint":{"host":"172.16.8.166","ports":[{"isTLS":true,"name":"http","pathPrefix":"/","port":1080,"protocol":"HTTP"}]},"interConnProtocol":"kuscia","requestHeadersToAdd":{"Authorization":"Bearer"},"source":"bob","tokenConfig":{"rollingUpdatePeriod":86400,"tokenGenMethod":"RSA-GEN"}}}
  creationTimestamp: "2024-10-24T02:38:35Z"
  generation: 2
  labels:
    kuscia.secertflow/domainroute-partner: alice
    kuscia.secretflow/clusterdomainroute-destination: alice
    kuscia.secretflow/clusterdomainroute-source: bob
  name: bob-alice
  resourceVersion: "919429"
  uid: 05951b18-fbb5-4b69-b9c4-e043f9ddcf1d
spec:
  authenticationType: Token
  destination: alice
  endpoint:
    host: 172.16.8.166
    ports:
    - isTLS: true
      name: http
      pathPrefix: /
      port: 1080
      protocol: HTTP
  interConnProtocol: kuscia
  requestHeadersToAdd:
    Authorization: Bearer
  source: bob
  tokenConfig:
    destinationPublicKey: LS0tLS1CRUdJTiBSU0EgUFVCTElDIEtFWS0tLS0tCk1JSUJDZ0tDQVFFQXl5T0VoRXlCUzZua2cwOGRjWndwTTVRd3BTMUhhTHdBV0tNVDd6U1VRclcvQWwxUkR0RUwKRFllLzJ2MVNsWDlKdWkzTnJEM205R004Q2FhVEc3UHovVnl6eDJVWFFSUC9GQXZzS1Bob0VvVU02WHd0YmdMRQpjZGFDVnRCbGFYeUxuZFBwT21sV20xSlNMbXgwMm5sTnNjNlBYLzZpNE5IU1BHbjJucFFNNjNEdm9RZ1NManVtCm9JL3ZuNWxHVmd3dUd0bFlQbjlqcjNIdVd6U2dXKzVTa1gvRld4R2d0dFZYYTdvZDdWUHRVREtnMlhHd0d4VGEKSHAxL3lzM1JwOTVlL1lqYnd0N3VLRTc1eXp5M2FXT0QybHNWelpaZ1RrMWpjOWdsdGFINkVGTVpUOUJRUXVqUwovbUxQdGdzOTdRclVUbmpZa1IxWkFRbStWdW54Y1d1SWR3SURBUUFCCi0tLS0tRU5EIFJTQSBQVUJMSUMgS0VZLS0tLS0K
    rollingUpdatePeriod: 86400
    sourcePublicKey: LS0tLS1CRUdJTiBSU0EgUFVCTElDIEtFWS0tLS0tCk1JSUJDZ0tDQVFFQXBhcHh5bktFN3pXRU8zRVpXazZmVWJxRzJwZG81S1VDL3hyNElzVjA2cEdPZSt3YjJOVkIKWVhqRFhNL2Vkb0hxdVBNemU0UVRlbnlsdzd1VVBPUHVWWDlqWStQdUs1R0ZsNkRIcVFSWFlWaHNLQ2dEUWhuNgpMeC8vcG02dmZ5aDA0cCsyekhFcmpCMnNEakhuN2JnbXZWblJoR3FsTkxya2dYNmtHeSt4NUtYR2RxSWsyQXQ2ClE1dWN3SG5UMU5LY3ZVdzZjNXNDS09WWFdRNnlEVzhHQkE4bTBRdHNWcmg5dXVjRUV2MU5SUmtRTDFqK0hVd2wKTEhQcUcyRjAvbXFMOWxEeEdRRzVhcm5qUU1VUUV0aTZBRmpIMkM3dEJTMHoxSW5HS1c1cjlBdVpvdnBOMWNiaAovNlRBZ2syczFHQUpVTkRxY2wwSmxSbDBPTWh4UXhOQTl3SURBUUFCCi0tLS0tRU5EIFJTQSBQVUJMSUMgS0VZLS0tLS0K
    tokenGenMethod: RSA-GEN
status:
  conditions:
  - lastTransitionTime: "2024-10-24T02:40:37Z"
    lastUpdateTime: "2024-10-24T02:40:37Z"
    message: TokenNotGenerate
    reason: DestinationIsNotAuthrized
    status: "False"
    type: Ready
  endpointStatuses:
    root-kuscia-autonomy-bob-bob-http:
      endpointHealthy: true
  tokenStatus: {}
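
The `kubectl get cdr` listings above can be filtered for routes that are not ready before dumping each one as YAML. A minimal sketch: `not_ready_routes` is a hypothetical helper that reads the table on standard input and prints the name of every row whose last column is not `True`. It treats a blank READY column as not ready, which is an assumption about how to read the listing.

```shell
#!/bin/sh
# Hypothetical helper: print cdr names whose READY column is not "True".
# READY is the last column; when it is blank the last field is the
# AUTHENTICATION value, so the row is reported as well.
not_ready_routes() {
  awk 'NR > 1 && NF > 0 && $NF != "True" { print $1 }'
}
```

Possible usage: `kubectl get cdr | not_ready_routes`. For the bob listing above it would print `alice-bob` and `bob-alice`.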

zimu-yuxi commented 1 month ago

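1. Both sides need to set up the route. Did you run it on alice as well?
2. Check that the protocol, IP, and port used when creating the route are correct.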

ruhengChen commented 1 month ago

On alice I ran: sh scripts/deploy/join_to_host.sh alice bob https://172.16.8.173:1080

On bob I ran: sh scripts/deploy/join_to_host.sh bob alice https://172.16.8.166:1080

The IP and port should both be correct.
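
To double-check the protocol, IP, and port against what ends up in the cdr spec (`endpoint.host`, `ports[].port`, `isTLS`), the endpoint URL passed to join_to_host.sh can be split into its parts. A minimal sketch: `parse_endpoint` is a hypothetical helper, and the default ports are the usual HTTP/HTTPS defaults rather than anything Kuscia-specific.

```shell
#!/bin/sh
# Hypothetical helper: split an endpoint URL into "scheme host port".
parse_endpoint() {
  url=$1
  scheme=${url%%://*}        # text before "://"
  rest=${url#*://}           # text after "://"
  hostport=${rest%%/*}       # drop any path
  host=${hostport%%:*}
  case $hostport in
    *:*) port=${hostport##*:} ;;
    *)   if [ "$scheme" = "https" ]; then port=443; else port=80; fi ;;
  esac
  printf '%s %s %s\n' "$scheme" "$host" "$port"
}
```

`parse_endpoint https://172.16.8.173:1080` prints `https 172.16.8.173 1080`, which should line up with `host: 172.16.8.173`, `port: 1080`, and `isTLS: true` in the alice-bob spec. A quick reachability check from the other node could be `curl -kv https://172.16.8.173:1080`.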

zimu-yuxi commented 1 month ago

Please check the cdr and domainroute on alice.

ruhengChen commented 1 month ago

I posted those earlier. [screenshot: 2024-10-24 13 03 33@2x]

zimu-yuxi commented 1 month ago

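1. Can you confirm that this output is from after running the join_to_host.sh script?
2. Inside the alice container, run kubectl get cdr alice-bob -o yaml and share the output.
3. Please share alice's kuscia.log.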

ruhengChen commented 1 month ago
  1. Yes.
  2. [root@root-kuscia-autonomy-alice-alice kuscia]# kubectl get cdr alice-bob -o yaml
apiVersion: kuscia.secretflow/v1alpha1
kind: ClusterDomainRoute
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"kuscia.secretflow/v1alpha1","kind":"ClusterDomainRoute","metadata":{"annotations":{},"name":"alice-bob"},"spec":{"authenticationType":"Token","destination":"bob","endpoint":{"host":"172.16.8.173","ports":[{"isTLS":true,"name":"http","pathPrefix":"/","port":1080,"protocol":"HTTP"}]},"interConnProtocol":"kuscia","requestHeadersToAdd":{"Authorization":"Bearer"},"source":"alice","tokenConfig":{"rollingUpdatePeriod":86400,"tokenGenMethod":"RSA-GEN"}}}
  creationTimestamp: "2024-10-24T02:38:30Z"
  generation: 2
  labels:
    kuscia.secertflow/domainroute-partner: alice
    kuscia.secretflow/clusterdomainroute-destination: bob
    kuscia.secretflow/clusterdomainroute-source: alice
  name: alice-bob
  resourceVersion: "919455"
  uid: 95e58078-8594-437f-8d45-803996acfbf9
spec:
  authenticationType: Token
  destination: bob
  endpoint:
    host: 172.16.8.173
    ports:
    - isTLS: true
      name: http
      pathPrefix: /
      port: 1080
      protocol: HTTP
  interConnProtocol: kuscia
  requestHeadersToAdd:
    Authorization: Bearer
  source: alice
  tokenConfig:
    destinationPublicKey: LS0tLS1CRUdJTiBSU0EgUFVCTElDIEtFWS0tLS0tCk1JSUJDZ0tDQVFFQXBhcHh5bktFN3pXRU8zRVpXazZmVWJxRzJwZG81S1VDL3hyNElzVjA2cEdPZSt3YjJOVkIKWVhqRFhNL2Vkb0hxdVBNemU0UVRlbnlsdzd1VVBPUHVWWDlqWStQdUs1R0ZsNkRIcVFSWFlWaHNLQ2dEUWhuNgpMeC8vcG02dmZ5aDA0cCsyekhFcmpCMnNEakhuN2JnbXZWblJoR3FsTkxya2dYNmtHeSt4NUtYR2RxSWsyQXQ2ClE1dWN3SG5UMU5LY3ZVdzZjNXNDS09WWFdRNnlEVzhHQkE4bTBRdHNWcmg5dXVjRUV2MU5SUmtRTDFqK0hVd2wKTEhQcUcyRjAvbXFMOWxEeEdRRzVhcm5qUU1VUUV0aTZBRmpIMkM3dEJTMHoxSW5HS1c1cjlBdVpvdnBOMWNiaAovNlRBZ2syczFHQUpVTkRxY2wwSmxSbDBPTWh4UXhOQTl3SURBUUFCCi0tLS0tRU5EIFJTQSBQVUJMSUMgS0VZLS0tLS0K
    rollingUpdatePeriod: 86400
    sourcePublicKey: LS0tLS1CRUdJTiBSU0EgUFVCTElDIEtFWS0tLS0tCk1JSUJDZ0tDQVFFQXl5T0VoRXlCUzZua2cwOGRjWndwTTVRd3BTMUhhTHdBV0tNVDd6U1VRclcvQWwxUkR0RUwKRFllLzJ2MVNsWDlKdWkzTnJEM205R004Q2FhVEc3UHovVnl6eDJVWFFSUC9GQXZzS1Bob0VvVU02WHd0YmdMRQpjZGFDVnRCbGFYeUxuZFBwT21sV20xSlNMbXgwMm5sTnNjNlBYLzZpNE5IU1BHbjJucFFNNjNEdm9RZ1NManVtCm9JL3ZuNWxHVmd3dUd0bFlQbjlqcjNIdVd6U2dXKzVTa1gvRld4R2d0dFZYYTdvZDdWUHRVREtnMlhHd0d4VGEKSHAxL3lzM1JwOTVlL1lqYnd0N3VLRTc1eXp5M2FXT0QybHNWelpaZ1RrMWpjOWdsdGFINkVGTVpUOUJRUXVqUwovbUxQdGdzOTdRclVUbmpZa1IxWkFRbStWdW54Y1d1SWR3SURBUUFCCi0tLS0tRU5EIFJTQSBQVUJMSUMgS0VZLS0tLS0K
    tokenGenMethod: RSA-GEN
  3. kuscia.log
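
The `destinationPublicKey` and `sourcePublicKey` values in these dumps are base64-encoded PEM blocks, so they can be decoded for inspection or compared between the two nodes. A minimal sketch: `decode_key` is a hypothetical helper, and it assumes a `base64` that accepts `-d` (GNU coreutils).

```shell
#!/bin/sh
# Hypothetical helper: decode a base64-encoded key field from a cdr spec.
decode_key() {
  printf '%s' "$1" | base64 -d
}
```

Possible usage: `decode_key "$(kubectl get cdr alice-bob -o jsonpath='{.spec.tokenConfig.sourcePublicKey}')"`. The decoded text starts with `-----BEGIN RSA PUBLIC KEY-----`. Comparing alice-bob's `destinationPublicKey` on alice with bob-alice's `sourcePublicKey` on bob is one way to check that both sides hold the same key for bob.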

zimu-yuxi commented 2 weeks ago

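Just checking: has the problem been resolved? If not:

1. Delete the routes and the domain, then try authorizing again.
2. Check whether the two machines can reach each other over the network; see the linked guide for troubleshooting.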

ruhengChen commented 2 weeks ago

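I redeployed Kuscia and it works now. The cause was probably a cdr I had created earlier from the node to itself; deleting that cdr afterwards did not fix it either.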
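
The suspected cause, a cdr whose source and destination are the same domain, can be spotted in the `kubectl get cdr` listing. A minimal sketch: `self_routes` is a hypothetical helper that reads the table on standard input and prints every row where SOURCE equals DESTINATION. Whether such a route is really what broke domainroute creation is not confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print cdr names whose SOURCE equals DESTINATION.
# Columns: NAME SOURCE DESTINATION ... (header row is skipped).
self_routes() {
  awk 'NR > 1 && NF >= 3 && $2 == $3 { print $1 }'
}
```

Possible usage: `kubectl get cdr | self_routes`. An empty result means no self-referencing route is present.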