networkservicemesh / deployments-k8s

Apache License 2.0

Question: Monolith example EC2 + EKS #9226

Open trastle opened 1 year ago

trastle commented 1 year ago

Question

I am deploying a modified version of the Monolith example. I am having some success getting registration working but I am not seeing connections established.

Examples

The main error I see in the NSC init container is:

Jun 15 23:42:50.429 [ERRO] [cmd:[/bin/app]] [retryClient:Request] try attempt has failed: Error returned from api/pkg/api/networkservice/networkServiceClient.Request: rpc error: code = Unknown desc = an error occurred during authorization policy check: rpc error: code = PermissionDenied desc = no sufficient privileges: cannot support any of the requested mechanism

The main error that I see on the NSE is:

Jun 15 23:45:05.144 [ERRO] [id:nsm] [type:networkService] (11.1)             Peer link not found

Context

The primary differences between my setup and the setup in the example: Basics, I am using…

Other artifacts/Logs

Cluster info dump

https://drive.google.com/file/d/1KS3W_VEPUUH1s4jUgamaU_1m8kNG4Jry/view?usp=sharing

External NSE logs

https://drive.google.com/file/d/13fsFB29isPmvmMq279UaK537ietAwekD/view?usp=drive_link

trastle commented 1 year ago

K8s Spire Server:

apiVersion: v1
data:
  server.conf: |
    server {
        bind_address = "0.0.0.0"
        bind_port = "8081"
        trust_domain = "use.experimental.k8s.ikarem.io"
        data_dir = "/run/spire/data"
        log_level = "DEBUG"
        #AWS requires the use of RSA.  EC cryptography is not supported
        ca_key_type = "rsa-2048"
        default_x509_svid_ttl = "1h"
        default_jwt_svid_ttl = "1h"
        ca_subject = {
            country = ["US"],
            organization = ["SPIFFE"],
            common_name = "",
        }
        # Federation config was added here for unification of Spire setups
        # This config will do nothing until Spiffe Federation bundles are configured manually
        federation {
            bundle_endpoint {
                address = "0.0.0.0"
                port = 8443
            }
            federates_with "docker.nsm" {
                bundle_endpoint_url = "https://external-nse.usw.experimental.k8s.ikarem.io:8443"
                bundle_endpoint_profile "https_spiffe" {
                    endpoint_spiffe_id = "spiffe://docker.nsm/spire/server"
                }
            }
        }
    }

    plugins {
        DataStore "sql" {
            plugin_data {
                database_type = "sqlite3"
                connection_string = "/run/spire/data/datastore.sqlite3"
            }
        }

        NodeAttestor "k8s_psat" {
            plugin_data {
                clusters = {
                # NOTE: Change this to your cluster name
                    "experimental-use" = {
                        use_token_review_api_validation = true
                        service_account_allow_list = ["spire:spire-agent"]
                    }
                }
            }
        }

        KeyManager "disk" {
            plugin_data {
                keys_path = "/run/spire/data/keys.json"
            }
        }
        Notifier "k8sbundle" {
            plugin_data {
                webhook_label = "spiffe.io/webhook"
            }
        }
    }
kind: ConfigMap
metadata:
  name: spire-server
  namespace: spire

trastle commented 1 year ago

K8s Spire Agent

❯ kubectl get configmap spire-agent -n spire -o yaml
apiVersion: v1
data:
  agent.conf: |
    agent {
        data_dir = "/run/spire"
        log_level = "DEBUG"
        server_address = "spire-server"
        server_port = "8081"
        socket_path = "/run/spire/sockets/agent.sock"
        trust_bundle_path = "/run/spire/bundle/bundle.crt"
        trust_domain = "use.experimental.k8s.ikarem.io"
    }

    plugins {
        NodeAttestor "k8s_psat" {
            plugin_data {
                cluster = "experimental-use"
            }
        }

        KeyManager "memory" {
            plugin_data {}
        }

        WorkloadAttestor "k8s" {
            plugin_data {
                # Defaults to the secure kubelet port by default.
                # Minikube does not have a cert in the cluster CA bundle that
                # can authenticate the kubelet cert, so skip validation.
                skip_kubelet_verification = true
            }
        }
        WorkloadAttestor "unix" {
            plugin_data {}
        }
    }
kind: ConfigMap
metadata:
  name: spire-agent
  namespace: spire

trastle commented 1 year ago

Docker Spire Server:

cat /tmp/spire1451413651/server/server.conf
server {
    bind_address = "127.0.0.1"
    bind_port = "8081"
    trust_domain = "docker.nsm"
    data_dir = "/tmp/spire1451413651/data"
    log_level = "WARN"
    ca_key_type = "rsa-2048"
    default_svid_ttl = "1h"
    ca_subject = {
        country = ["US"],
        organization = ["SPIFFE"],
        common_name = "",
    }
    federation {
        bundle_endpoint {
            address = "0.0.0.0"
            port = 8443
        }
        federates_with "use.experimental.k8s.ikarem.io" {
            bundle_endpoint_url = "https://spire-server.spire.use.experimental.k8s.ikarem.io:8443"
            bundle_endpoint_profile "https_spiffe" {
                endpoint_spiffe_id = "spiffe://use.experimental.k8s.ikarem.io/spire/server"
            }
        }
    }
}

plugins {
    DataStore "sql" {
        plugin_data {
            database_type = "sqlite3"
            connection_string = "/tmp/spire1451413651/data/datastore.sqlite3"
        }
    }

    NodeAttestor "join_token" {
        plugin_data {
        }
    }

    KeyManager "memory" {
        plugin_data = {}
    }
}

trastle commented 1 year ago

Docker Spire Agent

cat /tmp/spire1451413651/agent/agent.conf
agent {
    data_dir = "/tmp/spire1451413651/data"
    log_level = "WARN"
    server_address = "127.0.0.1"
    server_port = "8081"
    insecure_bootstrap = true
    trust_domain = "docker.nsm"
}

plugins {
    NodeAttestor "join_token" {
        plugin_data {
        }
    }
    KeyManager "disk" {
        plugin_data {
            directory = "/tmp/spire1451413651/data"
        }
    }
    WorkloadAttestor "unix" {
        plugin_data {
            discover_workload_path = true
        }
    }
}

denis-tingaikin commented 1 year ago

@trastle Thanks for the details!

Also, I did not find logs from the nsc/nsmgr/registry/forwarder in the k8s cluster. For some reason the cluster dump did not collect everything. If possible, please attach the logs from the nsc/nsmgr/registry/forwarder too.

I saw problems with registry authz in the logs, so the use case could probably work with NSM v1.6.1: https://github.com/networkservicemesh/deployments-k8s/tree/release/v1.6.1/examples/k8s_monolith

Let me know if you test it.

trastle commented 1 year ago

Thanks Denis. New log uploaded here: https://drive.google.com/file/d/1LlW0NyZaoO7q_GY5f5JRnE6LvB0GbxiU/view

trastle commented 1 year ago

I'm not sure I follow the comment about v1.6.1. Do you want me to try downgrading NSM to the older version?

I can run the integration test for https://github.com/networkservicemesh/deployments-k8s/tree/release/v1.8.0/examples/k8s_monolith, so I presume the use case works correctly there. However, once I modify the deployment to use EKS and an EC2 instance, rather than everything being in containers on the Kind network, I am not having success getting connections.

Which log messages are you seeing that indicate a failure in the registry Authz? I would like to understand how to observe the same.

denis-tingaikin commented 1 year ago

Thanks for updating the logs.

Good news: there are no problems with spire.

As far as I can see from the logs, the issue is that the forwarder in the k8s cluster cannot connect to the forwarder part of the monolith nse at the URL tcp://10.1.101.46:32965.

So the k8s cluster part of the setup looks good and we should focus on the monolith nse.
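
One way to confirm the connectivity symptom described above independently of NSM is a plain TCP reachability check against the URL from the logs. The following is a minimal sketch in Python, not part of NSM; the helper name `can_connect` is hypothetical, and it assumes the check is run from a host or pod inside the k8s cluster network.

```python
import socket
from urllib.parse import urlparse


def can_connect(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the host:port in `url` succeeds.

    `url` is expected in the form seen in the logs, e.g. tcp://HOST:PORT.
    """
    parsed = urlparse(url)
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (refused, unreachable, timed out) if it cannot complete.
        with socket.create_connection((parsed.hostname, parsed.port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `can_connect("tcp://10.1.101.46:32965")` would be the check for the URL quoted above; substitute whatever URL the current logs show. A `False` result would suggest looking at network reachability between the cluster and the monolith NSE host before looking further at NSM itself.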

Could you restart the example and start capturing logs from the monolith nse with the --follow flag right at the start? (Note: use follow and write to a file so that no logs are lost.)

I think we're missing critical logs from the start of the monolith nse that could help with the next steps.