cloudify-cosmo / cloudify-openstack-plugin

Cloudify OpenStack Plugin

Unpredictable SSH issues with OpenStack.server #334

Closed nfisdev closed 4 years ago

nfisdev commented 4 years ago

Hello, I have observed some unpredictable behavior in the OpenStack plugin while working on a Kubespray implementation. The servers I am trying to create are referred to as "master" and "node".

Running my blueprint with just "master", SSH works just fine. As soon as I add the "node" openstack.server, SSH breaks for both VMs, and the error is often slightly different each time.

Usually the error is ssh_exchange_identification: Connection closed by remote host or ssh: connect to host <host> port 22: Connection refused. I have even had it ask me for a password, even though there is no way that could be set up. The weird part is that sometimes it's the "master" that SSH works for and sometimes the "node", but they never both work at the same time.

Running the exact same code, these have been my results:

  1. "Master" SSH works properly, "Node" SSH gives me the ssh_exchange_identification error
  2. "Node" SSH works properly, "Master" SSH gives me the ssh_exchange_identification error
  3. "Master" asks for a password, "Node" works properly.

I have spun up the same image in the OpenStack UI and SSH works fine. When I remove the "node" openstack.server from the blueprint, SSH works for "master".
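For reference, these are the SSH-side checks I run to narrow it down when a VM comes up in this state. This is only a rough sketch; the key path, user, and address are placeholders from my setup:

# verbose SSH attempt, to see exactly where the connection or handshake dies
ssh -vvv -i k8.pem centos@<floating-ip>

# ask only for the host key; if this also fails, sshd never answered at all
ssh-keyscan -T 10 <floating-ip>

# scripted retry that fails fast instead of hanging or prompting for a password
ssh -o ConnectTimeout=5 -o BatchMode=yes -i k8.pem centos@<floating-ip> true && echo "ssh ok"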

Here is the blueprint that I am using.

tosca_definitions_version: cloudify_dsl_1_3

description: Install Kubernetes using Kubespray on Openstack.

imports:
  - http://cloudify.co/spec/cloudify/5.0.0/types.yaml
  - plugin:cloudify-openstack-plugin?version= >=3.2.2

inputs:
  master_ip:
    type: string
    default: 10.10.1.254
  node_ip:
    type: string
    default: 10.10.1.253
  public_subnet_cidr:
    type: string
    default: 10.10.1.0/24
  agent_key_public:
    type: string
    default: { get_secret: agent_key_public }
  auth_url:
    type: string
    default: { get_secret: openstack_auth_url }
  username:
    type: string
    default: { get_secret: openstack_username }
  password:
    type: string
    default: { get_secret: openstack_password }
  project_name:
    type: string
    default: { get_secret: openstack_project_name }
  region_name:
    description: >
      The Openstack region_name, for example RegionOne.
    type: string
    default: { get_secret: openstack_region }
  external_network_id:
    type: string
  image_id:
    description: The ID of Centos 7 image that is available in your account.
    type: string
    default: CentOS7
  flavor_id:
    description: flavour for OpenStack
    type: string
    default: mm1.small
  keypair_name:
    type: string
    default: ithiessen
  name_prefix:
    type: string
    default: k8-cfy
  agent_user:
    type: string
    default: centos
  client_config_dict:
    type: dict
    description: A dictionary containing the client configuration for Openstack. Default is for keystone v2. Provide an alternate input for keystone v3.
    default:
      username: { get_input: username }
      password: { get_input: password }
      project_name: { get_input: project_name }
      auth_url: { get_input: auth_url }
      region_name: { get_input: region_name }

node_templates:
  k8-cfy-master:
    type: cloudify.nodes.openstack.Server
    properties:
      client_config: { get_input: client_config_dict }
      agent_config:
        install_method: none
      resource_config:
        name: k8-cfy-master
        image_id: { get_input: image_id }
        flavor_id: { get_input: flavor_id }
    relationships:
      - type: cloudify.relationships.openstack.server_connected_to_port
        target: k8-cfy-master-port
      - type: cloudify.relationships.openstack.server_connected_to_keypair
        target: k8-cfy-keypair

  k8-cfy-node:
    type: cloudify.nodes.openstack.Server
    properties:
      client_config: { get_input: client_config_dict }
      agent_config:
        install_method: none
      resource_config:
        name: k8-cfy-node
        image_id: { get_input: image_id }
        flavor_id: { get_input: flavor_id }
    relationships:
      - type: cloudify.relationships.openstack.server_connected_to_port
        target: k8-cfy-node-port
      - type: cloudify.relationships.openstack.server_connected_to_keypair
        target: k8-cfy-keypair

  k8-cfy-keypair:
    type: cloudify.nodes.openstack.KeyPair
    properties:
      client_config: { get_input: client_config_dict }
      use_external_resource: true
      resource_config:
        name: k8
        public_key: { get_input: agent_key_public }

  k8-cfy-master-port:
    type: cloudify.nodes.openstack.Port
    properties:
      client_config: { get_input: client_config_dict }
      resource_config:
        name: 'k8-cfy-master-port'
        fixed_ips:
          - ip_address: { get_input: master_ip }
    relationships:
      - type: cloudify.relationships.connected_to
        target: k8-cfy-security-group
      - type: cloudify.relationships.connected_to
        target: k8-cfy-network
      - type: cloudify.relationships.connected_to
        target: k8-cfy-subnet
      - type: cloudify.relationships.openstack.port_connected_to_floating_ip
        target: k8-cfy-master-ip

  k8-cfy-node-port:
    type: cloudify.nodes.openstack.Port
    properties:
      client_config: { get_input: client_config_dict }
      resource_config:
        name: 'k8-cfy-node-port'
        fixed_ips:
          - ip_address: { get_input: node_ip }
    relationships:
      - type: cloudify.relationships.connected_to
        target: k8-cfy-security-group
      - type: cloudify.relationships.connected_to
        target: k8-cfy-network
      - type: cloudify.relationships.connected_to
        target: k8-cfy-subnet
      - type: cloudify.relationships.openstack.port_connected_to_floating_ip
        target: k8-cfy-node-ip

  k8-cfy-security-group:
    type: cloudify.nodes.openstack.SecurityGroup
    properties:
      client_config: { get_input: client_config_dict }
      security_group_rules:
        - remote_ip_prefix: 0.0.0.0/0
          port_range_max: 80
          port_range_min: 80
          direction: ingress
          protocol: tcp

        - remote_ip_prefix: 0.0.0.0/0
          port_range_max: 80
          port_range_min: 80
          direction: egress
          protocol: tcp

        - remote_ip_prefix: 0.0.0.0/0
          port_range_min: 53333
          port_range_max: 53333
          protocol: tcp
          direction: ingress

        - remote_ip_prefix: 0.0.0.0/0
          port_range_min: 53333
          port_range_max: 53333
          protocol: tcp
          direction: egress

        - remote_ip_prefix: 0.0.0.0/0
          port_range_max: 22
          port_range_min: 22
          direction: ingress
          protocol: tcp

        - remote_ip_prefix: 0.0.0.0/0
          port_range_max: 22
          port_range_min: 22
          direction: egress
          protocol: tcp
      resource_config:
        name: { concat: [ { get_input: name_prefix }, 'security-group' ] }
        description: 'A security group created by Cloudify OpenStack SDK plugin.'

  k8-cfy-network:
    type: cloudify.nodes.openstack.Network
    properties:
      client_config: { get_input: client_config_dict }
      resource_config:
        name: { concat: [ { get_input: name_prefix }, 'network' ] }

  k8-cfy-subnet:
    type: cloudify.nodes.openstack.Subnet
    properties:
      client_config: { get_input: client_config_dict }
      resource_config:
        name: { concat: [ { get_input: name_prefix }, 'subnet' ] }
        cidr: { get_input: public_subnet_cidr }
        enable_dhcp: true
        ip_version: 4
    relationships:
      - type: cloudify.relationships.contained_in
        target: k8-cfy-network
      - type: cloudify.relationships.openstack.subnet_connected_to_router
        target: k8-cfy-router

  k8-cfy-router:
    type: cloudify.nodes.openstack.Router
    properties:
      client_config: { get_input: client_config_dict }
      resource_config:
        name: { concat: [ { get_input: name_prefix }, 'router' ] }
    relationships:
      - type: cloudify.relationships.connected_to
        target: k8-cfy-external-network

  k8-cfy-external-network:
    type: cloudify.nodes.openstack.Network
    properties:
      client_config: { get_input: client_config_dict }
      use_external_resource: true
      resource_config:
        id: { get_input: external_network_id }

  k8-cfy-master-ip:
    type: cloudify.nodes.openstack.FloatingIP
    properties:
      client_config: { get_input: client_config_dict }
    relationships:
      - type: cloudify.relationships.connected_to
        target: k8-cfy-external-network

  k8-cfy-node-ip:
    type: cloudify.nodes.openstack.FloatingIP
    properties:
      client_config: { get_input: client_config_dict }
    relationships:
      - type: cloudify.relationships.connected_to
        target: k8-cfy-external-network

I get the same issue when using the Kubernetes OpenStack example, which is why I started my own blueprint: https://github.com/cloudify-community/blueprint-examples/tree/master/kubernetes

Any help is greatly appreciated; hopefully I'm just doing something obviously wrong. Thanks

EarthmanT commented 4 years ago

Check the number of SSHD connections that your machine allows and try increasing it, or use a larger flavor. The Kubespray playbook is a mammoth, and keeping those connections alive is a major drag on the system.
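Something along these lines on the instance, assuming a stock CentOS 7 sshd (the MaxStartups/MaxSessions values below are only examples, not recommendations):

# show the effective limits sshd is actually running with
sudo sshd -T | grep -Ei 'maxstartups|maxsessions'

# raise them in /etc/ssh/sshd_config and restart sshd
echo 'MaxStartups 30:50:100' | sudo tee -a /etc/ssh/sshd_config
echo 'MaxSessions 30' | sudo tee -a /etc/ssh/sshd_config
sudo systemctl restart sshd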

nfisdev commented 4 years ago

I did increase the flavor significantly and I am still observing the same behavior.

As for the SSHD configuration, I have checked and the instances are set with a maximum of 10 connections. Could that be an issue? In the test I am running here, I am not trying to run any Ansible playbooks.

My process is:

In this most recent case (with the increased flavor size), the "node" has the IP 10.20.0.106, and here I was able to log in:

[nfisdev@jump2 ~]$ ssh -i k8.pem centos@10.20.0.106
Warning: Permanently added '10.20.0.106' (RSA) to the list of known hosts.
[centos@k8-cfy-node ~]$ exit
logout
Connection to 10.20.0.106 closed.

The "master" has the IP of 10.20.0.189. Which times out when I try to SSH in. Here is the result of an Nmap scan against the "master", it looks like its not listening on any ports:

[nfisdev@jump2 ~]$  nmap -Pn 10.20.0.189

Starting Nmap 6.40 ( http://nmap.org ) at 2020-01-09 18:14 UTC
Nmap scan report for 10.20.0.189
Host is up.
All 1000 scanned ports on 10.20.0.189 are filtered

Nmap done: 1 IP address (1 host up) scanned in 201.37 seconds
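When a host shows all ports filtered like this, these are the OpenStack-side checks I try next (just a sketch, using the resource names from the blueprint above and assuming the python-openstackclient CLI is configured):

# did the instance actually finish booting, and did cloud-init and sshd start?
openstack console log show k8-cfy-master | tail -n 40

# is the server ACTIVE with the addresses I expect?
openstack server show k8-cfy-master | grep -Ei 'status|addresses'

# is the security group really attached to the port, and is the fixed IP right?
openstack port show k8-cfy-master-port | grep -Ei 'security_group|fixed_ips'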

nfisdev commented 4 years ago

Here are the results when I run the same blueprint with the "node" commented out.

To be as precise as possible, these are the exact lines that are removed from the blueprint above:

  k8-cfy-node:
    type: cloudify.nodes.openstack.Server
    properties:
      client_config: { get_input: client_config_dict }
      agent_config:
        install_method: none
      resource_config:
        name: k8-cfy-node
        image_id: { get_input: image_id }
        flavor_id: { get_input: flavor_id }
    relationships:
      - type: cloudify.relationships.openstack.server_connected_to_port
        target: k8-cfy-node-port
      - type: cloudify.relationships.openstack.server_connected_to_keypair
        target: k8-cfy-keypair

Here is the result of an Nmap scan on the "master" host:

[nfisdev@jump2 ~]$ nmap -Pn 10.20.0.106

Starting Nmap 6.40 ( http://nmap.org ) at 2020-01-09 19:03 UTC
Nmap scan report for 10.20.0.106
Host is up (0.0013s latency).
Not shown: 998 filtered ports
PORT   STATE  SERVICE
22/tcp open   ssh
80/tcp closed http

Nmap done: 1 IP address (1 host up) scanned in 4.87 seconds

After confirming the server is listening on port 22, I attempted an SSH connection, which was successful:

[nfisdev@jump2 ~]$ ssh -i k8.pem 10.20.0.106
The authenticity of host '10.20.0.106 (10.20.0.106)' can't be established.
ECDSA key fingerprint is 15:e4:3e:2d:87:55:72:06:05:48:3e:b4:40:57:0a:3e.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.20.0.106' (ECDSA) to the list of known hosts.
[centos@k8-cfy-master ~]$

Is there any reason why this works with 1 server but not with 2?
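For completeness, when both servers are up I have also been comparing what Neutron allocated to each of them, in case the two ports or floating IPs are colliding somehow (again only a sketch, names taken from the blueprint above):

# both ports should show distinct fixed IPs and the same security group
openstack port list | grep -E 'k8-cfy-(master|node)-port'

# each floating IP should map to a different fixed IP / port
openstack floating ip list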

nfisdev commented 4 years ago

I have been having a lot of struggles with getting Cloudify to behave predictably over the last month; I have now found that my organization has had a faulty CentOS 7 image.

@EarthmanT thank you so much for your help. I have this blueprint working now.

EarthmanT commented 4 years ago

@nfisdev sorry I could not be more help, I was just about to take a deeper look. Glad you got it sorted out.