Juniper / contrail-ansible-deployer

Ansible deployment for contrail
Apache License 2.0

Remote Compute Deployment [Collector down] #47

Open majchwo opened 5 years ago

majchwo commented 5 years ago

Hello, I am trying to deploy an environment with the remote compute feature. Here is my instances file:

remote_locations:
    pop2:
      BGP_ASN: 12345
      SUBCLUSTER: pop2
      XMPP_SERVER_PORT: 5269
      BGP_PORT: 179
      CONTROL_INTROSPECT_LISTEN_PORT: 8083
provider_config:
    bms:
instances:
    bms_contr:
      ip: 10.10.0.100
      provider: bms
      roles:
        config_database:
        config:
        control:
        analytics_database:
        analytics:
        analytics_alarm:
        webui:
        openstack_control:
        openstack_network:
    remoteCp_node:
      ip: 10.10.0.150
      provider: bms
      roles:
        control_only:
          location: pop2
          PHYSICAL_INTERFACE: em2
          DEFAULT_LOCAL_IP: 10.10.0.150
          DEFAULT_IFACE: em2
    bms_compute:
      ip: 10.40.0.100
      provider: bms
      roles:
        vrouter:
           CONTROL_NODES: 10.10.0.150
           VROUTER_GATEWAY: 10.40.0.1
           location: pop2
           PHYSICAL_INTERFACE: "em1"
        openstack_compute:
           network_interface: "em1"
        openstack_storage:
global_configuration:
    CONTAINER_REGISTRY: opencontrailnightly
contrail_configuration:
    CONTRAIL_VERSION: latest
    CLOUD_ORCHESTRATOR: openstack
    RABBITMQ_NODE_PORT: 5673
    AUTH_MODE: keystone
    KEYSTONE_AUTH_URL_VERSION: /v3
    KEYSTONE_AUTH_ADMIN_PASSWORD: contrail123
    UPGRADE_KERNEL: false
    AAA_MODE: rbac
    METADATA_PROXY_SECRET: contrail123
    CONTROLLER_NODES: 10.10.0.100
    CONTROL_DATA_NET_LIST: 10.10.0.0/24,10.40.0.0/24
    CONFIGDB_NODES: 10.10.0.100
kolla_config:
  kolla_globals:
    network_interface: "em2"
    kolla_external_vip_interface: "em2"
    kolla_external_vip_address: "10.10.0.100"
    kolla_internal_vip_address: "10.10.0.100"
    kolla_internal_vip_interface: "em2"
    enable_haproxy: "no"
    enable_ironic: "no"
    enable_swift: "yes"
    enable_cinder: "yes"
    enable_cinder_backend_lvm: "yes"
    enable_ceph: "no"
    cinder_backup_driver: "swift"
    horizon_keystone_multidomain: true
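A quick way to sanity-check the remote compute part of a file like this is to compare each vrouter's CONTROL_NODES with the control_only nodes declared for the same location. The sketch below is only an illustration: the dict is a hand-trimmed copy of the instances above, `check_remote_locations` is a hypothetical helper and not part of contrail-ansible-deployer, and the rule it checks (a remote vrouter should point at control_only nodes of its own location) is an assumption about the intended layout.

```python
# Hand-trimmed copy of the instances above: only ip, roles, location and
# CONTROL_NODES are kept. All other keys are omitted on purpose.
INSTANCES = {
    "bms_contr": {"ip": "10.10.0.100", "roles": {"control": None}},
    "remoteCp_node": {
        "ip": "10.10.0.150",
        "roles": {"control_only": {"location": "pop2"}},
    },
    "bms_compute": {
        "ip": "10.40.0.100",
        "roles": {"vrouter": {"location": "pop2", "CONTROL_NODES": "10.10.0.150"}},
    },
}


def check_remote_locations(instances):
    """Hypothetical check: list vrouters whose CONTROL_NODES are not
    control_only nodes of the same location."""
    # Map each location to the IPs of its control_only nodes.
    controls = {}
    for node in instances.values():
        role = (node.get("roles") or {}).get("control_only")
        if role and "location" in role:
            controls.setdefault(role["location"], set()).add(node["ip"])

    problems = []
    for name, node in instances.items():
        role = (node.get("roles") or {}).get("vrouter")
        if not role or "location" not in role:
            continue  # not a remote vrouter, nothing to compare
        raw = str(role.get("CONTROL_NODES", ""))
        wanted = {ip.strip() for ip in raw.split(",") if ip.strip()}
        unknown = wanted - controls.get(role["location"], set())
        if unknown:
            problems.append(
                f"{name}: {sorted(unknown)} not a control_only node in {role['location']}"
            )
    return problems
```

For the values in this issue the check returns an empty list, so under that assumption the location wiring itself looks consistent.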

After the Ansible deployment, I get this error in almost every log file:

SANDESH: [DROP: WrongClientSMState] NodeStatusUVE: data = << name = process_status = [ << module_id = contrail-control-nodemgr instance_id = 0 state = Non-Functional connection_infos = [ << type = Collector name = server_addrs = [ 10.10.0.100:8086, ] status = Initializing description = Idle to Connect on EvIdleHoldTimerExpired >>, ] description = Collector connection down >>, ] >>

Collector log file:

SANDESH: Send FAILED: 1566470335049259 [SYS_NOTICE]: NodeStatusUVE: data= [ name = process_status= [ [ [ module_id = contrail-collector instance_id = 0 state = Non-Functional connection_infos= [ [ [ type = Collector name = server_addrs= [ [ (_iter6) = 10.10.0.100:8086, ] ] status = Initializing description = Connect : EvTcpConnected ], [ type = Database name = Cassandra server_addrs= [ [ (_iter6) = 10.10.0.100:9041, ] ] status = Up description = Established Cassandra connection ], [ type = Database name = RabbitMQ server_addrs= [ [ (_iter6) = 10.10.0.100:5673, ] ] status = Up description = RabbitMQ connection established ], [ type = Database name = :Global server_addrs= [ [ (_iter6) = 10.10.0.100:9042, ] ] status = Up description = ], [ type = Redis-UVE name = From server_addrs= [ [ (_iter6) = 127.0.0.1:6379, ] ] status = Up description = ], [ type = Redis-UVE name = To server_addrs= [ [ (_iter6) = 127.0.0.1:6379, ] ] status = Up description = ], [ type = KafkaPub name = 10.10.0.100:9092 server_addrs= [ [ (*_iter6) = 0.0.0.0:0, ] ] status = Down description = ], ] ] description = Collector, KafkaPub:10.10.0.100:9092 connection down ], ] ] ]
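These NodeStatusUVE lines are hard to read by eye. A small parser can list only the connections whose status is not Up. This is a rough sketch: the regular expression is written against the format pasted above (not against any Sandesh specification), the sample string is a shortened excerpt of the collector line, and `not_up_connections` is a hypothetical name.

```python
import re

# Shortened excerpt of the collector log line above (three of the seven entries).
SAMPLE = (
    "connection_infos= [ [ [ type = Collector name = server_addrs= "
    "[ [ (_iter6) = 10.10.0.100:8086, ] ] status = Initializing "
    "description = Connect : EvTcpConnected ], "
    "[ type = Database name = Cassandra server_addrs= "
    "[ [ (_iter6) = 10.10.0.100:9041, ] ] status = Up "
    "description = Established Cassandra connection ], "
    "[ type = KafkaPub name = 10.10.0.100:9092 server_addrs= "
    "[ [ (*_iter6) = 0.0.0.0:0, ] ] status = Down description = ], ] ]"
)

# type, optional name, then the first "status = X" after server_addrs.
CONN_RE = re.compile(r"type = (\S+) name = (.*?)\s*server_addrs.*?status = (\S+)")


def not_up_connections(line):
    """Return (type, name, status) for every connection whose status is not Up."""
    return [
        (ctype, name, status)
        for ctype, name, status in CONN_RE.findall(line)
        if status != "Up"
    ]


print(not_up_connections(SAMPLE))
```

On the excerpt this reports the Collector connection (Initializing) and the KafkaPub connection (Down), which matches the description field at the end of the log line.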

Contrail-status on main control node:

== Contrail control ==
control: active
nodemgr: active
named: active
dns: active

== Contrail analytics-alarm ==
nodemgr: active
kafka: active
alarm-gen: active

== Contrail database ==
nodemgr: active
query-engine: active
cassandra: active

== Contrail analytics ==
nodemgr: active
api: active
collector: active

== Contrail config-database ==
nodemgr: active
zookeeper: active
rabbitmq: active
cassandra: active

== Contrail webui ==
web: active
job: active

== Contrail device-manager ==

== Contrail config ==
svc-monitor: active
nodemgr: active
device-manager: active
api: active
schema: active

I simulate the provider network in GNS3, so this is not a connectivity issue between hosts. The OpenStack control node sees the remote compute node.
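General reachability between hosts does not by itself show that the collector port accepts connections from a given node. A minimal probe, assuming plain TCP to the collector address seen in the logs (10.10.0.100:8086), could look like this. `tcp_port_open` is a hypothetical helper name, and the check only confirms that a TCP handshake completes, not that the Sandesh session is established.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


# Example call, to be run from the remote control or compute node:
#   tcp_port_open("10.10.0.100", 8086)
```

Running it from both the remote control node and the remote compute node would show whether the collector port is reachable from the pop2 side as well as from the main site.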

When I deploy the environment without analytics_alarm (and thus without Kafka), I get the following in the collector log file (the error in the other log files is unchanged: Collector connection down):

[Thread 140371426252544, Pid 1]: SANDESH: Send FAILED: 1566481747362411 [SYS_NOTICE]: NodeStatusUVE: data= [ name = process_status= [ [ [ module_id = contrail-collector instance_id = 0 state = Non-Functional connection_infos= [ [ [ type = Collector name = server_addrs= [ [ (_iter6) = 10.10.0.100:8086, ] ] status = Initializing description = Connect : EvTcpConnected ], [ type = Database name = Cassandra server_addrs= [ [ (_iter6) = 10.10.0.100:9041, ] ] status = Up description = Established Cassandra connection ], [ type = Database name = RabbitMQ server_addrs= [ [ (_iter6) = 10.10.0.100:5673, ] ] status = Up description = RabbitMQ connection established ], [ type = Database name = :Global server_addrs= [ [ (_iter6) = 10.10.0.100:9042, ] ] status = Up description = ], [ type = Redis-UVE name = From server_addrs= [ [ (_iter6) = 127.0.0.1:6379, ] ] status = Up description = Redis(From) handling the auth callback ], [ type = Redis-UVE name = To server_addrs= [ [ (_iter6) = 127.0.0.1:6379, ] ] status = Up description = Redis(To) connecting to CallbackProcess ], ] ] description = Collector connection down ], ] ] ]

Thank you in advance, Wojtek

menkeyi commented 4 years ago

I also encountered this problem, did you solve it?

[root@opcompute /]# egrep -v "^#|^$" /etc/contrail/contrail-vrouter-agent.conf
[CONTROL-NODE]
servers=10.49.252.201:5269
[DEFAULT]
http_server_ip=0.0.0.0
collectors=10.49.252.202:8086
log_file=/var/log/contrail/contrail-vrouter-agent.log
log_level=SYS_NOTICE
log_local=1
hostname=bogon
agent_name=bogon
xmpp_dns_auth_enable=False
xmpp_auth_enable=False
physical_interface_mac = 00:0c:29:61:d4:f3
tsn_servers = []
[SANDESH]
introspect_ssl_enable=False
sandesh_ssl_enable=False
[NETWORKS]
control_network_ip=10.49.252.202
[DNS]
servers=10.49.252.201:53
[METADATA]
metadata_proxy_secret=contrail
[VIRTUAL-HOST-INTERFACE]
name=vhost0
ip=192.168.100.2/24
physical_interface=ens192
compute_node_address=192.168.100.2
gateway=10.49.252.23
[SERVICE-INSTANCE]
netns_command=/usr/bin/opencontrail-vrouter-netns
docker_command=/usr/bin/opencontrail-vrouter-docker
[HYPERVISOR]
type = kvm
[FLOWS]
fabric_snat_hash_table_size = 4096
[SESSION]
slo_destination = collector
sample_destination = collector

I changed physical_interface to the network card interface that carries the collectors IP (10.49.252.202), and after that it works normally.
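To see why the choice of physical_interface can matter here, it helps to compare the vhost0 network with the collector and gateway addresses in the same file. The following is a sketch only: the config text is a trimmed copy of the values pasted above (it is not known whether they were captured before or after the change), `summarize_agent_conf` is a hypothetical helper, and reading the file with Python's configparser is my own choice, not something the agent does.

```python
import configparser
import ipaddress

# Trimmed copy of the contrail-vrouter-agent.conf values pasted above.
AGENT_CONF = """\
[DEFAULT]
collectors=10.49.252.202:8086
[NETWORKS]
control_network_ip=10.49.252.202
[VIRTUAL-HOST-INTERFACE]
name=vhost0
ip=192.168.100.2/24
physical_interface=ens192
gateway=10.49.252.23
"""


def summarize_agent_conf(conf_text):
    """Report whether the collectors and the gateway fall inside the vhost0 network."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    vhost = parser["VIRTUAL-HOST-INTERFACE"]
    network = ipaddress.ip_interface(vhost["ip"]).network
    # collectors is a space-separated list of host:port entries.
    collectors = parser["DEFAULT"].get("collectors", "").split()
    hosts = [c.rsplit(":", 1)[0] for c in collectors]
    return {
        "physical_interface": vhost["physical_interface"],
        "vhost_network": str(network),
        "gateway_in_vhost_network": ipaddress.ip_address(vhost["gateway"]) in network,
        "collectors_in_vhost_network": [ipaddress.ip_address(h) in network for h in hosts],
    }


print(summarize_agent_conf(AGENT_CONF))
```

With the pasted values, neither the gateway nor the collector lies inside 192.168.100.0/24, which is at least consistent with the report that moving vhost0 to the interface on the collector's network made the error go away.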

majchwo commented 4 years ago

Unfortunately not, but frankly I ignored it, and my environment worked despite this error.

Regards, Wojtek

On Thu, 12 Dec 2019, 03:32, dataguru notifications@github.com wrote:

I also encountered this problem, did you solve it?
