sap-oc / cookbook-openstack-network

Chef Cookbook - OpenStack Network
http://openstack.org

OCF resource should not replicate DHCP #10

Open matelakat opened 7 years ago

matelakat commented 7 years ago

The neutron-ha-tool's resource script does DHCP replication on start, which should not be done, as neutron by default takes care of this through the dhcp_agents_per_network parameter.

matelakat commented 7 years ago

https://review.openstack.org/#/c/463320/

matelakat commented 7 years ago

Waiting for review

matelakat commented 7 years ago

As per @aspiers and Ralf, we need to check whether neutron takes care of DHCP server replication when one of the agents dies.

aspiers commented 7 years ago

It looks like it probably does: https://github.com/openstack/neutron/blob/master/neutron/scheduler/dhcp_agent_scheduler.py but we still need to verify that before removing the functionality.

matelakat commented 7 years ago

Building a cloud to test this.

matelakat commented 7 years ago

Created an HA cloud with 3 network nodes. Initially, every network node launched a DHCP server. As a first step, I need to make sure that the number of DHCP servers per network is lower than the number of network nodes.

On Crowbar, I modified the dhcp_agents_per_network parameter (TODO: check how this is normally done):

mkdir /root/10
cd /root/10 
cp /opt/dell/chef/cookbooks/neutron/templates/default/neutron.conf.erb ./
sed -i -e 's,^dhcp_agents_per_network.*$,dhcp_agents_per_network = 2,g' /opt/dell/chef/cookbooks/neutron/templates/default/neutron.conf.erb
knife cookbook upload -o /opt/dell/chef/cookbooks/ neutron
for node in $(knife search node 'roles:neutron-network' -i | grep "^d52"); do ssh $node chef-client; done
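Before editing the real template, the same substitution can be tried on a scratch file. This is only a sketch: `set_dhcp_agents_per_network` is a hypothetical helper name, the scratch file content is made up, and `sed -i` is assumed to be GNU sed, as in the commands above.

```shell
# Hypothetical helper wrapping the sed call used above.
# $1: file to edit in place, $2: desired number of DHCP agents per network.
set_dhcp_agents_per_network() {
    file=$1
    count=$2
    sed -i -e "s,^dhcp_agents_per_network.*\$,dhcp_agents_per_network = ${count},g" "$file"
}

# Dry run on a scratch file with made-up content, not the real template.
scratch=$(mktemp)
printf '%s\n' '[DEFAULT]' 'dhcp_agents_per_network = 3' 'debug = false' > "$scratch"
set_dhcp_agents_per_network "$scratch" 2
grep '^dhcp_agents_per_network' "$scratch"
rm -f "$scratch"
```

Note that the substitution only rewrites an existing line that starts with dhcp_agents_per_network; if the template has no such line, the file is left unchanged.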

After the change I restarted the L3 agents as well. TODO: check whether the cookbook already does this.

And created a new network:

openstack network create testnetwork
neutron subnet-create --name testsubnet testnetwork 192.168.37.0/24

And verified that it is only running on two of the DHCP agents:

root@d52-54-77-77-01-01:~ # for agent in $(neutron agent-list -F id -F agent_type -f value | grep DHCP | cut -d " " -f 1); do neutron net-list-on-dhcp-agent $agent -F id -f value; done | wc
      2       8     204
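The `wc` check above counts every network on every DHCP agent, so it only works while testnetwork is the only network. A slightly more targeted sketch, built from the same two neutron commands, counts the agents that host one named network. `count_dhcp_agents_for_net` is a hypothetical helper name, and matching on the network name as a whole word in the listing is an assumption about the output format shown in this thread.

```shell
# Hypothetical helper: print how many DHCP agents host the named network.
# $1: network name, e.g. testnetwork
count_dhcp_agents_for_net() {
    net_name=$1
    count=0
    # Same agent enumeration as in the one-liner above.
    for agent in $(neutron agent-list -F id -F agent_type -f value | grep DHCP | cut -d " " -f 1); do
        # Assumption: the network name appears as a separate word in the listing.
        if neutron net-list-on-dhcp-agent "$agent" -F id -f value | grep -qw -- "$net_name"; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

Running `count_dhcp_agents_for_net testnetwork` on the controller should then print the number of hosting agents, which is expected to match dhcp_agents_per_network.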

After that one L3 agent was shut down by halting the host machine.

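After some time, the network was moved to the remaining agent. The listings below show the state before the move, the status of the dead agent, and the state after the move: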

root@d52-54-77-77-01-01:~ # for agent in $(neutron agent-list -F id -F agent_type -f value | grep DHCP | cut -d " " -f 1); do echo "on $agent:"; neutron net-list-on-dhcp-agent $agent -F id -f value; done
on 7db64350-44ef-40ec-9290-3ea64354420d:
on c17300b8-f52d-4b5f-8bf3-87a401d0c23c:
b7d1b52d-7b8a-4148-a03f-0f274703c8a5 testnetwork 241acb77-79d0-4afc-9081-f5ce10ef74ea 192.168.37.0/24
on 09775f1b-9bdb-4bc4-b553-d4041f7dbb95:
b7d1b52d-7b8a-4148-a03f-0f274703c8a5 testnetwork 241acb77-79d0-4afc-9081-f5ce10ef74ea 192.168.37.0/24
root@d52-54-77-77-01-01:~ # neutron agent-show c17300b8-f52d-4b5f-8bf3-87a401d0c23c | grep alive
| alive               | False
root@d52-54-77-77-01-01:~ # for agent in $(neutron agent-list -F id -F agent_type -f value | grep DHCP | cut -d " " -f 1); do echo "on $agent:"; neutron net-list-on-dhcp-agent $agent -F id -f value; done
on 09775f1b-9bdb-4bc4-b553-d4041f7dbb95:
b7d1b52d-7b8a-4148-a03f-0f274703c8a5 testnetwork 241acb77-79d0-4afc-9081-f5ce10ef74ea 192.168.37.0/24
on c17300b8-f52d-4b5f-8bf3-87a401d0c23c:
on 7db64350-44ef-40ec-9290-3ea64354420d:
b7d1b52d-7b8a-4148-a03f-0f274703c8a5 testnetwork 241acb77-79d0-4afc-9081-f5ce10ef74ea 192.168.37.0/24

But the change definitely took some time.
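To put a number on "some time", a small polling loop could be used. This is a sketch, not something that was run in this test: `time_until` is a hypothetical helper that repeats an arbitrary command until it succeeds and prints the elapsed seconds, with one-second resolution at best.

```shell
# Hypothetical helper: poll a command until it succeeds, then print elapsed seconds.
# $1: timeout in seconds, $2: poll interval in seconds, rest: command to run.
time_until() {
    timeout=$1
    interval=$2
    shift 2
    start=$(date +%s)
    while ! "$@" >/dev/null 2>&1; do
        now=$(date +%s)
        if [ $((now - start)) -ge "$timeout" ]; then
            echo "timed out after ${timeout}s" >&2
            return 1
        fi
        sleep "$interval"
    done
    echo $(( $(date +%s) - start ))
}
```

For example, started right after halting the host, `time_until 600 5 sh -c 'neutron net-list-on-dhcp-agent 7db64350-44ef-40ec-9290-3ea64354420d -F id -f value | grep -qw testnetwork'` would report roughly how long it took until the previously idle agent picked up testnetwork. The 600 second timeout and 5 second interval are arbitrary choices.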

aspiers commented 6 years ago

I finally noticed this and have merged https://review.openstack.org/#/c/463320/