F5Networks / f5-openstack-agent

The F5 Agent for OpenStack Neutron allows you to deploy BIG-IP services in an OpenStack environment.
http://clouddocs.f5.com/products/openstack/agent/latest
Apache License 2.0

New objects added when Agent service is down cause a rolling traceback over new objects #599

Open ssorenso opened 7 years ago

ssorenso commented 7 years ago

Agent Version

Latest mitaka available to testlab

Operating System

CentOS 7 (testlab)

OpenStack Release

Mitaka (whatever's in testlab)

Description

A rolling traceback occurs with the iControl REST interface via the following steps:

  1. Create a tlc overcloud tempest session
    • I used 11.6.0
  2. From neutron create a playground subnet
    • I used GW: 10.22.22.1; Allocation Pool: 10.22.22.2 - 10.22.22.40; with a 10.22.22.0/22 CIDR
  3. From neutron create a loadbalancer on this subnet
  4. From neutron create a listener on this loadbalancer
  5. From neutron create a pool on this listener
  6. Create an initial member (may not be necessary) on this pool
  7. Run systemctl stop f5-openstack-agent
  8. From neutron create a new member on the same pool
  9. Run systemctl start f5-openstack-agent
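Steps 3 through 9 can be sketched as a small script. This is only an illustration under stated assumptions: it assumes the LBaaS v2 `neutron lbaas-*` CLI commands are available on the host, and the object names (`lb1`, `listener1`, `pool1`), the subnet name, the HTTP protocol, and the ROUND_ROBIN algorithm are hypothetical choices not given in the report. The member addresses and port 80 come from the traceback and the description. On a real system each object may need to reach ACTIVE before the next command succeeds, so a live run may need waits between steps.

```python
import subprocess


def repro_commands(subnet, pool="pool1", existing_ip="10.22.22.6", new_ip="10.22.22.7"):
    """Return steps 3-9 of the reproduction as argv lists, in order."""
    def member(ip):
        return ["neutron", "lbaas-member-create", "--subnet", subnet,
                "--address", ip, "--protocol-port", "80", pool]

    return [
        ["neutron", "lbaas-loadbalancer-create", "--name", "lb1", subnet],
        ["neutron", "lbaas-listener-create", "--name", "listener1",
         "--loadbalancer", "lb1", "--protocol", "HTTP", "--protocol-port", "80"],
        ["neutron", "lbaas-pool-create", "--name", pool, "--listener", "listener1",
         "--protocol", "HTTP", "--lb-algorithm", "ROUND_ROBIN"],
        member(existing_ip),
        # systemctl takes the verb before the unit name.
        ["systemctl", "stop", "f5-openstack-agent"],
        member(new_ip),  # created while the agent is down
        ["systemctl", "start", "f5-openstack-agent"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.check_call(argv)


# "playground-subnet" is a hypothetical subnet name.
run(repro_commands("playground-subnet"), dry_run=True)
```

With `dry_run=True` the script only prints the commands, which makes it easy to review the sequence before running it against a test lab.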

Rolling traceback trace:

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/f5_openstack_agent/lbaasv2/drivers/bigip/icontrol_driver.py", line 1235, in _common_service_handler
    all_subnet_hints)
  File "/usr/lib/python2.7/site-packages/f5_openstack_agent/lbaasv2/drivers/bigip/lbaas_builder.py", line 70, in assure_service
    self._assure_members(service, all_subnet_hints)
  File "/usr/lib/python2.7/site-packages/f5_openstack_agent/lbaasv2/drivers/bigip/lbaas_builder.py", line 255, in _assure_members
    self.pool_builder.update_member(svc, bigips)
  File "/usr/lib/python2.7/site-packages/f5_openstack_agent/lbaasv2/drivers/bigip/pool_service.py", line 185, in update_member
    m.modify(**member)
  File "/usr/lib/python2.7/site-packages/f5/bigip/resource.py", line 409, in modify
    self._modify(**patch)
  File "/usr/lib/python2.7/site-packages/f5/bigip/resource.py", line 401, in _modify
    response = session.patch(patch_uri, json=patch, **requests_params)
  File "/usr/lib/python2.7/site-packages/icontrol/session.py", line 272, in wrapper
    raise iControlUnexpectedHTTPError(error_message, response=response)
iControlUnexpectedHTTPError: 400 Unexpected Error: Bad Request for uri: https://10.190.3.71:443/mgmt/tm/ltm/pool/~Project_cfc0a9759d684db3840f3b174645a87f~Project_1c9e1f1c-0703-4c3c-b982-f477d3462c41/members/~Project_cfc0a9759d684db3840f3b174645a87f~10.22.22.6:80/
Text: u' {"code":400,"message":"\\"address\\" may not be specified in the context of the \\"modify\\" command. \\"address\\" may be specified using the following commands: create, list, show","errorStack":[]}
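The 400 response says that BIG-IP rejects "address" in a modify (PATCH) call and accepts it only on create, list, and show. One possible way to avoid the error, sketched below, is to drop create-only fields from the member payload before calling modify. This is an illustration only, not the agent's actual fix: the helper `strip_create_only` and the `CREATE_ONLY_FIELDS` tuple are hypothetical names, only "address" is confirmed as create-only by the error message, and the sample payload keys other than "address" are assumptions.

```python
# Hypothetical helper; not part of f5-openstack-agent or the f5 SDK.
# Only "address" is known (from the error message) to be create-only.
CREATE_ONLY_FIELDS = ("address",)


def strip_create_only(member, create_only=CREATE_ONLY_FIELDS):
    """Return a copy of the member payload without create-only fields."""
    return {key: value for key, value in member.items() if key not in create_only}


# Illustrative payload; every key except "address" is an assumption.
member = {
    "name": "10.22.22.6:80",
    "partition": "Project_example",
    "address": "10.22.22.6",
    "ratio": 1,
}

patch = strip_create_only(member)
print(sorted(patch))  # ['name', 'partition', 'ratio']
```

The original dictionary is left untouched, so the same payload could still be used for a create call if the member turns out not to exist yet.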

After waiting for quite some time, I still could not see the new member (10.22.22.7) added. (A screenshot can be provided upon request.)

Deployment

Standard testlab configuration with tempest tlc/overcloud.

ssorenso commented 7 years ago

This also appears to happen for health monitors:

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/f5_openstack_agent/lbaasv2/drivers/bigip/icontrol_driver.py", line 1235, in _common_service_handler
    all_subnet_hints)
  File "/usr/lib/python2.7/site-packages/f5_openstack_agent/lbaasv2/drivers/bigip/lbaas_builder.py", line 70, in assure_service
    self._assure_members(service, all_subnet_hints)
  File "/usr/lib/python2.7/site-packages/f5_openstack_agent/lbaasv2/drivers/bigip/lbaas_builder.py", line 255, in _assure_members
    self.pool_builder.update_member(svc, bigips)
  File "/usr/lib/python2.7/site-packages/f5_openstack_agent/lbaasv2/drivers/bigip/pool_service.py", line 185, in update_member
    m.modify(**member)
  File "/usr/lib/python2.7/site-packages/f5/bigip/resource.py", line 409, in modify
    self._modify(**patch)
  File "/usr/lib/python2.7/site-packages/f5/bigip/resource.py", line 401, in _modify
    response = session.patch(patch_uri, json=patch, **requests_params)
  File "/usr/lib/python2.7/site-packages/icontrol/session.py", line 272, in wrapper
    raise iControlUnexpectedHTTPError(error_message, response=response)
iControlUnexpectedHTTPError: 400 Unexpected Error: Bad Request for uri: https://10.190.3.42:443/mgmt/tm/ltm/pool/~Project_52509f12fe4640b88377df5d8c1f10ad~Project_0ddb2ec1-bf40-439d-966b-80e2d6c54edf/members/~Project_52509f12fe4640b88377df5d8c1f10ad~10.22.22.6:80/
Text: u'{"code":400,"message":"\\"address\\" may not be specified in the context of the \\"modify\\" command. \\"address\\" may be specified using the following commands: create, list, show","errorStack":[]}'

I can open a new ticket if necessary.

ssorenso commented 7 years ago

Another way to cause the same error is to create member objects in neutron faster than the agent can keep up, that is, faster than the agent's polling rate for data from neutron. This can be re-created by setting periodic_interval to a larger number of seconds than you wait between creation commands. Therefore, this bug can be reproduced without shutting down the agent.
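The timing condition described above can be modeled with a few lines of arithmetic. This is a simplified sketch, not agent code: it assumes the agent polls at fixed multiples of periodic_interval starting from time zero, which the report does not state, and the function names are hypothetical.

```python
def creations_per_sync(creation_times, periodic_interval):
    """Group creation times (in seconds) by the polling cycle they fall into."""
    cycles = {}
    for t in creation_times:
        # Cycle n covers the window [n * interval, (n + 1) * interval).
        cycle = int(t // periodic_interval)
        cycles.setdefault(cycle, []).append(t)
    return cycles


def outpaces_agent(creation_times, periodic_interval):
    """True if any single polling cycle has to absorb more than one new object."""
    batches = creations_per_sync(creation_times, periodic_interval).values()
    return any(len(batch) > 1 for batch in batches)


# Members created 5 seconds apart.
times = [0, 5, 10]
print(outpaces_agent(times, periodic_interval=60))  # True: all three land in one cycle
print(outpaces_agent(times, periodic_interval=2))   # False: each lands in its own cycle
```

Under this model, raising periodic_interval above the gap between creation commands is enough to put several new members into a single sync, which matches the condition the comment describes.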

With this re-creation, I can verify that the object can be destroyed (thus clearing the provisioning status). However, in the 3 times that I've re-created this problem, the only object whose provisioning status was set to ERROR is the loadbalancer, and that happened only the first time I created this state (I should add that I saw...).

richbrowne commented 6 years ago

We should verify this is fixed as a part of agent resiliency.