F5Networks / f5-openstack-lbaasv2-driver

F5 LBaaSv2 service provider driver for OpenStack Liberty and beyond
http://clouddocs.f5.com/products/openstack/lbaasv2-driver/master/
Apache License 2.0
21 stars 37 forks source link

Database deadlock can occur when deleting a loadbalancer #472

Closed pjbreaux closed 7 years ago

pjbreaux commented 7 years ago

OpenStack Release

Found on mitaka, but likely exists in liberty as well.

Bug Severity

This is a rare occurrence.

Description

By default, a loadbalancer manages the VIP port it uses for traffic. This means part of the lb creation is creating the VIP port and allocation an IP address. Likewise, when tearing down an lb, the port is also torn down.

There are cases, such as when running tests, that we can deadlock the database by calling to delete an lb. If a test calls to delete the lb, then does not wait for the lb to be gone from the db, and goes then to actively remove the VIP port, a deadlock can occur. The lb delete operation locks a row in the lb table, which has a foreign key dependency on a port, and the test call to delete the VIP port locks a row in the ports tables. This creates the deadlock before any of the operations can complete. It is important to note that this can happen in a few situations, but this is a common one in our tests. The solution is generally just to retry the operation. If we catch the exception and retry the lb delete, this should succeed. If the lb is already gone by then, we can delete the VIP port outright.

Will update with a traceback.

pjbreaux commented 7 years ago

Traceback when trying to delete a loadbalancer in test teardown test_l7policy_rules.py::TestL7PolicyTestJSONRedirectToUrl::test_policy_redirect_url_contains:

Server log:

2017-03-10 11:13:42.783 53853 DEBUG f5lbaasdriver.v2.bigip.plugin_rpc [req-b562475c-6b68-49a2-ae25-b8816472ba35 - - - - -] f5lbaasdriver.v2.bigip.plugin_rpc.LBaaSv2PluginCallbacksRPC method loadbalancer_destroyed called with arguments (<neutron.context.Context object at 0x7054090>,) {u'loadbalancer_id': u'abf62ffa-2ba4-4e2c-9bc7-cc926979151b'} wrapper /usr/lib/python2.7/site-packages/oslo_log/helpers.py:45
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher [req-b562475c-6b68-49a2-ae25-b8816472ba35 - - - - -] Exception during message handling: (pymysql.err.InternalError) (1213, u'Deadlock found when trying to get lock; try restarting transaction') [SQL: u'DELETE FROM lbaas_loadbalancer_statistics WHERE lbaas_loadbalancer_statistics.loadbalancer_id = %(loadbalancer_id)s'] [parameters: {'loadbalancer_id': u'abf62ffa-2ba4-4e2c-9bc7-cc926979151b'}]
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 138, in _dispatch_and_reply
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     incoming.message))
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 185, in _dispatch
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     return self._do_dispatch(endpoint, method, ctxt, args)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 127, in _do_dispatch
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     result = func(ctxt, **new_args)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/oslo_log/helpers.py", line 46, in wrapper
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     return method(*args, **kwargs)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/f5lbaasdriver/v2/bigip/plugin_rpc.py", line 271, in loadbalancer_destroyed
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     self.driver.plugin.db.delete_loadbalancer(context, loadbalancer_id)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/neutron_lbaas/db/loadbalancer/loadbalancer_dbv2.py", line 242, in delete_loadbalancer
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     context.session.delete(lb_db)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 502, in __exit__
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     self.rollback()
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 60, in __exit__
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     compat.reraise(exc_type, exc_value, exc_tb)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 499, in __exit__
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     self.commit()
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 392, in commit
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     self._prepare_impl()
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 372, in _prepare_impl
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     self.session.flush()
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2027, in flush
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     self._flush(objects)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2145, in _flush
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     transaction.rollback(_capture_exception=True)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 60, in __exit__
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     compat.reraise(exc_type, exc_value, exc_tb)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2109, in _flush
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     flush_context.execute()
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 373, in execute
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     rec.execute(self)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 561, in execute
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     uow
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 249, in delete_obj
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     cached_connections, mapper, table, delete)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 892, in _emit_delete_statements
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     c = connection.execute(statement, del_objects)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 914, in execute
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     return meth(self, multiparams, params)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 323, in _execute_on_connection
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     return connection._execute_clauseelement(self, multiparams, params)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1010, in _execute_clauseelement
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     compiled_sql, distilled_params
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1146, in _execute_context
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     context)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1337, in _handle_dbapi_exception
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     util.raise_from_cause(newraise, exc_info)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 200, in raise_from_cause
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     reraise(type(exception), exception, tb=exc_tb)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     context)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 450, in do_execute
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     cursor.execute(statement, parameters)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 166, in execute
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     result = self._query(query)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 322, in _query
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     conn.query(q)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 841, in query
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1029, in _read_query_result
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     result.read()
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1312, in read
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     first_packet = self.connection._read_packet()
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 991, in _read_packet
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     packet.check_error()
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 393, in check_error
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     err.raise_mysql_exception(self._data)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher   File "/usr/lib/python2.7/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher     raise errorclass(errno, errval)
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher DBDeadlock: (pymysql.err.InternalError) (1213, u'Deadlock found when trying to get lock; try restarting transaction') [SQL: u'DELETE FROM lbaas_loadbalancer_statistics WHERE lbaas_loadbalancer_statistics.loadbalancer_id = %(loadbalancer_id)s'] [parameters: {'loadbalancer_id': u'abf62ffa-2ba4-4e2c-9bc7-cc926979151b'}]
2017-03-10 11:13:42.920 53853 ERROR oslo_messaging.rpc.dispatcher
2017-03-10 11:13:42.948 53853 DEBUG neutron.plugins.ml2.plugin [req-b562475c-6b68-49a2-ae25-b8816472ba35 - - - - -] Deleting port 63097b67-13a8-4ca6-bc48-2dd0d8ce8748 _pre_delete_port /usr/lib/python2.7/site-packages/neutron/plugins/ml2/plugin.py:1391
2017-03-10 11:13:42.948 53853 DEBUG neutron.callbacks.manager [req-b562475c-6b68-49a2-ae25-b8816472ba35 - - - - -] Notify callbacks for port, before_delete _notify_loop /usr/lib/python2.7/site-packages/neutron/callbacks/manager.py:138
2017-03-10 11:13:42.949 53853 DEBUG neutron.callbacks.manager [req-b562475c-6b68-49a2-ae25-b8816472ba35 - - - - -] Calling callback neutron_lbaas.db.loadbalancer.loadbalancer_dbv2._prevent_lbaasv2_port_delete_callback _notify_loop /usr/lib/python2.7/site-packages/neutron/callbacks/manager.py:145
2017-03-10 11:13:42.973 53853 DEBUG neutron.callbacks.manager [req-b562475c-6b68-49a2-ae25-b8816472ba35 - - - - -] Calling callback neutron.db.l3_db._prevent_l3_port_delete_callback _notify_loop /usr/lib/python2.7/site-packages/neutron/callbacks/manager.py:145

Tempest output during same time, even though time looks off, its only the clock on the test runner:

2017-03-10 18:13:36.477 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:_run_cleanups): 204 DELETE http://10.190.4.152:9696/v2.0/lbaas/pools/ac9113fe-8aa0-4139-a367-757a462b9d70 0.564s
2017-03-10 18:13:36.797 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:_run_cleanups): 200 GET http://10.190.4.152:9696/v2.0/lbaas/loadbalancers/abf62ffa-2ba4-4e2c-9bc7-cc926979151b 0.319s
2017-03-10 18:13:38.132 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:_run_cleanups): 200 GET http://10.190.4.152:9696/v2.0/lbaas/loadbalancers/abf62ffa-2ba4-4e2c-9bc7-cc926979151b 0.333s
PASSED2017-03-10 18:13:38.145 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 404 GET http://10.190.4.152:9696/v2.0/lbaas/loadbalancers/c8afad0d-9f62-4a23-934f-efbdbdbcdc72/statuses 0.012s
2017-03-10 18:13:38.158 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 404 GET http://10.190.4.152:9696/v2.0/lbaas/loadbalancers/12e04d2a-448a-4ded-b3cb-dbfe97385e58/statuses 0.012s
2017-03-10 18:13:38.385 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 200 GET http://10.190.4.152:9696/v2.0/lbaas/loadbalancers/abf62ffa-2ba4-4e2c-9bc7-cc926979151b/statuses 0.226s
2017-03-10 18:13:38.963 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 204 DELETE http://10.190.4.152:9696/v2.0/lbaas/pools/da761f8c-af9c-4b19-9cd7-b60933ff6115 0.577s
2017-03-10 18:13:39.159 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 200 GET http://10.190.4.152:9696/v2.0/lbaas/loadbalancers/abf62ffa-2ba4-4e2c-9bc7-cc926979151b 0.195s
2017-03-10 18:13:40.274 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 200 GET http://10.190.4.152:9696/v2.0/lbaas/loadbalancers/abf62ffa-2ba4-4e2c-9bc7-cc926979151b 0.113s
2017-03-10 18:13:40.978 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 204 DELETE http://10.190.4.152:9696/v2.0/lbaas/listeners/244fa11d-b47d-42da-8e9f-d2176ab844be 0.703s
2017-03-10 18:13:41.089 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 200 GET http://10.190.4.152:9696/v2.0/lbaas/loadbalancers/abf62ffa-2ba4-4e2c-9bc7-cc926979151b 0.110s
2017-03-10 18:13:42.195 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 200 GET http://10.190.4.152:9696/v2.0/lbaas/loadbalancers/abf62ffa-2ba4-4e2c-9bc7-cc926979151b 0.104s
2017-03-10 18:13:42.632 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 204 DELETE http://10.190.4.152:9696/v2.0/lbaas/loadbalancers/abf62ffa-2ba4-4e2c-9bc7-cc926979151b 0.437s
2017-03-10 18:13:42.720 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 200 GET http://10.190.4.152:9696/v2.0/lbaas/loadbalancers/abf62ffa-2ba4-4e2c-9bc7-cc926979151b 0.087s
2017-03-10 18:13:43.737 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 404 GET http://10.190.4.152:9696/v2.0/lbaas/loadbalancers/abf62ffa-2ba4-4e2c-9bc7-cc926979151b 0.016s
2017-03-10 18:13:43.773 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 404 GET http://10.190.4.152:9696/v2.0/ports/63097b67-13a8-4ca6-bc48-2dd0d8ce8748 0.035s
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py:852: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
2017-03-10 18:13:44.403 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 204 DELETE http://10.190.4.152:9696/v2.0/subnets/d28596d0-4981-41da-8a08-d0dc8b607563 0.610s
2017-03-10 18:13:44.693 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 204 DELETE http://10.190.4.152:9696/v2.0/networks/c62a2f2b-a6fc-4cb4-9b8b-1f16b7f0a286 0.290s
2017-03-10 18:13:44.773 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 204 DELETE http://10.190.4.152:35357/v2.0/users/8adc5d7e97f64daaaf12a85a802f75d6 0.080s
2017-03-10 18:13:44.963 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 200 GET http://10.190.4.152:9696/v2.0/security-groups?tenant_id=a82226b8c2134d1195ee1737f43d0e5b&name=default 0.188s
2017-03-10 18:13:45.181 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 204 DELETE http://10.190.4.152:9696/v2.0/security-groups/e181e412-6ef2-4f95-aada-f7721336a683 0.217s
2017-03-10 18:13:45.259 2859 INFO tempest.lib.common.rest_client [req-24bb822f-acca-45aa-a94e-947a79f3cd98 - - - - -] Request (TestL7PolicyTestJSONRedirectToUrl:tearDownClass): 204 DELETE http://10.190.4.152:35357/v2.0/tenants/a82226b8c2134d1195ee1737f43d0e5b 0.077s
pjbreaux commented 7 years ago

This is not a test-bug. This is a product issue. @dflanigan

pjbreaux commented 7 years ago

Another way this problem can show up is doing an lb create and delete via the CLI. The deadlock does not always occur, as timing is involved, but I can get it disappear by commenting out the line:

https://github.com/F5Networks/f5-openstack-agent/blob/mitaka/f5_openstack_agent/lbaasv2/drivers/bigip/icontrol_driver.py#L1312

The update_loadbalancer_status line is duplicated in the above function, and the service object does contain an lb with PENDING_DELETE provisioning_status, so we call back to the driver twice to perform the same operation. If things move too quickly for the neutron database, we can get a deadlock.

https://github.com/F5Networks/f5-openstack-agent/blob/mitaka/f5_openstack_agent/lbaasv2/drivers/bigip/icontrol_driver.py#L1312

@richbrowne

ssorenso commented 7 years ago

I've reproduced this while using strace on both the neutron-lbaas-agent as well as f5-openstack-agent.

The logs are here in this .tar.gz.

mattgreene commented 7 years ago

Dropping priority on this issue until we resolve the RPC hang and see if the nightly becomes stable.

mattgreene commented 7 years ago

Resolved via https://github.com/F5Networks/f5-openstack-agent/pull/644