Open ghost opened 9 years ago
If the device isn't ready yet, the command should probably be retried later. Is there a way to signal to neutron that the command should be re-queued? If a record has been created (for whatever reason) but there is no corresponding object on the device, deleting the object in neutron should still succeed.
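Whether neutron itself can re-queue a command is an open question here. As a rough illustration of the retry idea only, a driver-side wrapper could look something like the sketch below. `DeviceNotReady` and `retry_when_not_ready` are hypothetical names invented for this example, not part of neutron or the driver.

```python
import time


class DeviceNotReady(Exception):
    """Hypothetical error meaning the appliance is not ready yet."""


def retry_when_not_ready(command, attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Call command(), retrying on DeviceNotReady up to `attempts` calls in total."""
    for attempt in range(1, attempts + 1):
        try:
            return command()
        except DeviceNotReady:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the failure
            sleep(delay_seconds * attempt)  # wait a little longer each time
```

The `sleep` parameter exists only so the wait can be swapped out in tests. Any other exception is deliberately not retried, since it is unlikely to go away on its own.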
We should at least set the status_description when commands fail, so the reason is visible in the neutron commands as well as in the logs.
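A minimal sketch of that idea, assuming the record can be treated as a plain dict with `status` and `status_description` keys. The real driver would go through the neutron DB layer instead; `run_and_record` is a hypothetical helper used only for illustration.

```python
import logging

LOG = logging.getLogger(__name__)


def run_and_record(record, command):
    """Run command(); on failure, store the reason on the record.

    `record` is a plain dict standing in for the neutron DB row (assumption).
    Returns True on success, False on failure.
    """
    try:
        command()
    except Exception as exc:
        # Keep the log entry, but also surface the reason to the user.
        LOG.error("command failed: %s", exc)
        record["status"] = "ERROR"
        record["status_description"] = str(exc)
        return False
    record["status"] = "ACTIVE"
    record["status_description"] = None
    return True
```

With something like this in place, the device's own error text would end up in status_description rather than null.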
The user experience when encountering errors is pretty bad. I created a health monitor and pool and associated them together.
neutron lb-healthmonitor-create --delay 200 --max-retries 2 --timeout 100 --type TCP
neutron lb-pool-create --name pool1 --lb-method ROUND_ROBIN --subnet-id public-subnet --protocol TCP
neutron lb-healthmonitor-associate ... pool1
The only response I got from the lb-healthmonitor-associate command is
Request Failed: internal server error while processing your request.
The logs have a much more helpful message
ACOSException: 1162 Invalid integer. Parameter interval in method slb.hm.create. The valid scope is 1 - 180.
It would be nice if that message found its way into neutron lb-healthmonitor-show ...
+----------------+----------------------------------------------------------------------------------------------------+
| Field | Value |
+----------------+----------------------------------------------------------------------------------------------------+
| admin_state_up | True |
| delay | 200 |
| id | 232e59c7-5bb7-4050-b3cc-a7dd3c23face |
| max_retries | 2 |
| pools | {"status": "ERROR", "status_description": null, "pool_id": "42161812-21dc-4013-80cf-ef80c0899809"} |
| tenant_id | 2d205c4f252249acaa8c1bc53b40f1dd |
| timeout | 100 |
| type | TCP |
+----------------+----------------------------------------------------------------------------------------------------+
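The failure above appears to come from the delay of 200 falling outside the 1 - 180 range quoted in the ACOSException. One option would be to check the value before calling the device, so the user gets a readable message up front. The sketch below assumes that the neutron delay maps directly onto the device's interval parameter and that the bounds in the log message are accurate. `check_interval` is a hypothetical helper.

```python
# Bounds copied from the ACOSException text; treat them as an assumption.
MIN_INTERVAL = 1
MAX_INTERVAL = 180


def check_interval(delay):
    """Return delay unchanged if the device should accept it, else raise ValueError."""
    if not MIN_INTERVAL <= delay <= MAX_INTERVAL:
        raise ValueError(
            "delay %d is outside the valid range %d - %d"
            % (delay, MIN_INTERVAL, MAX_INTERVAL)
        )
    return delay
```

A check like this would not replace recording the device error, since the device can reject requests for reasons the driver does not know about.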
Returning and recording nice error messages is #179
Currently, if you try to do something like lbaas-loadbalancer-create on a device that isn't ready yet (you just rebuilt the appliance in stack, for example), the driver throws an error but the elements are still created in OpenStack. You can't delete them because they don't exist on the device, so you're forced to manually delete the records from the DB (not good). We need to ensure that the error condition is propagated to the caller so it knows not to create DB records etc. for objects that are broken.
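For the delete half of the problem, one possible approach is to treat "not found on the device" as success and still remove the record. This is only a sketch under assumptions: `DeviceObjectNotFound`, `device_delete`, and `db_delete` are hypothetical stand-ins for whatever the driver and neutron actually provide.

```python
class DeviceObjectNotFound(Exception):
    """Hypothetical error meaning the object does not exist on the device."""


def delete_everywhere(object_id, device_delete, db_delete):
    """Delete from the device if present, then always remove the DB record."""
    try:
        device_delete(object_id)
    except DeviceObjectNotFound:
        # Nothing on the device, so only the record needs to go.
        pass
    db_delete(object_id)
```

Other device errors are left to propagate, so a record is not dropped while the object may still exist on the appliance.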