Open termie opened 13 years ago
(by ewanmellor) Just "raise" without the "e" will reraise the current exception and keep the stack trace intact.
-----Original Message----- From: bounces@canonical.com [mailto:bounces@canonical.com] On Behalf Of justinsb Sent: 24 March 2011 05:04 To: Ewan Mellor Subject: [Bug 741468] [NEW] libvirt error handling is poor
Public bug reported:
The error handling in libvirt_conn is not great; there's a lot of catching of generic exceptions and assuming that any error is the error we think it's likely to be.
I hit this one hard when testing my 'don't erase all instances on host reboot patch'. For each instance, it called get_info, assuming that if this throws an exception that means the VM has been deleted. It then marks the instance deleted in the DB.
but libvirt wasn't started. (I don't know why, something was screwy with my upstart). Every get_info failed with a connection error, and every instance was removed. Not good.
I'm changing the generic "except:" handler to something like this:
try: virt_dom = self._conn.lookupByName(instancename) except libvirt.libvirtError as e: LOG.warning(("Error from libvirt during lookup: %s") % e) if e.get_error_code() == libvirt.VIR_ERR_UNKNOWNHOST: raise exception.NotFound(("Instance %s not found") % instance_name)
TODO(justinsb): How do I raise this and keep the stack
trace? raise e
(Though I haven't tested that code yet)
We probably need to go through every catch block in libvirt_conn and do something similar. We could probably use a helper function here to make this less painful, though it might be less readable.
\ Affects: nova Importance: Undecided Status: New
You received this bug notification because you are subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/741468
Title: libvirt error handling is poor
Status in OpenStack Compute (Nova): New
Bug description: The error handling in libvirt_conn is not great; there's a lot of catching of generic exceptions and assuming that any error is the error we think it's likely to be.
I hit this one hard when testing my 'don't erase all instances on host reboot patch'. For each instance, it called get_info, assuming that if this throws an exception that means the VM has been deleted. It then marks the instance deleted in the DB.
but libvirt wasn't started. (I don't know why, something was screwy with my upstart). Every get_info failed with a connection error, and every instance was removed. Not good.
I'm changing the generic "except:" handler to something like this:
try: virt_dom = self._conn.lookupByName(instancename) except libvirt.libvirtError as e: LOG.warning(("Error from libvirt during lookup: %s") % e) if e.get_error_code() == libvirt.VIR_ERR_UNKNOWNHOST: raise exception.NotFound(("Instance %s not found") % instance_name)
TODO(justinsb): How do I raise this and keep the stack
trace? raise e
(Though I haven't tested that code yet)
We probably need to go through every catch block in libvirt_conn and do something similar. We could probably use a helper function here to make this less painful, though it might be less readable.
(by justin-fathomdb) Thanks Ewan. I've been told that eventlet breaks that though...?
I did find a 3-arg raise command, but it was a little bit too much Python for me to get my head around.
(by blamar) From what I understand eventlet comes into play only if you explicitly call it during error handling (as in https://bugs.launchpad.net/glance/+bug/726742).
In this case I think the no-arg raise should work, since sys.exc_clear()
isn't ever called from eventlet.
(by rconradharris)
From what I understand eventlet comes into play only if you explicitly call it during error handling
Exactly right.
I did find a 3-arg raise command
I initially used a 3-arg raise in Glance to work around the problem. It worked beautifully, just one problem though, it's not pep-8 compat. After a few weeks of suffering though complaints, Jaypipes just made this change:
(by justin-fathomdb) Thanks for the explanation all.
I always like to think of the future evolution of the code, so while this might be safe today:
except libvirt.libvirtError as e: LOG.warning(_("Error from libvirt during lookup: %s") % e) raise
...it sounds like if we plugged in a centralized logging solution that fired off HTTP messages using eventlet (or even wrote to files using eventlet?), that this would no longer be safe.
But having seen the alternative (thanks Rick & Brian for the info) I think maybe it's better just to live with the no-arg throw... If we ever plug in an logger that clears exceptions, we can probably put the responsibility for making sure that the exceptions aren't swallowed on that code.
For anyone reading this bug that's a little confused, this discussion was about my TODO note. The "catch-all" libvirt error handling is still problematic.
The error handling in libvirt_conn is not great; there's a lot of catching of generic exceptions and assuming that any error is the error we think it's likely to be.
I hit this one hard when testing my 'don't erase all instances on host reboot patch'. For each instance, it called get_info, assuming that if this throws an exception that means the VM has been deleted. It then marks the instance deleted in the DB.
but libvirt wasn't started. (I don't know why, something was screwy with my upstart). Every get_info failed with a connection error, and every instance was removed. Not good.
I'm changing the generic "except:" handler to something like this:
try: virt_dom = self._conn.lookupByName(instancename) except libvirt.libvirtError as e: LOG.warning(("Error from libvirt during lookup: %s") % e) if e.get_error_code() == libvirt.VIR_ERR_UNKNOWNHOST: raise exception.NotFound(("Instance %s not found") % instance_name)
TODO(justinsb): How do I raise this and keep the stack trace?
(Though I haven't tested that code yet)
We probably need to go through every catch block in libvirt_conn and do something similar. We could probably use a helper function here to make this less painful, though it might be less readable.
Imported from Launchpad using lp2gh.