Need better logging and error reporting

zerovm / zerocloud

Swift middleware for Zerocloud

Apache License 2.0

Need better logging and error reporting #106

Open larsbutler opened 10 years ago

larsbutler commented 10 years ago

When working with local deployments of ZeroCloud (on Devstack), it's difficult to troubleshoot issues. Even simple app execution, for example, can fail but there is no information in the logs to indicate that there is actually a problem. I suspect that errors are occurring and being masked/swallowed somehow.

pkit commented 10 years ago

We need some examples. If the pipeline is configured correctly, all uncaught exceptions are returned as HTTP 500 with the stack trace as the body. For caught ones, there is usually either a log entry or a specific error returned to the user. Failures in the applications themselves can be traced using X-Nexe-Retcode, for example. There are also limits on how far the platform can intervene in the execution flow: if your app wrote something to stderr and died, and stderr was not connected to any channel, that output is lost and you will never know about it.
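As a rough illustration of tracing an application failure through X-Nexe-Retcode, here is a minimal Python sketch. It assumes the header carries one integer return code, or several separated by commas; that format is an assumption, not something confirmed in this thread. The helper name `nexe_retcodes` is hypothetical.

```python
def nexe_retcodes(headers):
    """Return the integer codes found in X-Nexe-Retcode, or [] if absent.

    Assumption: the value is one integer or a comma-separated list of them.
    """
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "x-nexe-retcode":
            return [int(part) for part in value.split(",") if part.strip()]
    return []


# Made-up response headers, for illustration only.
codes = nexe_retcodes({"Content-Type": "text/plain", "X-Nexe-Retcode": "0,1"})
failed = [code for code in codes if code != 0]
print("non-zero return codes:", failed)
```

An empty result means the header was not present at all, which is worth distinguishing from a return code of zero.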

larsbutler commented 10 years ago

Yeah, I don't have any specific examples, unfortunately. At the very least, we could have better internal logging (with more INFO and DEBUG level statements).

As for reporting errors, I was using Devstack with the latest ZeroCloud code the other day, and couldn't get any output from a zapp. It was just a simple hello world, but it just gave no output and returned a 204. I've seen users report this on the list before, and it's really hard to troubleshoot, even if you have access to the Devstack and system logs (no errors are reported).

pkit commented 10 years ago

You need to look at the X-Nexe-* headers to get some understanding. Unfortunately, HTTP does not have separate in-band and out-of-band channels, so the only way to get out-of-band information is to look at the headers. For example, X-Nexe-Cdr clearly shows whether your program ever wrote anything to any channel.
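One way to pull the out-of-band information out of a response is to filter its headers down to the X-Nexe-* ones. The sketch below works on a plain dictionary of headers, so it does not depend on any particular HTTP client. The helper name `nexe_headers` and the sample values are hypothetical.

```python
def nexe_headers(headers):
    """Return only the X-Nexe-* headers, keyed by lowercase name."""
    return {
        name.lower(): value
        for name, value in headers.items()
        if name.lower().startswith("x-nexe-")
    }


# Made-up response headers; the X-Nexe-Cdr value is a placeholder, not a real format.
response_headers = {
    "Content-Type": "text/html",
    "X-Nexe-Retcode": "0",
    "X-Nexe-Cdr": "<placeholder>",
}
for name, value in sorted(nexe_headers(response_headers).items()):
    print(f"{name}: {value}")
```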

larsbutler commented 10 years ago

Yeah, I did look at the headers in the response; there wasn't anything like that. All I got was a 204 and some very basic headers. I'll reproduce the issue and post header details here.

larsbutler commented 10 years ago

(So that means that it could be a configuration error, but I don't know because there are no error messages from Devstack or ZeroCloud.)

pkit commented 10 years ago

If you don't have the Zerocloud proxy middleware in place, you will get a 204 and success for any POST you do. Vanilla Swift simply ignores all headers it does not know, and because there were no other headers, it will not complain that anything is wrong. I don't think we can solve that one easily...
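On the client side, the symptom described here (a 204 with no X-Nexe-* headers at all) can at least be flagged. The following is a heuristic sketch only: the function name `middleware_probably_missing` is hypothetical, and a positive result suggests, but does not prove, that the request never reached the middleware.

```python
def middleware_probably_missing(status, headers):
    """Heuristic: a 204 with no X-Nexe-* headers hints at a plain Swift pipeline."""
    has_nexe = any(name.lower().startswith("x-nexe-") for name in headers)
    return status == 204 and not has_nexe


# Made-up example of the "204 and some very basic headers" case.
if middleware_probably_missing(204, {"Content-Length": "0"}):
    print("No X-Nexe-* headers on a 204: check the proxy pipeline configuration.")
```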

larsbutler commented 10 years ago

Ah, okay, I'll double check my config. I'm going by my own instructions in Hacking.md, so that may need some work.

larsbutler commented 10 years ago

So the particular issue I had was a misconfiguration. But the core issue still stands: we need better logging and debugging information.

pkit commented 10 years ago

Quite a lot of that was fixed in recent patches, where 5xx errors were introduced in unambiguous cases and client HTTP codes were propagated to the final response codes. What's left is maybe a guideline for app developers on how to catch app errors better.