Open larsbutler opened 10 years ago
Need to have some examples. If the pipeline is configured correctly, all uncaught exceptions are returned as HTTP 500 with the stack trace as the body. For caught ones there is usually either a log entry or a specific error returned to the user. Failures in the applications themselves can be traced using X-Nexe-Retcode, for example. There are also limits on how far the platform can intervene in the execution flow: if your app wrote something to stderr and died, and stderr was not connected to any channel, that output is lost and you will never know about it.
Yeah, I don't have any specific examples, unfortunately. At the very least, we could have better internal logging (with more INFO and DEBUG level statements).
As for reporting errors, I was using Devstack with the latest ZeroCloud code the other day, and couldn't get any output from a zapp. It was just a simple hello world, but it just gave no output and returned a 204. I've seen users report this on the list before, and it's really hard to troubleshoot, even if you have access to the Devstack and system logs (no errors are reported).
You need to look into the X-Nexe-* headers to get some understanding.
Unfortunately, HTTP does not have separate in-band and out-of-band channels, so the only way to get out-of-band information is to look at the headers. For example, X-Nexe-Cdr clearly shows whether your program ever wrote anything to any channel.
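A minimal sketch of pulling those headers out of a response for inspection. It assumes only that the diagnostic headers share the X-Nexe- prefix mentioned above; the helper name `nexe_headers` and the sample values are made up for illustration, and nothing here tries to interpret the header contents.

```python
def nexe_headers(headers):
    """Return only the X-Nexe-* headers, keyed by lowercase name.

    `headers` is any iterable of (name, value) pairs, for example the
    list returned by http.client.HTTPResponse.getheaders().
    """
    prefix = "x-nexe-"
    return {
        name.lower(): value
        for name, value in headers
        if name.lower().startswith(prefix)
    }


if __name__ == "__main__":
    # Hypothetical response headers; the values are placeholders only.
    sample = [
        ("Content-Length", "0"),
        ("X-Nexe-Retcode", "0"),
        ("X-Nexe-Cdr", "placeholder"),
    ]
    for name, value in sorted(nexe_headers(sample).items()):
        print(f"{name}: {value}")
```

If the returned dict is empty, the response carried no X-Nexe-* headers at all, which is itself a useful thing to know.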
Yeah, I did look at the headers in the response; there wasn't anything like that. All I got was a 204 and some very basic headers. I'll reproduce the issue and post header details here.
(So that means that it could be a configuration error, but I don't know because there are no error messages from Devstack or ZeroCloud.)
If you don't have the ZeroCloud proxy middleware in place, you will get a 204 and success for any POST you do. Vanilla Swift simply ignores all headers it does not know, and since there are no other headers in the request, it has nothing to complain about. I don't think we can solve that one easily...
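Since the server side cannot easily flag this, a client-side check is one option. The sketch below is only a heuristic based on the symptom described in this thread (a 204 with no X-Nexe-* headers); the function name is hypothetical, and a positive result suggests a pipeline problem rather than proving one.

```python
def looks_like_missing_middleware(status, headers):
    """Heuristic check for the symptom discussed in this thread.

    Returns True when the response is a 204 and carries no X-Nexe-*
    headers, which suggests the POST was handled by plain Swift and
    never reached the ZeroCloud middleware.

    `headers` is an iterable of (name, value) pairs.
    """
    has_nexe = any(name.lower().startswith("x-nexe-") for name, _ in headers)
    return status == 204 and not has_nexe


if __name__ == "__main__":
    # Hypothetical response resembling the one reported above.
    basic = [("Content-Length", "0"), ("Content-Type", "text/html")]
    if looks_like_missing_middleware(204, basic):
        print("No X-Nexe-* headers on a 204: check the proxy pipeline config")
```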
Ah, okay, I'll double check my config. I'm going by my own instructions in Hacking.md, so that may need some work.
So the particular issue I had was a misconfiguration. But the core issue still stands: we need better logging and debugging information.
Quite a lot of that was fixed in recent patches, where 5xx errors were introduced for unambiguous cases and client HTTP codes were propagated to the final response codes. What's left is maybe a guideline for app developers on how to catch app errors better.
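As a starting point for such a guideline, here is a minimal sketch of an app-side wrapper. It assumes the app's stdout is connected to a channel (so the traceback is not lost the way unconnected stderr output would be) and that a nonzero exit code is what you would then look for in X-Nexe-Retcode; both are assumptions, and `run_reporting_errors` is a hypothetical helper, not part of ZeroCloud.

```python
import sys
import traceback


def run_reporting_errors(main, out=None):
    """Call main() and report any uncaught exception on `out`.

    Writes the traceback to `out` (defaults to sys.stdout) instead of
    stderr, then returns 1. Returns 0 when main() finishes normally.
    """
    if out is None:
        out = sys.stdout
    try:
        main()
    except Exception:
        # Send the traceback to a stream that is assumed to be connected.
        out.write(traceback.format_exc())
        out.flush()
        return 1
    return 0


def app_main():
    # Placeholder for the real application logic.
    print("hello world")


if __name__ == "__main__":
    sys.exit(run_reporting_errors(app_main))
```

The wrapper deliberately catches only `Exception`, so `SystemExit` and `KeyboardInterrupt` still propagate as usual.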
When working with local deployments of ZeroCloud (on Devstack), it's difficult to troubleshoot issues. Even simple app execution, for example, can fail but there is no information in the logs to indicate that there is actually a problem. I suspect that errors are occurring and being masked/swallowed somehow.