nats-io / stan.go

NATS Streaming System
https://nats.io
Apache License 2.0
706 stars 117 forks source link

Real error messages for stan.Connect() function #143

Open erikbgithub opened 7 years ago

erikbgithub commented 7 years ago

As far as I can see with my limited go skills, the stan.Connect() just reraises errors from its underlying call, instead of reinterpreting them according to its context.

It would be nice if it could give more interesting error messages to the users, e.g. if it receives a INFO block from the server, it can evaluate things like server version instead of just raising a timeout error. Or if it can't read it, it can raise an error like got INFO block but can't read it: <info output printed>.

This can save hours to users trying to debug things, since for instance on a timeout error one first expects that there's a network or DNS problem, and not that server and client just have different versions and therefore can't talk to each other.

kozlovic commented 7 years ago

The ErrConnectReqTimeout that you may receive from stan.Connect() is generally that you provide a cluster name that is different than the cluster name used by your server. It could also simply mean that no server is running, so the connection request times out. So far, there is no versioning reason that would cause this error.

What's missing is some documentation ;-)

erikbgithub commented 7 years ago

@kozlovic The discussed problem here is that nats-streaming reraises errors from an underlying call instead of interpreting, handling and raising appropriately interpreted errors. This can't be fixed with documentation. ;-)

And if you really want to discuss the example by itself, okay, we can do that. The server is there, it responds with INFO {...} blocks to telnet requests, it produces logs. However that is just an example to show that the actual problem really is a problem.

So, please reopen the issue.

wallyqs commented 7 years ago

hi @erikbgithub, since the error at stan.Connect() is being returned there opaquely you can assert too for the behavior instead which is a recommended practice https://dave.cheney.net/2016/04/27/dont-just-check-errors-handle-them-gracefully

erikbgithub commented 7 years ago

@wallyqs Thanks for finding words that probably go people will understan.d The code should be handled gracefully, that's all I'm asking here. Maybe then it would be more obvious in the example and similar cases what the actual problem is.

kozlovic commented 7 years ago

@erikbgithub The code you are pointing to is when the Streaming library creates the low level NATS connection. The NATS Connect call can fail for various reasons: no server running, an authorization problem, maybe a malformed URL, etc.. At the Streaming library level, it is impossible to get more information that what is returned by NATS.

I am not sure why you are referring to INFO in a previous post. This is low level NATS protocol. Have you had a concrete issue that we could discuss and see how we could improve NATS or NATS Streaming to make it easier to the user to understand what the problem was?