Human readable protocol is hard to parse programmatically on error

majelbstoat commented 11 years ago

Hi,

The protocol for bloomd, while very easy to read (for an English-speaker ;) ), makes it hard to program against for some reasonable use cases.

A concrete example:

I'm implementing a "set this key, and automatically create the filter if it doesn't exist, then retry" method for node-bloomd. This will important for our use case, and for incrementally creating bloom filters on-demand instead of provisioning them all up front.

The response I get back from the server, when the filter doesn't exist is "Filter does not exist". String matching against this is flaky, and does not fill me with confidence :) What if you changed it to "Filter doesn't exist"? :)

Another example: it is hard(er than it should be) to differentiate a bulk/multi response which is a single line of words, from an error, which is also a single line of words.

I'd like to propose that errors come back with a distinct first character, an error code, and then the human readable part. For example:

"_42: Filter does not exist" "_17: Filter is being deleted"

This would balance the needs of being able to read the response (which is very helpful in debugging), with the ability to respond programmatically to error cases where appropriate.

I don't care about the specific syntax really, only that it be reasonably parseable in a sane manner :)

Thanks for your consideration!

Cheers,

Jamie.

armon commented 11 years ago

Hey Jamie,

I don't think that parsing the protocol is very difficult for machines. Differentiating between a response and error is pretty simple.

For the first case, I'm not sure how "Filter does not exist" is more difficult to parse than "Filter doesn't exist"? Both are effectively the same.

In the case of bulk/multi vs error, if you check for the "Yes" or "No" prefix, this always works. No error starts with either of those.

I understand the desire to make the protocol faster or more machine readable, but unfortunately all these changes would break client compatibility without dramatically increasing parse speed.

I think we should definitely keep the error prefix in mind if we design a version 2 of the
protocol, but for the current version I'd like to maintain backwards compatibility.

Please let me know if this all makes sense!

Best Regards,

Armon Dadgar

On Friday, July 26, 2013 at 2:47 PM, Jamie Talbot wrote:

Hi,
The protocol for bloomd, while very easy to read (for an English-speaker ;) ), makes it hard to program against for some reasonable use cases. A concrete example: I'm implementing a "set this key, and automatically create the filter if it doesn't exist, then retry" method for node-bloomd. This will important for our use case, and for incrementally creating bloom filters on-demand instead of provisioning them all up front. The response I get back from the server, when the filter doesn't exist is "Filter does not exist". String matching against this is flaky, and does not fill me with confidence :) What if you changed it to "Filter doesn't exist"? :) Another example: it is hard(er than it should be) to differentiate a bulk/multi response which is a single line of words, from an error, which is also a single line of words. I'd like to propose that errors come back with a distinct first character, an error code, and then the human readable part. For example: "_42: Filter does not exist" "_17: Filter is being deleted"
This would balance the needs of being able to read the response (which is very helpful in debugging), with the ability to respond programmatically to error cases where appropriate. I don't care about the specific syntax really, only that it be reasonably parseable in a sane manner :) Thanks for your consideration! Cheers, Jamie.

— Reply to this email directly or view it on GitHub (https://github.com/armon/bloomd/issues/20).

majelbstoat commented 11 years ago

"Filter doesn't exist" isn't harder to parse than "Filter does not exist"; I was just making the point that if you, as the developer, decided to change that error message at any point (which would be reasonable if you considered, like I did, that it was just a human readable message), functionality which relied on parsing that exact string would break. If you consider the error messages to be a strict part of the protocol and can guarantee that they won't change for any reason (until an explicit protocol change), it's less of a problem.

A concrete example: In the case of a "create" closely following a "drop", if I get the response "Filter is being deleted", I want to retry my creation until it succeeds. Parsing for that entire error string is inelegant :)

Finally, I guess my point is that because the error messages aren't fully enumerated in the documentation, it is hard to write a generalised error handler. And, in order to write smarter clients, we need to be able to easily understand what is coming back, so we can respond appropriately.

Explicitly enumerating all the possible error messages in the documentation would be useful.

I absolutely understand your desire not to change the protocol ad hoc. But, if you do consider a protocol upgrade at any point, please take this into account :)

Thanks for engaging!

Best,

Jamie.

armon commented 11 years ago

I see what you are saying. I currently consider all the error messages to be part of the protocol (although not thoroughly documented), and they will not change unless a new protocol version is introduced.

I will try to update the documentation soon and fully document the possible errors. If you want to look at them now, you can see all the possible responses here: https://github.com/armon/bloomd/blob/master/src/bloomd/handler_constants.c

In regards to the "Filter is being deleted", there is no good way unfortunately other than retrying the create. This has to do with the internal concurrency model. The "drop" command does not block, and as a side effect the filter may exist internally still (although it is not visible).

I will certainly take into account the error parsing for a new version as well. Thanks for reaching out!

Best Regards,

Armon Dadgar

On Saturday, July 27, 2013 at 3:23 PM, Jamie Talbot wrote:

"Filter doesn't exist" isn't harder to parse than "Filter does not exist"; I was just making the point that if you, as the developer, decided to change that error message at any point (which would be reasonable if you considered, like I did, that it was just a human readable message), functionality which relied on parsing that exact string would break. If you consider the error messages to be a strict part of the protocol and can guarantee that they won't change for any reason (until an explicit protocol change), it's less of a problem. A concrete example: In the case of a "create" closely following a "drop", if I get the response "Filter is being deleted", I want to retry my creation until it succeeds. Parsing for that entire error string is inelegant :) Finally, I guess my point is that because the error messages aren't fully enumerated in the documentation, it is hard to write a generalised error handler. And, in order to write smarter clients, we need to be able to easily understand what is coming back, so we can respond appropriately. Explicitly enumerating all the possible error messages in the documentation would be useful. I absolutely understand your desire not to change the protocol ad hoc. But, if you do consider a protocol upgrade at any point, please take this into account :) Thanks for engaging! Best, Jamie.

— Reply to this email directly or view it on GitHub (https://github.com/armon/bloomd/issues/20#issuecomment-21673613).

majelbstoat commented 11 years ago

Thanks for that :) handler_constants is a helpful place to start!

armon / bloomd

Human readable protocol is hard to parse programmatically on error #20