mailgun / kafka-pixy

gRPC/REST proxy for Kafka
Apache License 2.0

Add support for Kafka record headers #144

Closed: evan-stripe closed this 6 years ago

evan-stripe commented 6 years ago

This adds support for specifying and receiving Kafka record headers (added in Kafka 0.11) in both the HTTP and gRPC interfaces. For gRPC, headers are simply added as an additional field on the ProdRq and ConsRs structs - easy. For consuming over HTTP, they are likewise exposed as an additional field in the response - also easy.

For producing over HTTP, the HTTP server looks for any HTTP header with the prefix "X-Kafka-". It strips that prefix to get the Kafka record header key and treats the HTTP header value as a base64-encoded record header value.
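(For illustration, a minimal sketch of what producing with a record header could look like from a client's point of view under this convention. It assumes a kafka-pixy instance on its default HTTP port 19092 and the usual POST /topics/<topic>/messages produce endpoint; the topic name, header name, and the ?sync flag are illustrative, not part of this PR.)

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/http"
	"strings"
)

func main() {
	// Produce "hello world" to my-topic via the kafka-pixy HTTP interface.
	req, err := http.NewRequest(
		"POST",
		"http://localhost:19092/topics/my-topic/messages?sync",
		strings.NewReader("hello world"),
	)
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "text/plain")

	// Kafka record header "trace-id": the "X-Kafka-" prefix is stripped by
	// the proxy, and the value is passed base64-encoded per the convention
	// described above.
	req.Header.Set("X-Kafka-trace-id",
		base64.StdEncoding.EncodeToString([]byte("abc123")))

	rs, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer rs.Body.Close()
	fmt.Println("status:", rs.Status)
}
```

Base64-encoding the value keeps arbitrary binary header values representable within the ASCII constraints of HTTP header values.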

I'm open to feedback on the interfaces here - a few things that I rejected but I could be wrong about:

Record headers are only supported on Kafka version 0.11 or later, so if the Version config variable isn't set to something sufficiently new, Sarama will silently ignore headers. It might be desirable to return an error if the cluster configuration doesn't support producing messages with headers, but it doesn't seem like that's currently threaded through producer.T or consumer.T, so I'm not sure how to detect that case.

mailgun-ci commented 6 years ago

Can one of the admins verify this patch?

evan-stripe commented 6 years ago

Hmm. I'm having a hard time figuring out why the tests keep failing, but it looks like timeouts talking to ZooKeeper rather than anything I've changed. Not sure what to do about that.

horkhe commented 6 years ago

~~Instead of passing Kafka headers as HTTP headers in the produce API I would rather make them a header parameter.~~ Scratch that. It does make sense to pass headers as... well, headers.

horkhe commented 6 years ago

There is no consume-with-headers test in service_http_test.go. Preferably, the tests in service_http_test.go and service_grpc_test.go should be kept similar.

horkhe commented 6 years ago

I would prefer the produce API to fail if a header is passed but the cluster does not support headers. I think we can add a check here and here and return an error.
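(For context, a rough sketch of the kind of version gate being discussed here, assuming access to the producer's sarama.Config; the helper name and wiring are illustrative and not the exact spots referred to by "here and here".)

```go
package main

import (
	"fmt"

	"github.com/Shopify/sarama"
)

// checkHeadersSupported is a hypothetical helper: Kafka record headers
// require protocol version 0.11.0.0 or newer, so if a request carries
// headers but the configured version is older, fail fast instead of
// letting Sarama silently drop the headers.
func checkHeadersSupported(cfg *sarama.Config, hasHeaders bool) error {
	if hasHeaders && !cfg.Version.IsAtLeast(sarama.V0_11_0_0) {
		return fmt.Errorf("headers are not supported with this version of Kafka")
	}
	return nil
}

func main() {
	cfg := sarama.NewConfig()
	cfg.Version = sarama.V0_10_2_0 // too old for record headers
	fmt.Println(checkHeadersSupported(cfg, true))
}
```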

horkhe commented 6 years ago

@evan-stripe overall I like it. Just a few minor and easily fixable comments. Thank you for your contribution! @thrawn01 do you have any comments?

evan-stripe commented 6 years ago

Thanks for the feedback! I think I've incorporated all of your requests into the latest version of the patch, but let me know if I've missed anything.

horkhe commented 6 years ago

Two very minor comments; otherwise I am ready to merge. But before doing that I would like another pair of eyes, primarily on the HTTP interface - the gRPC side is solid. @thrawn01, please take a look.

horkhe commented 6 years ago

Oh, I almost forgot: if you could update the README and CHANGELOG, that would be great. But let's do that after @thrawn01 gives his go-ahead on the changes.

horkhe commented 6 years ago

Maybe we should say that headers are expected to be UTF-8 strings, and if a use case demands binary then it is the client's responsibility to base64-encode or whatever? Just thinking out loud; I am not against base64 encoding.

evan-stripe commented 6 years ago

I think Sarama may actually be wrong to expose the header name as a []byte instead of a string. Both the Java API and the KIP use a string: https://kafka.apache.org/11/javadoc/org/apache/kafka/common/header/Header.html, and Kafka protocol strings are decoded as UTF-8.

So, I think we can definitely get away with using strings in the HTTP publish interface. I'm now wondering whether we should change the gRPC interfaces and the HTTP consumer interface to use strings as well. Thoughts?

horkhe commented 6 years ago

Given what you said about the Java implementation, I vote for headers to be strings all the way through.

evan-stripe commented 6 years ago

Alright, updated everything to be strings.

thrawn01 commented 6 years ago

This is really awesome! I tested it out and added support for it in the CLI (see #147).

I ran into some usability issues, though, that might throw some people off.

Neither of these is a blocking issue, and they probably aren't anything we would fix anyway.

evan-stripe commented 6 years ago

Hmm. Given that values are arbitrary byte arrays (not necessarily UTF-8), I'm not entirely sure how to make the types more ergonomic here; I think you have to base64-encode by default.
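(To illustrate the decode-on-consume side of that, a sketch of reading a message with headers over the HTTP interface, assuming a kafka-pixy instance on localhost:19092 and the usual GET /topics/<topic>/messages?group=<group> endpoint. The JSON field names for the headers list are assumptions made for this sketch, not taken from the PR.)

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"net/http"
)

// consumedMessage is a hypothetical view of the HTTP consume response;
// the "headers" field shape here is assumed for illustration only.
type consumedMessage struct {
	Value   string `json:"value"` // base64-encoded message body
	Headers []struct {
		Key   string `json:"key"`
		Value string `json:"value"` // base64-encoded header value
	} `json:"headers"`
}

func main() {
	rs, err := http.Get("http://localhost:19092/topics/my-topic/messages?group=my-group")
	if err != nil {
		panic(err)
	}
	defer rs.Body.Close()

	var msg consumedMessage
	if err := json.NewDecoder(rs.Body).Decode(&msg); err != nil {
		panic(err)
	}
	for _, h := range msg.Headers {
		// Header keys are plain strings; values are arbitrary bytes and so
		// arrive base64-encoded.
		v, err := base64.StdEncoding.DecodeString(h.Value)
		if err != nil {
			panic(err)
		}
		fmt.Printf("header %s = %q\n", h.Key, v)
	}
}
```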

Totally agreed on the ergonomics of the version number, though - I ran into this when I was originally developing this feature. We hadn't bumped the version setting from the default, and the patch we're running internally didn't have the error message, and it was really challenging to debug. (It'd be super nice if Sarama actually used the ApiVersions API to negotiate the protocol versions, instead of requiring them to be hardcoded.)

There is a version number option that you can put in the config file, though. One option would be to make the error message more verbose; maybe something like "headers are not supported with this version of kafka; update proxies.kafka.version in your config file or set $KAFKA_VERSION", but that's pretty wordy for a Go error message.

horkhe commented 6 years ago

Just wanted to comment but pressed the close button by accident, sorry... I told myself again and again do not work until you have your morning coffee...

mailgun-ci commented 6 years ago

Can one of the admins verify this patch?

horkhe commented 6 years ago

Please rebase and I will merge.

horkhe commented 6 years ago

...and now that we have agreed on the API, please update the README and CHANGELOG.

evan-stripe commented 6 years ago

Alright, I took a stab at updating the README and the CHANGELOG.

horkhe commented 6 years ago

@evan-stripe looks like a test needs to be fixed:

```
service_grpc_test.go:237:
    c.Check(grpcStatus.Message(), Equals, "headers are not supported with this version of Kafka")
... obtained string = "headers are not supported with this version of Kafka. Consider changing `kafka.version` (https://github.com/mailgun/kafka-pixy/blob/master/default.yaml#L35)"
... expected string = "headers are not supported with this version of Kafka"
```

evan-stripe commented 6 years ago

Bah that's what I get for making the error messages better 😛

Fix coming momentarily...

evan-stripe commented 6 years ago

:tada:

Thanks for all the feedback and help!