Open miretskiy opened 8 months ago
How can I contribute to it?
Hi @vkstack! Thanks for your interest in contributing.
Do you have the crdb repo set up already? If not, this wiki is a nice place to start. Specifically, the following pages would be useful:
1. https://cockroachlabs.atlassian.net/wiki/spaces/CRDB/pages/73204103/Building+from+source+on+macOS
2. https://cockroachlabs.atlassian.net/wiki/spaces/CRDB/pages/181338446/Getting+and+building+CockroachDB+from+source
3. https://cockroachlabs.atlassian.net/wiki/spaces/CRDB/pages/2221703221/Developing+with+Bazel
Let me know if any of the links above are not public. After getting the repo set up, you should be able to run ./dev build without any errors.
I believe the issue above is referring to the limited configuration support we have in https://github.com/cockroachdb/cockroach/blob/66f33cbcbed38df0fa95c5f17176712526f7f79a/pkg/ccl/changefeedccl/sink_kafka.go#L185-L203. Note that we support the Compression option but not CompressionLevel.
Sarama supports the CompressionLevel option here https://github.com/IBM/sarama/blob/25c9c1a880e385781e1a39b49f8e7239e3d5e729/config.go#L188-L194. You should be able to add this option to kafka_sink_config in https://github.com/cockroachdb/cockroach/blob/66f33cbcbed38df0fa95c5f17176712526f7f79a/pkg/ccl/changefeedccl/sink_kafka.go#L185-L203. After adding it, this will get populated to the sarama config we use in https://github.com/cockroachdb/cockroach/blob/66f33cbcbed38df0fa95c5f17176712526f7f79a/pkg/ccl/changefeedccl/sink_kafka.go#L1150-L1152 (this part of the work is already completed and shouldn't require any additional work from you).
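To make the suggestion above a bit more concrete, here is a minimal sketch of what accepting a CompressionLevel key next to Compression could look like. It is not the actual code in sink_kafka.go: the sinkConfig struct, parseSinkConfig, and effectiveLevel are hypothetical names made up for illustration, and the sentinel constant is an assumption that mirrors what I believe sarama uses for "codec default". The real change would go through the existing config struct and its validation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// compressionLevelDefault is a sentinel meaning "let the codec pick its
// default level". Assumption: chosen to mirror sarama's default sentinel.
const compressionLevelDefault = -1000

// sinkConfig is a hypothetical, trimmed-down stand-in for the struct that
// kafka_sink_config is decoded into. Only the two relevant fields are shown.
type sinkConfig struct {
	Compression string `json:"Compression,omitempty"`
	// A pointer lets us tell "not specified" apart from an explicit 0.
	CompressionLevel *int `json:"CompressionLevel,omitempty"`
}

// parseSinkConfig decodes the JSON passed via kafka_sink_config and rejects
// unknown keys, so a typo such as "Compresion" is reported instead of ignored.
func parseSinkConfig(raw string) (sinkConfig, error) {
	var cfg sinkConfig
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&cfg); err != nil {
		return sinkConfig{}, fmt.Errorf("invalid kafka_sink_config: %w", err)
	}
	return cfg, nil
}

// effectiveLevel returns the level to copy into the producer config,
// falling back to the sentinel when the user did not set one.
func effectiveLevel(cfg sinkConfig) int {
	if cfg.CompressionLevel == nil {
		return compressionLevelDefault
	}
	return *cfg.CompressionLevel
}
```

With this sketch, kafka_sink_config='{"Compression":"GZIP","CompressionLevel":9}' would yield level 9, while omitting CompressionLevel keeps the codec default.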
Let me know if anything above is unclear or if you encounter any issues along the way!
cc @cockroachdb/cdc
@wenyihu6 Hello, I'm the one who contacted you yesterday in slack, thanks!
Hi @wenyihu6 . I want to contribute to this issue as well. Can you assign it to me? I am getting started...
P.S: Thanks for the instructions
@pvinoda Hello, I already took this issue, you can find another issue, thanks!
Is the issue resolved?
Hey @miretskiy, I just happened to look into this issue. Can you please explain which compression configs you are pointing to? Are we talking about the compression level, as @wenyihu6 suggested, or a different way or location to specify the compression type that the current parser misses? Going through the documentation, I can only find this way: kafka_sink_config='{"Compression": "GZIP" }'
The top-level compression option refers to the compression option listed in the options table in the documentation. Today, the kafka sink only supports expressing compression through the kafka sink config, unlike cloud storage sinks, which support compression through the top-level option.
So today we support creating a changefeed like:
CREATE CHANGEFEED FOR TABLE tbl INTO 'kafka://...' WITH kafka_sink_config='{"Compression":"GZIP"}';
We would also like to support:
CREATE CHANGEFEED FOR TABLE tbl INTO 'kafka://...' WITH compression='gzip';
Leveraging sarama's compression levels would also be interesting, but is probably not what this ticket was originally created for given that it has been tagged as an easy good first issue.
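As a rough illustration of how the two ways of specifying compression could be reconciled, here is a minimal sketch. The function name resolveCompression and the conflict rule (error out when both are set and disagree) are assumptions for illustration only, not the behavior of the existing changefeed code; the real fix might prefer one source over the other or only emit a hint.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveCompression is a hypothetical helper that merges the top-level
// compression option (e.g. WITH compression='gzip') with the value from
// kafka_sink_config (e.g. {"Compression":"GZIP"}). Names are compared
// case-insensitively and returned upper-cased.
func resolveCompression(topLevel, fromSinkConfig string) (string, error) {
	top := strings.ToUpper(strings.TrimSpace(topLevel))
	cfg := strings.ToUpper(strings.TrimSpace(fromSinkConfig))
	switch {
	case top == "":
		// Only kafka_sink_config (or nothing) was specified.
		return cfg, nil
	case cfg == "" || cfg == top:
		// Only the top-level option was specified, or both agree.
		return top, nil
	default:
		// Assumed policy: refuse ambiguous input rather than pick a winner.
		return "", fmt.Errorf(
			"compression option %q conflicts with kafka_sink_config Compression %q",
			topLevel, fromSinkConfig)
	}
}
```

Under these assumptions, WITH compression='gzip' alone resolves to GZIP, the existing kafka_sink_config form keeps working unchanged, and a mismatch between the two produces an error instead of silently choosing one.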
The kafka sink supports the "Compression" option in kafka_sink_config, but rejects the top-level "compression" option. This is confusing.
At the very least, provide a nice hint if the top-level compression option is specified for kafka; better yet, just make it work with either kafka_sink_config={Compression ...} or the top-level compression option.
Jira issue: CRDB-33022
Epic CRDB-39570