line / decaton

High throughput asynchronous task processing on Apache Kafka
Apache License 2.0
336 stars 51 forks source link

Retry fails if a non-ASCII string is used for the Kafka Record Key #126

Closed deftfitf closed 2 years ago

deftfitf commented 3 years ago

When I was verifying the operation using the retry mechanism of Decaton, I found a phenomenon that the retry was not performed in the process that expected the retry.

When sending a normal Decaton Task, we use org.apache.kafka.common.serialization.LongSerializer to serialize a long value and use it as a key for Kafka Record.

However, when the retry task is sent by com.linecorp.decaton.client.internal.DecatonTaskProducer, the long type byte string is converted to String once when retrying, and com.linecorp.decaton.client.kafka.PrintableAsciiStringSerializer Will be serialized into a byte string.

Here, if the long value is converted to a byte string and the character is extracted again, the following exception will occur because it is not an ASCII character, and the task will not be sent to the retry topic.

ERROR c.l.d.p.runtime.internal.ProcessPipeline:121 - Uncaught exception thrown by processor 
...
org.apache.kafka.common.errors.SerializationException: illegal character: ^ @

Limiting the value used for key to an ASCII string when retrying seems like an unexpected behavior.

ocadaruma commented 3 years ago

Hi, thanks for reporting the issue.

Yeah as you noticed, currently Decaton supports only String key for consumer records. Though Decaton partially works somehow even with non-String keys to just processing tasks, there are several places that assuming keys are serialized with ascii-printable String.

Supporting only ascii-printable String was a design-decision that made in early days of Decaton that all tasks are assumed to be produced by DecatonClient.

However, now Decaton is kind of "general consumer framework" as it supports to consume messages in arbitrary-format, we think we should make Decaton to support also non-String keys.

FYI There is a in-progress effort to add non-String key support for Decaton (https://github.com/line/decaton/pull/61).

deftfitf commented 3 years ago

Thank you for your confirmation.

I have understood that there are early designs that support ASCII character keys.

By migrating the Kafka Topic key to an ASCII string, our problem has been resolved. I think it would be very useful to support keys other than ASCII strings in the future, so I will watch #61.