confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io

Dereference operator (->) with a null pointer #7185

Closed rwilliams-r7 closed 2 years ago

rwilliams-r7 commented 3 years ago

I have not been able to find an answer to my question, so I am posting it here.

I am using the EXPLODE function and the dereference operator together. I should start by saying that nothing breaks, but I am wondering whether my understanding of how KSQL should be used is incorrect.

Example:

EXPLODE(DATA)->game->score->result
EXPLODE(DATA)->refs->cards

Any of these fields can be null. What I see is a log line reporting a null pointer, for instance when the score is null, while the stream still processes the other parts correctly. How should I deal with this better, or is this expected? My logs fill up, and it is hard to see what is going on when we do have a problem.

vcrfxia commented 3 years ago

Hi @rwilliams-r7, as you've guessed, the behavior you've reported is expected: ksqlDB will log deserialization warnings for records that cannot be properly deserialized, such as when a nested struct is null and therefore cannot be dereferenced.

In order to reduce the volume of this logging you could:

rwilliams-r7 commented 3 years ago

@vcrfxia Thank you for your ideas. The message seems to be at the ERROR level:

ERROR {"type":1,"deserializationError":null,"recordProcessingError":{"errorMessage":"Error computing

So I am unable to increase the log level. I looked into removing the nulls, but if I remove the field so that it does not exist, the error is still raised because it is looking for the child index.

With a Kafka and KSQL cluster that has high throughput and large data sets, this will be a problem: the error stream will fill with these null pointer events, which are not real errors, and that in turn will put the Kafka cluster under more strain.

vcrfxia commented 3 years ago

The example error you pasted appears to be for the ksqlDB processing log. Are you saying you're also seeing that in the main server's log file, or is your question specifically about disabling logging for certain types of errors in the ksqlDB processing log?

rwilliams-r7 commented 3 years ago

Yes, I am seeing it in the main log file as well. My question is more about why this is counted as an error. It looks like the dereference operator throws a null pointer exception when I do not think it should. Take EXPLODE(bedrooms)->drawers->tshirt->colour as an example: any of these fields could be null, and when one is, it ends up in the main logs as an error.
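To make the requested behaviour concrete, here is a minimal sketch in Python (not ksqlDB syntax, and not how ksqlDB is implemented) of a dereference chain that yields null instead of raising when an intermediate struct is null. The helper names `deref` and `explode_deref` and the sample `bedrooms` data are hypothetical and only mirror the example expression above.

```python
from collections.abc import Mapping
from typing import Any, Optional, Sequence


def deref(value: Any, *path: str) -> Any:
    """Follow `path` through nested mappings.

    Returns None as soon as a step is null, missing, or not a struct,
    instead of raising an exception.
    """
    current = value
    for key in path:
        if not isinstance(current, Mapping):
            return None
        current = current.get(key)
    return current


def explode_deref(rows: Optional[Sequence[Any]], *path: str) -> list:
    """Hypothetical stand-in for EXPLODE(rows)->a->b: one result per element."""
    if rows is None:
        return []
    return [deref(row, *path) for row in rows]


# Hypothetical input: nulls at different depths of the struct.
bedrooms = [
    {"drawers": {"tshirt": {"colour": "red"}}},
    {"drawers": {"tshirt": None}},
    {"drawers": None},
    None,
]

colours = explode_deref(bedrooms, "drawers", "tshirt", "colour")
print(colours)  # ['red', None, None, None]
```

With these semantics a null at any depth produces a null output value for that row and nothing is logged as an error, which is the outcome being asked for here.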

colinhicks commented 2 years ago

Closing in favor of: https://github.com/confluentinc/ksql/issues/8917, which was addressed in the 0.26.0 release.