spring-cloud / spring-cloud-stream

Framework for building Event-Driven Microservices
http://cloud.spring.io/spring-cloud-stream
Apache License 2.0

NPE when sending heterogeneously-typed Arrays via StreamBridge. #2934

Closed martyschaer closed 7 months ago

martyschaer commented 7 months ago

I believe this is a bug. If it's a misconfiguration, unsupported use, or the wrong Spring project, please let me know.

Versions

This issue popped up for us after upgrading from

To Reproduce

See the comment below.

Problem

Our model has an abstract class ActionBase with multiple subclasses (e.g. ActionCancellation, ActionSpecialStop). We want to send several of these, possibly of different subclasses, in a single message.

When attempting to send a Message<ActionBase[]> via StreamBridge (to the parseActionconvertAction-out-0 binding mentioned below), we encounter a NullPointerException in SimpleFunctionRegistry#isExtractPayload (L1179) (see attached stacktrace.txt).

The problem does not occur when sending a homogeneously-typed Array via StreamBridge (e.g. a Message<ActionBase[]> containing two ActionCancellations).

As far as I can tell, the method attempts to determine whether a Message contains multiple Messages. It tries to find the common type of the elements of the contained Collection/Array using CollectionUtils.findCommonElementType. This returns null, because the collection contains e.g. an ActionCancellation and ActionSpecialStop, yielding no common type. Class#isAssignableFrom then throws an NPE when given null.
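The failure mode described above can be reproduced without Spring on the classpath. The following is a minimal sketch, not the actual SCF code: findCommonElementType is a simplified stand-in for Spring's CollectionUtils.findCommonElementType (assumed to return null when element classes differ), Message is a hypothetical marker interface standing in for Spring's Message, and the Action classes are empty placeholders for the model named in this issue.

```java
import java.util.Collection;
import java.util.List;

class CommonTypeNpeDemo {

    // Hypothetical marker interface standing in for Spring's Message.
    interface Message {}

    // Empty placeholders for the model classes named in the issue.
    static abstract class ActionBase {}
    static class ActionCancellation extends ActionBase {}
    static class ActionSpecialStop extends ActionBase {}

    // Simplified stand-in for CollectionUtils.findCommonElementType:
    // returns the exact runtime class shared by all non-null elements,
    // or null if the classes differ or the collection is empty.
    static Class<?> findCommonElementType(Collection<?> collection) {
        Class<?> candidate = null;
        for (Object element : collection) {
            if (element == null) {
                continue;
            }
            if (candidate == null) {
                candidate = element.getClass();
            } else if (candidate != element.getClass()) {
                return null;
            }
        }
        return candidate;
    }

    // Same shape as the check quoted in the issue: no null guard,
    // so a null common type reaches Class#isAssignableFrom.
    static boolean unguardedCheck(Collection<?> payload) {
        return Message.class.isAssignableFrom(findCommonElementType(payload));
    }

    public static void main(String[] args) {
        List<ActionBase> homogeneous =
                List.of(new ActionCancellation(), new ActionCancellation());
        List<ActionBase> heterogeneous =
                List.of(new ActionCancellation(), new ActionSpecialStop());

        // Common type is ActionCancellation, which is not a Message.
        System.out.println("homogeneous: " + unguardedCheck(homogeneous));

        try {
            unguardedCheck(heterogeneous);
        } catch (NullPointerException e) {
            // No common type was found, so isAssignableFrom received null.
            System.out.println("heterogeneous: NullPointerException");
        }
    }
}
```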

Expected behaviour

It should be possible to send a Message containing a heterogeneously-typed array via StreamBridge. I suspect Message.class.isAssignableFrom(CollectionUtils.findCommonElementType((Collection<?>) payload)) in SimpleFunctionRegistry#isExtractPayload should yield false when there is no common element type.
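One way to picture the expected behaviour is a null guard in front of the assignability test. This is only an illustrative sketch and not necessarily how the fix in SCF is implemented: Message and TextMessage are hypothetical stand-ins, and commonType represents whatever findCommonElementType returned, including null.

```java
class NullSafeMessageCheck {

    // Hypothetical marker interface standing in for Spring's Message.
    interface Message {}

    // Hypothetical Message implementation, used only for the demo below.
    static class TextMessage implements Message {}

    // Returns false instead of throwing when no common element type exists.
    static boolean isMessageType(Class<?> commonType) {
        return commonType != null && Message.class.isAssignableFrom(commonType);
    }

    public static void main(String[] args) {
        System.out.println(isMessageType(null));              // no common type
        System.out.println(isMessageType(TextMessage.class)); // a Message type
        System.out.println(isMessageType(String.class));      // not a Message type
    }
}
```

With this guard, a heterogeneous payload is simply treated as not being a collection of Messages.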

Context

Some context around our use-case. We also receive Actions via Tibco-Rendezvous, transform them to a different model, then send them via Kafka. This works fine.

The binding is configured as:

spring.cloud.stream.bindings:
  parseActionconvertAction-in-0:
    destination: TIBRV.INPUT.SUBJECT
    contentType: application/x-techstack-message-tibrv
  parseActionconvertAction-out-0:
    destination: kafka-output-topic.action
    content-type: application/x-java-object;type=com.example.export.model.ActionBase
    producer:
      error-channel-enabled: true
    binder: kafka

The function definition is:

spring.cloud.stream.function.definition: "parseAction|convertAction"

where

@Bean
public Function<TibrvMsg, Collection<InternalActionBase>> parseAction() { ... }

@Bean
public Function<Collection<InternalActionBase>, Message<ActionBase[]>> convertAction() { ... }

The Kafka-Binder is then configured to use io.confluent.kafka.serializers.json.KafkaJsonSchemaSerializer.

Thank you for taking the time to read this :)

sobychacko commented 7 months ago

@martyschaer It sounds like this didn't use to happen before the upgrade to 4.1.1. If true, that must be a regression. Please create a minimal reproducible sample with as few external dependencies as possible so we can triage the issue. Thanks!

martyschaer commented 7 months ago

@sobychacko thanks for the quick response! I've built a reproducer here.

The test class dev.schaer.reproducer_2934.infra.ActionSenderTest shows the behaviour.

ActionSenderTest#testSendActionsHomogeneous() works as expected, sending/receiving the actions. ActionSenderTest#testSendActionsHeterogeneous() fails, throwing a NullPointerException.

martyschaer commented 7 months ago

Upon further debugging, I believe the issue was previously masked because the StreamBridge registered itself as Function<Message<Object>, Message<Object>>, whereas now it's registered as Function<Object, Object>.

The code would then exit isExtractPayload after FunctionTypeUtils.isMessage(type), never reaching the check that causes the NPE.

sobychacko commented 7 months ago

@martyschaer Thanks for the sample. The issue turned out to be a small bug in SCF. See this issue there: https://github.com/spring-cloud/spring-cloud-function/issues/1134.

The fix is in SCF main (4.1.2-SNAPSHOT) and has been back-ported to the 4.0.x branch. I re-ran your test with the 4.1.2-SNAPSHOT, which now passes.

Once you confirm, please feel free to close this issue.

martyschaer commented 7 months ago

@sobychacko Thank you for the very quick fix! It works for our real project as well.

When is SCF 4.1.2 expected to be released? We have a release on Friday; I suspect that's too soon?

We can release with a work-around, but if SCF 4.1.2 is going to become available, we'd prefer to use that instead.

sobychacko commented 7 months ago

Ya, that is too soon. The next spring-cloud release (2023.0.2) is scheduled for May 30th. https://calendar.spring.io/