redpanda-data / redpanda

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
https://redpanda.com
9.44k stars 579 forks source link

CI Failure (Timed out for transform verifier to complete) in `DataTransformsTest.test_identity` #17903

Open vbotbuildovich opened 5 months ago

vbotbuildovich commented 5 months ago

https://buildkite.com/redpanda/redpanda/builds/47842

Module: rptest.tests.data_transforms_test
Class: DataTransformsTest
Method: test_identity
Arguments: {
    "wait_running": false,
    "transactional": true
}
test_id:    DataTransformsTest.test_identity
status:     FAIL
run time:   57.941 seconds

TimeoutError('Timed out for transform verifier to complete TransformVerifierService-1-139925138929984, latest status: None')
Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 184, in _do_run
    data = self.run_test()
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 276, in run_test
    return self.test_context.function(self.test)
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/mark/_mark.py", line 535, in wrapper
    return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File "/root/tests/rptest/services/cluster.py", line 103, in wrapped
    r = f(self, *args, **kwargs)
  File "/root/tests/rptest/tests/data_transforms_test.py", line 256, in test_identity
    consumer_status = self._consume_output_topic(topic=self.topics[1],
  File "/root/tests/rptest/tests/data_transforms_test.py", line 202, in _consume_output_topic
    result = TransformVerifierService.oneshot(
  File "/root/tests/rptest/services/transform_verifier_service.py", line 191, in oneshot
    service.wait(timeout_sec=timeout_sec)
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/services/service.py", line 287, in wait
    if not self.wait_node(node, end - now):
  File "/root/tests/rptest/services/transform_verifier_service.py", line 254, in wait_node
    wait_until(
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/utils/util.py", line 57, in wait_until
    raise TimeoutError(err_msg() if callable(err_msg) else err_msg) from last_exception
ducktape.errors.TimeoutError: Timed out for transform verifier to complete TransformVerifierService-1-139925138929984, latest status: None

JIRA Link: CORE-2397

rockwotj commented 5 months ago
[DEBUG - 2024-04-16 06:52:43,428 - transform_verifier_service - _get_status_for_node - lineno:352]: Status endpoint TransformVerifierService-1-139925138929984 response: TransformVerifierConsumeStatus(latest_seqnos={'0': 114, '1': 114, '3': 114, '4': 114, '5': 114, '6': 114, '7': 114, '8': 114}, invalid_records=0, error_count=0)

Partition 2 never transforms any records, previously this was assumed to be slow debug builds, but now that we're on -O1 I'm a bit surprised to see this. I'll dig in more

rockwotj commented 5 months ago
TRACE 2024-04-16 06:52:13,403 [shard 0:tran] transform - identity-xform/2 - transform_processor.cc:283 - consumed up to offset 170
TRACE 2024-04-16 06:52:13,570 [shard 0:tran] transform - identity-xform/2 - transform_processor.cc:388 - committing progress 169 for output topic 0

Something is certainly wrong here, it looks like the transforms properly committed progress and successfully wrote the output.

rockwotj commented 5 months ago

I will add debugging to kgo on the verifier to see what's going on.

vbotbuildovich commented 4 months ago

*https://buildkite.com/redpanda/redpanda/builds/48752

vbotbuildovich commented 3 months ago

*https://buildkite.com/redpanda/redpanda/builds/49631

vbotbuildovich commented 3 months ago

*https://buildkite.com/redpanda/redpanda/builds/49723

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/redpanda/builds/49769 https://buildkite.com/redpanda/redpanda/builds/49883

vbotbuildovich commented 3 months ago

*https://buildkite.com/redpanda/redpanda/builds/49883

vbotbuildovich commented 3 months ago

*https://buildkite.com/redpanda/redpanda/builds/49883

vbotbuildovich commented 3 months ago

*https://buildkite.com/redpanda/redpanda/builds/49883

vbotbuildovich commented 3 months ago

https://buildkite.com/redpanda/redpanda/builds/49883 https://buildkite.com/redpanda/redpanda/builds/50160

github-actions[bot] commented 1 week ago

This issue hasn't seen activity in 3 months. If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in two weeks.