redpanda-data / redpanda

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
https://redpanda.com
fix "Unexpected EOF" log errors #24110

Open oleiman opened 3 days ago

oleiman commented 3 days ago

This PR introduces bool net::server::is_message_error(exception_ptr) (naming 😕), a virtual function whose base implementation returns false unconditionally but which server implementations can override to identify exception types that are "application specific".

It is called in server::print_exceptional_future: if it returns true, the exception is logged at WARN level rather than the usual ERROR.

This PR also updates kafka::server with a new exception type, malformed_header_exception, classified as a "message error". kafka::parse_v1_header now raises one of those rather than a std::runtime_error when we receive a junk header from a client. That happens occasionally and previously would fail log ERROR checks in ducktape.
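
For readers skimming the thread, here is a minimal standalone sketch of the classification described above. It is not the code in this PR: the `server` and `kafka_server` classes, the `log_level` enum, and the `level_for` helper are hypothetical stand-ins, and only `is_message_error` and `malformed_header_exception` are names taken from the description. The real types live in `net::server` and `kafka::server` and may differ in signature.

```cpp
#include <exception>
#include <stdexcept>

// Hypothetical stand-in for the logger's severity levels.
enum class log_level { warn, error };

// Name taken from the PR description; deriving from std::runtime_error
// is an assumption made for this sketch.
struct malformed_header_exception : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical stand-in for net::server.
class server {
public:
    virtual ~server() = default;

    // Base implementation: no exception counts as a "message error".
    virtual bool is_message_error(std::exception_ptr) const { return false; }

    // Hypothetical helper playing the role described for
    // print_exceptional_future: choose WARN for message errors,
    // ERROR for everything else.
    log_level level_for(std::exception_ptr e) const {
        return is_message_error(e) ? log_level::warn : log_level::error;
    }
};

// Hypothetical stand-in for kafka::server.
class kafka_server : public server {
public:
    bool is_message_error(std::exception_ptr e) const override {
        if (!e) {
            return false; // rethrowing a null exception_ptr is undefined
        }
        try {
            std::rethrow_exception(e);
        } catch (const malformed_header_exception&) {
            return true;  // junk header from a client: application specific
        } catch (...) {
            return false; // anything else keeps the usual ERROR level
        }
    }
};
```

In this sketch, a `malformed_header_exception` passed to `kafka_server::level_for` maps to `log_level::warn`, while a plain `std::runtime_error`, or any exception seen by the base `server`, still maps to `log_level::error`.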

In principle, fixes:

CORE-8134 CORE-8059 CORE-7883 CORE-7324 CORE-6895

TODO:

Backports Required

Release Notes

oleiman commented 3 days ago

/ci-repeat 1 skip-units /tests/rptest/tests/random_node_operations_test.py::RandomNodeOperationsTest.test_node_operations /tests/rptest/tests/upgrade_test.py::UpgradeBackToBackTest.test_upgrade_with_all_workloads /tests/rptest/tests/nodes_decommissioning_test.py::NodesDecommissioningTest.test_recycle_all_nodes /tests/rptest/tests/node_id_assignment_test.py::NodeIdAssignmentUpgrade.test_assign_after_upgrade /tests/rptest/tests/compaction_recovery_test.py::CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade

oleiman commented 3 days ago

/cdt provider=azure dt-repeat=20 /tests/rptest/tests/random_node_operations_test.py::RandomNodeOperationsTest.test_node_operations /tests/rptest/tests/upgrade_test.py::UpgradeBackToBackTest.test_upgrade_with_all_workloads /tests/rptest/tests/nodes_decommissioning_test.py::NodesDecommissioningTest.test_recycle_all_nodes /tests/rptest/tests/node_id_assignment_test.py::NodeIdAssignmentUpgrade.test_assign_after_upgrade /tests/rptest/tests/compaction_recovery_test.py::CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade

vbotbuildovich commented 3 days ago

'cdt_instance_type' and 'region' is required if 'provider' is not 'aws'

Workflow run logs.

oleiman commented 20 hours ago

/ci-repeat 1 skip-units skip-redpanda-build tests/rptest/tests/random_node_operations_test.py::RandomNodeOperationsTest.test_node_operations tests/rptest/tests/upgrade_test.py::UpgradeBackToBackTest.test_upgrade_with_all_workloads tests/rptest/tests/nodes_decommissioning_test.py::NodesDecommissioningTest.test_recycle_all_nodes tests/rptest/tests/node_id_assignment_test.py::NodeIdAssignmentUpgrade.test_assign_after_upgrade tests/rptest/tests/compaction_recovery_test.py::CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade

oleiman commented 20 hours ago

/ci-repeat 1 skip-units skip-redpanda-build dt-repeat=25 tests/rptest/tests/random_node_operations_test.py::RandomNodeOperationsTest.test_node_operations tests/rptest/tests/upgrade_test.py::UpgradeBackToBackTest.test_upgrade_with_all_workloads tests/rptest/tests/nodes_decommissioning_test.py::NodesDecommissioningTest.test_recycle_all_nodes tests/rptest/tests/node_id_assignment_test.py::NodeIdAssignmentUpgrade.test_assign_after_upgrade tests/rptest/tests/compaction_recovery_test.py::CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade

oleiman commented 12 hours ago

/cdt dt-repeat=5 tests/rptest/tests/random_node_operations_test.py::RandomNodeOperationsTest.test_node_operations tests/rptest/tests/upgrade_test.py::UpgradeBackToBackTest.test_upgrade_with_all_workloads tests/rptest/tests/nodes_decommissioning_test.py::NodesDecommissioningTest.test_recycle_all_nodes tests/rptest/tests/node_id_assignment_test.py::NodeIdAssignmentUpgrade.test_assign_after_upgrade tests/rptest/tests/compaction_recovery_test.py::CompactionRecoveryUpgradeTest.test_index_recovery_after_upgrade