antrea-io / antrea

Kubernetes networking based on Open vSwitch
https://antrea.io
Apache License 2.0

TestFlowAggregatorSecureConnection/https failed with empty error message #6478

Open antoninbas opened 2 months ago

antoninbas commented 2 months ago

I just saw a failure in CI for TestFlowAggregatorSecureConnection/https. I wanted to look into it, but then realized that the information needed for troubleshooting was missing. Here are the test logs:

2024-06-21T20:07:54.2501752Z === RUN   TestFlowAggregatorSecureConnection/https
2024-06-21T20:07:54.2502651Z 2024/06/21 20:07:54 Applying Antrea YAML
2024-06-21T20:07:55.9366823Z 2024/06/21 20:07:55 Waiting for all Antrea DaemonSet Pods
2024-06-21T20:07:56.9442936Z 2024/06/21 20:07:56 Checking CoreDNS deployment
2024-06-21T20:07:56.9460675Z     fixtures.go:280: Creating 'testflowaggregatorsecureconnection-https-aksh9d2l' K8s Namespace
2024-06-21T20:07:59.0075480Z     fixtures.go:306: Deploying ClickHouse
2024-06-21T20:08:36.0545187Z I0621 20:08:36.054198   25834 framework.go:935] Successfully connected to clickhouse Service
2024-06-21T20:08:36.0573680Z     fixtures.go:311: ClickHouse Service created with ClusterIP: 10.96.34.148
2024-06-21T20:08:36.0575842Z     fixtures.go:312: Applying flow aggregator YAML with ipfix collector: [fc00:f853:ccd:e793::4]:4739:tcp and clickHouse enabled
2024-06-21T20:08:55.5860178Z     flowaggregator_test.go:1597: Network Policies are realized.
2024-06-21T20:09:17.4737489Z     flowaggregator_test.go:1466: 
2024-06-21T20:09:17.4739173Z            Error Trace:    /home/runner/work/antrea/antrea/test/e2e/flowaggregator_test.go:1466
2024-06-21T20:09:17.4741456Z                                        /home/runner/work/antrea/antrea/test/e2e/flowaggregator_test.go:1007
2024-06-21T20:09:17.4743422Z                                        /home/runner/work/antrea/antrea/test/e2e/flowaggregator_test.go:1002
2024-06-21T20:09:17.4745428Z                                        /home/runner/work/antrea/antrea/test/e2e/flowaggregator_test.go:310
2024-06-21T20:09:17.4747478Z                                        /home/runner/work/antrea/antrea/test/e2e/flowaggregator_test.go:248
2024-06-21T20:09:17.4748711Z            Error:          Received unexpected error:
2024-06-21T20:09:17.4749773Z                            context deadline exceeded
2024-06-21T20:09:17.4751759Z            Test:           TestFlowAggregatorSecureConnection/https
2024-06-21T20:09:17.4752610Z            Messages:       
2024-06-21T20:09:17.4870749Z     fixtures.go:347: Exporting test logs to '/home/runner/work/antrea/antrea/log/TestFlowAggregatorSecureConnection_https/beforeTeardown.Jun21-20-09-17'
2024-06-21T20:09:21.0653709Z     fixtures.go:518: Deleting 'testflowaggregatorsecureconnection-https-aksh9d2l' K8s Namespace

Notice how Messages: is empty, even though according to the test code it should include the content of all IPFIX records received by the collector: https://github.com/antrea-io/antrea/blob/0f75c5062b037a407f966d7111976d5fa825cdbe/test/e2e/flowaggregator_test.go#L1466

After checking the testify source code, I realized that they are using bufio.Scanner to print formatted messages: https://github.com/stretchr/testify/blob/bb548d0473d4e1c9b7bbfd6602c7bf12f7a84dd2/assert/assertions.go#L304-L313

bufio.Scanner has a default maximum token size of MaxScanTokenSize = 64 * 1024 bytes, and in this case a token corresponds to a line of text. We are probably trying to print a line that is longer than that, which is likely if recordSlices is formatted as a single line of text. When a line exceeds the limit, Scan returns false and Err reports bufio.ErrTooLong. That error does not appear to be surfaced, so the result is that nothing is printed.

Unfortunately, this makes the error message useless for troubleshooting the test failure. The test code should be updated so that we only try to print the relevant information, formatted in a helpful way, and possibly not through the testify assertions.

EraKin575 commented 1 month ago

Hi, @antoninbas! I am new to Antrea and would like to work on this issue. Could you please assign it to me?

antoninbas commented 1 month ago

@EraKin575 you are welcome to take a stab at this. However, I don't know if this is a great issue for a first-time contributor.

EraKin575 commented 1 month ago

I will give it a try.