apple / swift-nio

Event-driven network application framework for high performance protocol servers & clients, non-blocking.
https://swiftpackageindex.com/apple/swift-nio/documentation
Apache License 2.0
7.96k stars 648 forks source link

Performance tests broken (always paste a result from 3rd Jan, actual run fails) #2647

Open weissi opened 8 months ago

weissi commented 8 months ago

The performance tests post bogus results (from an old run from 3rd Jan) and the actual perf build fails:

E.g.

https://ci.swiftserver.group/view/nio/job/swift-nio2-performance-prb/158/console

12:01:48 [ssh-agent] Using credentials jenkins (ssh key to connect jenkins agents)
12:01:48 The recommended git tool is: NONE
12:01:48 Also:   hudson.remoting.Channel$CallSiteStackTrace: Remote call to performance agent 1
12:01:48        at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1787)
12:01:48        at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:356)
12:01:48        at hudson.remoting.Channel.call(Channel.java:1003)
12:01:48        at hudson.FilePath.act(FilePath.java:1230)
12:01:48        at hudson.FilePath.act(FilePath.java:1219)
12:01:48        at org.jenkinsci.plugins.gitclient.Git.getClient(Git.java:138)
12:01:48        at hudson.plugins.git.GitSCM.createClient(GitSCM.java:916)
12:01:48        at hudson.plugins.git.GitSCM.createClient(GitSCM.java:847)
12:01:48        at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1294)
12:01:48        at hudson.scm.SCM.checkout(SCM.java:540)
12:01:48        at hudson.model.AbstractProject.checkout(AbstractProject.java:1248)
12:01:48        at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:649)
12:01:48        at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:85)
12:01:48        at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:521)
12:01:48        at hudson.model.Run.execute(Run.java:1895)
12:01:48        at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
12:01:48        at hudson.model.ResourceController.execute(ResourceController.java:101)
12:01:48        at hudson.model.Executor.run(Executor.java:442)
12:01:48 java.lang.NoClassDefFoundError: Could not initialize class com.sun.proxy.$Proxy10
12:01:48    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
12:01:48    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
12:01:48    at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
12:01:48    at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
12:01:48    at java.base/java.lang.reflect.Proxy.newProxyInstance(Proxy.java:1025)
12:01:48    at java.base/java.lang.reflect.Proxy.newProxyInstance(Proxy.java:1011)
12:01:48    at hudson.remoting.RemoteInvocationHandler.wrap(RemoteInvocationHandler.java:168)
12:01:48    at hudson.remoting.Channel.export(Channel.java:815)
12:01:48    at hudson.remoting.Channel.export(Channel.java:778)
12:01:48    at org.jenkinsci.plugins.gitclient.LegacyCompatibleGitAPIImpl.writeReplace(LegacyCompatibleGitAPIImpl.java:232)
12:01:48    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
12:01:48    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
12:01:48    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
12:01:48    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
12:01:48    at java.base/java.io.ObjectStreamClass.invokeWriteReplace(ObjectStreamClass.java:1106)
12:01:48    at java.base/java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1127)
12:01:48    at java.base/java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:345)
12:01:48    at hudson.remoting.UserRequest._serialize(UserRequest.java:263)
12:01:48    at hudson.remoting.UserRequest.serialize(UserRequest.java:272)
12:01:48    at hudson.remoting.UserRequest.perform(UserRequest.java:222)
12:01:48    at hudson.remoting.UserRequest.perform(UserRequest.java:54)
12:01:48    at hudson.remoting.Request$2.run(Request.java:377)
12:01:48    at hudson.remoting.InterceptingExecutorService.lambda$wrap$0(InterceptingExecutorService.java:78)
12:01:48    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
12:01:48    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
12:01:48    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
12:01:48    at java.base/java.lang.Thread.run(Thread.java:829)
12:01:48 Caused: java.io.IOException: Remote call on performance agent 1 failed
12:01:48    at hudson.remoting.Channel.call(Channel.java:1007)
12:01:48    at hudson.FilePath.act(FilePath.java:1230)
12:01:48    at hudson.FilePath.act(FilePath.java:1219)
12:01:48    at org.jenkinsci.plugins.gitclient.Git.getClient(Git.java:138)
12:01:48    at hudson.plugins.git.GitSCM.createClient(GitSCM.java:916)
12:01:48    at hudson.plugins.git.GitSCM.createClient(GitSCM.java:847)
12:01:48    at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1294)
12:01:48    at hudson.scm.SCM.checkout(SCM.java:540)
12:01:48    at hudson.model.AbstractProject.checkout(AbstractProject.java:1248)
12:01:48    at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:649)
12:01:48    at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:85)
12:01:48    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:521)
12:01:48    at hudson.model.Run.execute(Run.java:1895)
12:01:48    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:44)
12:01:48    at hudson.model.ResourceController.execute(ResourceController.java:101)
12:01:48    at hudson.model.Executor.run(Executor.java:442)
12:01:48 $ ssh-agent -k
12:01:48 unset SSH_AUTH_SOCK;
12:01:48 unset SSH_AGENT_PID;
12:01:48 echo Agent pid 7043 killed;
12:01:48 [ssh-agent] Stopped.
12:01:48 ERROR: [GitHub Commit Status Setter] - Cannot retrieve Git metadata for the build, setting build result to FAILURE
12:01:48 Setting status of 1b7ad7de4d5e5d779989f27a0ac318abc9a40fc5 to FAILURE with url https://ci.swiftserver.group/job/swift-nio2-performance-prb/158/ and message: 'Build finished. '
12:01:48 Using context: pull request validation (performance)
12:01:48 <h2>performance report</h2>
12:01:48 <p>build id: 156</p>
12:01:48 <p>timestamp: Wed Jan  3 13:45:50 UTC 2024</p>
12:01:48 <h4>results</h4>
weissi commented 8 months ago

CC @tomerd

gmilos commented 8 months ago

I just noticed that my manual tests occasionally failed. I reproduced it by running it in tight loop and eventually tracked it down to an apparent race in TCPThroughtputBenchmark:

self.context.writeAndFlush is called here[1], before context is set here[2].

This may be the reason we don't get perf test results. Not the fact that perf testing infra is broken exactly.

[1] https://github.com/apple/swift-nio/blob/82fd942745b11ccebbf0db3e9e4bf150b60e5e44/Sources/NIOPerformanceTester/TCPThroughputBenchmark.swift#L49 [2] https://github.com/apple/swift-nio/blob/82fd942745b11ccebbf0db3e9e4bf150b60e5e44/Sources/NIOPerformanceTester/TCPThroughputBenchmark.swift#L44

weissi commented 8 months ago

CC @FranzBusch for flakey test (we should track that separately)

@gmilos good catch! However whatever happens in the performance tests should not make the harness post wrong and outdated data. Also, it's posting is so quickly that I doubt it ran anything at all

gmilos commented 8 months ago

Agreed re the harness being immune to failures in the test binary.

For reference, I filed the issue about the specific test here: https://github.com/apple/swift-nio/issues/2649 //cc @FranzBusch

FranzBusch commented 8 months ago

cc @yim-lee can you help us here. The performance tests seem to broken.

yim-lee commented 8 months ago

It posts bogus results because the build doesn't get far enough to the cleanup stage to delete previous test result files, and the job is configured to post contents from the report file. I manually wiped the workspace this time. Updating the job to wipe workspace before build didn't help because it fails very early on (some Git operation), which I will look into now.

yim-lee commented 8 months ago

@tomerd fixed the issue (thanks!) and perf job is green again: https://github.com/apple/swift-nio/pull/2652