mc1arke / sonarqube-community-branch-plugin

A plugin that allows branch analysis and pull request decoration in the Community version of Sonarqube
GNU Lesser General Public License v3.0
2.22k stars 521 forks source link

Time to analyze PR branches varies wildly #110

Closed dcherniv closed 4 years ago

dcherniv commented 4 years ago

Describe the bug It appears that sonarqube with community branch plugin takes too long to analyze the PRs in some cases (unclear which).

To Reproduce Steps to reproduce the behavior:

  1. Install SQ 8.1
  2. Build plugin from 8.1 branch of this here plugin.
  3. Install plugin.
  4. Configure a project or two to scan and decorate PRs using bitbucket server.
  5. Scan multiple times, you should see varying scan time 1000x difference sometimes more.

Expected behavior Scans to complete within reasonable time.

Screenshots image

Software Versions

mc1arke commented 4 years ago

Are you able to provide any logs for the CE component for the different runs you've highlighted,? The plugin does very little work within CE - just providing the API calls to decorate the PRs, and the facility to create branch entries in the database. Whilst the API calls could vary slightly between runs, that should be a matter of seconds end-to-end rather than the minutes you've demonstrated.

As you've indicated, the delay is likely to be due to the CE being single-threaded/single-process in community edition, so the jobs will be required to queue if something else is already processing. However, I'd expect to see more items in the table then. Can you confirm if you're looking at all tasks on the server then, or just the tasks for a single project?

dcherniv commented 4 years ago

@mc1arke The screenshot before was background jobs from one particular project: This is overall for entire server: image Note the time that is displayed above is pure running time, it excludes time waiting in the queue. In the above screenshot it waited in the queue for 2 seconds, then the analyze started. here's the log snippet related to screenshot above:

2020.02.26 22:25:16 INFO  ce[][o.s.c.t.CeWorkerImpl] Execute task | project=ml-trainer | type=REPORT | pullRequest=41 | id=AXCDmqNq3if_rdWF4Xs5 | submitter=jenkins
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Extract report | status=SUCCESS | time=5ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Persist scanner context | status=SUCCESS | time=2ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Propagate analysis warnings from scanner report | status=SUCCESS | time=1ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Execute DB migrations for current project | status=SUCCESS | time=1ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Generate analysis UUID | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Load analysis metadata | status=SUCCESS | time=6ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Initialize | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Verify billing | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Build tree of components | components=2 | status=SUCCESS | time=3ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Validate project | status=SUCCESS | time=2ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Load quality profiles | status=SUCCESS | time=138ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Load Quality gate | status=SUCCESS | time=2ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Load new code period | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Detect file moves | reportFiles=1 | dbFiles=1 | addedFiles=0 | status=SUCCESS | time=1ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Load duplications | duplications=0 | status=SUCCESS | time=1ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Compute cross project duplications | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Compute size measures | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Compute new coverage | status=SUCCESS | time=1ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Compute coverage measures | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Compute comment measures | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Copy custom measures | status=SUCCESS | time=2ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Compute duplication measures | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Compute size measures on new code | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Compute language distribution | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Compute test measures | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Compute complexity measures | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Load measure computers | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Compute Quality Profile status | status=SUCCESS | time=2ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Execute component visitors | status=SUCCESS | time=11ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Checks executed after computation of measures | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Compute Quality Gate measures | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Compute Quality profile measures | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Generate Quality profile events | status=SUCCESS | time=1ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Generate Quality gate events | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Persist components | status=SUCCESS | time=4ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Persist analysis | status=SUCCESS | time=2ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Persist analysis properties | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Persist measures | inserts=77 | status=SUCCESS | time=7ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Persist live measures | insertsOrUpdates=111 | status=SUCCESS | time=14ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Persist duplication data | insertsOrUpdates=0 | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Persist new ad hoc Rules | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Persist issues | inserts=0 | updates=0 | merged=0 | untouched=0 | status=SUCCESS | time=1ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Persist project links | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Persist events | status=SUCCESS | time=3ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Persist sources | status=SUCCESS | time=5ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Persist cross project duplications | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Enable analysis | status=SUCCESS | time=3ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Update last usage date of quality profiles | status=SUCCESS | time=2ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Purge db | status=SUCCESS | time=22ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Index analysis | status=SUCCESS | time=5ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Send issue notifications | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Publish task results | status=SUCCESS | time=0ms
2020.02.26 22:25:16 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.s.ComputationStepExecutor] Trigger refresh of Portfolios and Applications | status=SUCCESS | time=0ms
2020.02.26 22:25:56 INFO  ce[AXCDmqNq3if_rdWF4Xs5][o.s.c.t.p.a.p.PostProjectAnalysisTasksExecutor] Webhooks | globalWebhooks=1 | projectWebhooks=0 | status=SUCCESS | time=40035ms
2020.02.26 22:25:56 INFO  ce[AXCDmqNq3if_rdWF4Xs5][c.g.m.s.p.c.p.PullRequestPostAnalysisTask] using pull request decoratorBitbucketServer
2020.02.26 22:25:56 INFO  ce[AXCDmqNq3if_rdWF4Xs5][c.g.m.s.p.c.p.b.s.BitbucketServerPullRequestDecorator] starting to analyze with com.github.mc1arke.sonarqube.plugin.ce.pullrequest.AnalysisDetails@2b5be085

so at 22:25:56 we went into the plugin execution, 40 seconds after the analysis started.

dcherniv commented 4 years ago

Memory/CPU usage for sonarqube at the time: image Ah, i think i found the smoking gun. The logs are filled with:

alysisTask failed
java.lang.IllegalStateException: Could not decorate Pull Request on Bitbucket Server
        at com.github.mc1arke.sonarqube.plugin.ce.pullrequest.bitbucket.server.BitbucketServerPullRequestDecorator.decorateQualityGateStatus(BitbucketServerPullRequestDecorator.java:121)
        at com.github.mc1arke.sonarqube.plugin.ce.pullrequest.PullRequestPostAnalysisTask.finished(PullRequestPostAnalysisTask.java:161)
        at org.sonar.ce.task.projectanalysis.api.posttask.PostProjectAnalysisTasksExecutor.executeTask(PostProjectAnalysisTasksExecutor.java:118)
        at org.sonar.ce.task.projectanalysis.api.posttask.PostProjectAnalysisTasksExecutor.finished(PostProjectAnalysisTasksExecutor.java:109)
        at org.sonar.ce.task.step.ComputationStepExecutor.executeListener(ComputationStepExecutor.java:91)
        at org.sonar.ce.task.step.ComputationStepExecutor.execute(ComputationStepExecutor.java:63)
        at org.sonar.ce.task.projectanalysis.taskprocessor.ReportTaskProcessor.process(ReportTaskProcessor.java:81)
        at org.sonar.ce.taskprocessor.CeWorkerImpl$ExecuteTask.executeTask(CeWorkerImpl.java:209)
        at org.sonar.ce.taskprocessor.CeWorkerImpl$ExecuteTask.run(CeWorkerImpl.java:191)
        at org.sonar.ce.taskprocessor.CeWorkerImpl.findAndProcessTask(CeWorkerImpl.java:158)
        at org.sonar.ce.taskprocessor.CeWorkerImpl$TrackRunningState.get(CeWorkerImpl.java:133)
        at org.sonar.ce.taskprocessor.CeWorkerImpl.call(CeWorkerImpl.java:85)
        at org.sonar.ce.taskprocessor.CeWorkerImpl.call(CeWorkerImpl.java:53)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
[...]
Caused by: org.apache.http.conn.HttpHostConnectException: Connect to bitbucket.example.com:443 [bitbucket.example.com/172.16.41.234, bitbucket.example.com/172.16.58.172, bitbucket.example.com
/172.16.13.238, bitbucket.example.com/172.16.19.47] failed: Operation timed out (Connection timed out)

Can we implement exponential back-off if its not too much trouble? i.e. try to decorate, wait for 3 seconds for response from bitbucket. If fails try again in 9 seconds, if that fails try again in 27 seconds. if it fails again then give up. Or something similar.

mc1arke commented 4 years ago

Can we implement exponential back-off if its not too much trouble?

I'll need to look at the options here. The Bitbucket decoration is using a CloseableHttpClient with a default configuration to perform Bitbucket API calls, so is likely to have retry enabled and use the DefaultHttpRequestRetryHandler, in which case it should be possible to use our own implementation with the back-off you've suggested.

Do you believe the plugin is impacting the performance of your Bitbucket host?

dcherniv commented 4 years ago

I have no reason to believe that bitbucket server performance is impacted. There's something peculiar though about ELBs and the IPs that are printed on the console:

Caused by: org.apache.http.conn.HttpHostConnectException: Connect to bitbucket.example.com:443 [bitbucket.example.com/172.16.41.234, bitbucket.example.com/172.16.58.172, bitbucket.example.com
/172.16.13.238, bitbucket.example.com/172.16.19.47] failed: Operation timed out (Connection timed out)

They are old. bitbucket is fronted by ELB, and i'm guessing here the name resolution never happens before making a call as evidenced in the below logs 172.16.19.47 is indeed no longer serving requests on port 443, whereas if you connect via fqdn it works.

root@devops-bitbucket-next-0:/var/atlassian/application-data/bitbucket# telnet 172.16.19.47 443
Trying 172.16.19.47...
^C
root@devops-bitbucket-next-0:/var/atlassian/application-data/bitbucket# telnet bitbucket.example.com 443
Trying 172.16.16.154...
Connected to internal-REDACTED-557787167.us-east-1.elb.amazonaws.com.
Escape character is '^]'.
^CConnection closed by foreign host.
root@devops-bitbucket-next-0:/var/atlassian/application-data/bitbucket# 
mc1arke commented 4 years ago

I think you're seeing a symptom of Java's DNS caching behaviour. Sonarqube applies a security manager into the Compute Engine since 8.0 which then means that Java reverts to caching DNS lookups for as long as is configured in the JVM's java.security file under the networkaddress.cache.ttl property (this property is not set on my current machine, so defaulting to -1 and therefore caching forever).

I suspect the Compute Engine will have tried to make network calls before the plugin can attempt to alter the security policy, so we're unlikely to be unable to use a programmatic solution to set a new TTL on the DNS cache. You'd therefore need to add networkaddress.cache.ttl=0 into $JAVA_HOME/jre/lib/security/java.security. Potentially setting sonar.ce.javaAdditionalOpts=-Dsun.net.inetaddr.ttl=0 in the $SONARQUBE_HOME/conf/sonar.properties filewill do the same thing (providing you're using an Sun/Oracle or OpenJDK JVM) , but I've not tested this an am not sure this is a fully supported mechanism.

dcherniv commented 4 years ago

@mc1arke Thank you. I will try this. Switched ce to run as follows:

  178 sonarqub  0:10 /usr/lib/jvm/java-11-openjdk/bin/java -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/opt/sonarqube/temp --add-opens=java.base/java.util=ALL-UNNAMED -Xmx2048m -Dsun.net.inetaddr.ttl=0 -Dhttp.nonProxyHosts=localhost|127.*|[::1] -cp ./lib/common/*:/opt/sonarqube/lib/jdbc/postgresql/postgresql-42.2.8.jar org.sonar.ce.app.CeServer /opt/sonarqube/temp/sq-process13826955709702133446properties