jenkinsci / pipeline-graph-view-plugin

https://plugins.jenkins.io/pipeline-graph-view/
Other
107 stars 54 forks source link

On large pipeline, generating `pipeline-graph/tree` page takes long time #385

Closed eplodn closed 5 days ago

eplodn commented 5 months ago

Jenkins and plugins versions report

Environment ```text Jenkins: 2.440.1 OS: Linux - 5.4.0-1103-aws Java: 17.0.10 - Eclipse Adoptium (OpenJDK 64-Bit Server VM) --- allure-jenkins-plugin:2.31.1 amazon-ecr:1.114.vfd22430621f5 analysis-model-api:12.1.0 ansicolor:1.0.4 antisamy-markup-formatter:162.v0e6ec0fcfcf6 apache-httpcomponents-client-4-api:4.5.14-208.v438351942757 artifact-manager-s3:845.848.ve968b_8576c1c asm-api:9.6-3.v2e1fa_b_338cd7 authentication-tokens:1.53.v1c90fd9191a_b_ aws-bucket-credentials:1.0.0 aws-credentials:218.v1b_e9466ec5da_ aws-global-configuration:130.v35b_7b_96f53c3 aws-java-sdk:1.12.671-439.veec746c91fcb_ aws-java-sdk-cloudformation:1.12.671-439.veec746c91fcb_ aws-java-sdk-codebuild:1.12.671-439.veec746c91fcb_ aws-java-sdk-ec2:1.12.671-439.veec746c91fcb_ aws-java-sdk-ecr:1.12.671-439.veec746c91fcb_ aws-java-sdk-ecs:1.12.671-439.veec746c91fcb_ aws-java-sdk-efs:1.12.671-439.veec746c91fcb_ aws-java-sdk-elasticbeanstalk:1.12.671-439.veec746c91fcb_ aws-java-sdk-iam:1.12.671-439.veec746c91fcb_ aws-java-sdk-kinesis:1.12.671-439.veec746c91fcb_ aws-java-sdk-logs:1.12.671-439.veec746c91fcb_ aws-java-sdk-minimal:1.12.671-439.veec746c91fcb_ aws-java-sdk-secretsmanager:1.12.671-439.veec746c91fcb_ aws-java-sdk-sns:1.12.671-439.veec746c91fcb_ aws-java-sdk-sqs:1.12.671-439.veec746c91fcb_ aws-java-sdk-ssm:1.12.671-439.veec746c91fcb_ basic-branch-build-strategies:81.v05e333931c7d blueocean:1.27.11 blueocean-bitbucket-pipeline:1.27.11 blueocean-commons:1.27.11 blueocean-config:1.27.11 blueocean-core-js:1.27.11 blueocean-dashboard:1.27.11 blueocean-display-url:2.4.2 blueocean-events:1.27.11 blueocean-git-pipeline:1.27.11 blueocean-github-pipeline:1.27.11 blueocean-i18n:1.27.11 blueocean-jwt:1.27.11 blueocean-personalization:1.27.11 blueocean-pipeline-api-impl:1.27.11 blueocean-pipeline-editor:1.27.11 blueocean-pipeline-scm-api:1.27.11 blueocean-rest:1.27.11 blueocean-rest-impl:1.27.11 blueocean-web:1.27.11 bootstrap5-api:5.3.3-1 bouncycastle-api:2.30.1.77-225.v26ea_c9455fd9 branch-api:2.1152.v6f101e97dd77 build-token-root:151.va_e52fe3215fc buildtriggerbadge:251.vdf6ef853f3f5 caffeine-api:3.1.8-133.v17b_1ff2e0599 checks-api:2.0.2 cloudbees-bitbucket-branch-source:880.vcf4056c5a_71f cloudbees-disk-usage-simple:203.v3f46a_7462b_1a_ cloudbees-folder:6.901.vb_4c7a_da_75da_3 cobertura:1.17 code-coverage-api:4.99.0 command-launcher:107.v773860566e2e commons-lang3-api:3.13.0-62.v7d18e55f51e2 commons-text-api:1.11.0-95.v22a_d30ee5d36 copyartifact:722.v0662a_9b_e22a_c coverage:1.13.0 credentials:1337.v60b_d7b_c7b_c9f credentials-binding:657.v2b_19db_7d6e6d data-tables-api:2.0.2-1 display-url-api:2.200.vb_9327d658781 docker-commons:439.va_3cb_0a_6a_fb_29 durable-task:550.v0930093c4b_a_6 ec2:1648.vf3d852e00486 ec2-fleet:3.2.0 echarts-api:5.5.0-1 extended-read-permission:53.v6499940139e5 external-monitor-job:215.v2e88e894db_f8 favorite:2.208.v91d65b_7792a_c font-awesome-api:6.5.1-3 forensics-api:2.4.0 generic-webhook-trigger:2.0.2 git:5.2.1 git-client:4.7.0 git-parameter:0.9.19 git-server:114.v068a_c7cc2574 github:1.38.0 github-api:1.318-461.v7a_c09c9fa_d63 github-branch-source:1781.va_153cda_09d1b_ github-oauth:597.ve0c3480fcb_d0 github-scm-trait-notification-context:40.vfa_7f31a_b_d7f8 groovy:457.v99900cb_85593 gson-api:2.10.1-15.v0d99f670e0a_7 handy-uri-templates-2-api:2.1.8-30.v7e777411b_148 htmlpublisher:1.33 http_request:1.18 instance-identity:185.v303dc7c645f9 ionicons-api:56.v1b_1c8c49374e jackson2-api:2.16.2-378.v7e79818f53ce jakarta-activation-api:2.1.3-1 jakarta-mail-api:2.1.3-1 javadoc:243.vb_b_503b_b_45537 javax-activation-api:1.2.0-6 javax-mail-api:1.6.2-9 jaxb:2.3.9-1 jdk-tool:73.vddf737284550 jenkins-design-language:1.27.11 jersey2-api:2.41-133.va_03323b_a_1396 jjwt-api:0.11.5-77.v646c772fddb_0 jobConfigHistory:1229.v3039470161a_d joda-time-api:2.12.7-29.v5a_b_e3a_82269a_ jquery:1.12.4-1 jquery3-api:3.7.1-2 jsch:0.2.16-86.v42e010d9484b_ json-api:20240303-41.v94e11e6de726 json-path-api:2.9.0-52.v57de85cc4722 junit:1259.v65ffcef24a_88 ldap:711.vb_d1a_491714dc mailer:472.vf7c289a_4b_420 mapdb-api:1.0.9-28.vf251ce40855d material-theme:0.5.2-rc100.6121925fe229 matrix-auth:3.2.2 matrix-project:822.824.v14451b_c0fd42 maven-plugin:3.23 metrics:4.2.21-449.v6960d7c54c69 mina-sshd-api-common:2.12.0-90.v9f7fb_9fa_3d3b_ mina-sshd-api-core:2.12.0-90.v9f7fb_9fa_3d3b_ node-iterator-api:55.v3b_77d4032326 okhttp-api:4.11.0-172.vda_da_1feeb_c6e pam-auth:1.10 parameterized-scheduler:262.v00f3d90585cc pipeline-aws:1.43 pipeline-build-step:540.vb_e8849e1a_b_d8 pipeline-github:2.8-159.09e4403bc62f pipeline-github-lib:42.v0739460cda_c4 pipeline-graph-analysis:216.vfd8b_ece330ca_ pipeline-graph-view:237.v8959b_75f3c47 pipeline-groovy-lib:704.vc58b_8890a_384 pipeline-input-step:491.vb_07d21da_1a_fb_ pipeline-milestone-step:111.v449306f708b_7 pipeline-model-api:2.2184.v0b_358b_953e69 pipeline-model-definition:2.2184.v0b_358b_953e69 pipeline-model-extensions:2.2184.v0b_358b_953e69 pipeline-rest-api:2.34 pipeline-stage-step:305.ve96d0205c1c6 pipeline-stage-tags-metadata:2.2184.v0b_358b_953e69 pipeline-stage-view:2.34 pipeline-utility-steps:2.16.2 plain-credentials:179.vc5cb_98f6db_38 plugin-util-api:4.1.0 prism-api:1.29.0-13 prometheus:2.5.1 pubsub-light:1.18 read-only-configurations:1.10 rebuild:330.v645b_7df10e2a_ resource-disposer:0.23 s3:466.vf5b_3db_8e3eb_2 scm-api:689.v237b_6d3a_ef7f scm-filter-branch-pr:148.v0b_5f06e8b_c84 script-security:1326.vdb_c154de8669 simple-theme-plugin:176.v39740c03a_a_f5 slack:684.v833089650554 snakeyaml-api:2.2-111.vc6598e30cc65 sse-gateway:1.26 ssh-credentials:322.v124df57ed808 ssh-slaves:2.948.vb_8050d697fec sshd:3.322.v159e91f6a_550 structs:337.v1b_04ea_4df7c8 subversion:2.17.3 theme-manager:215.vc1ff18d67920 throttle-concurrents:2.14 timestamper:1.26 token-macro:400.v35420b_922dcb_ trilead-api:2.142.v748523a_76693 uno-choice:2.8.1 variant:60.v7290fc0eb_b_cd warnings-ng:11.2.1 workflow-aggregator:596.v8c21c963d92d workflow-api:1291.v51fd2a_625da_7 workflow-basic-steps:1049.v257a_e6b_30fb_d workflow-cps:3883.vb_3ff2a_e3eea_f workflow-durable-task-step:1331.vc8c2fed35334 workflow-job:1400.v7fd111b_ec82f workflow-multibranch:773.vc4fe1378f1d5 workflow-scm-step:427.v4ca_6512e7df1 workflow-step-api:657.v03b_e8115821b_ workflow-support:865.v43e78cc44e0d ws-cleanup:0.45 ```

What Operating System are you using (both controller, and any agents involved in the problem)?

Ubuntu Linux on AWS

Reproduction steps

Timing the request is done with wget but developer console shows similar times.

time wget https://jenkins-aws.mycompany.net/job/mycompany/job/product/job/PR-56789/2/pipeline-graph/tree
--2024-04-18 19:32:12--  https://jenkins-aws.mycompany.net/job/mycompany/job/product/job/PR-56789/2/pipeline-graph/tree
Resolving jenkins-aws.mycompany.net (jenkins-aws.mycompany.net)... 172.xxx.xxx.xxx
Connecting to jenkins-aws.mycompany.net (jenkins-aws.mycompany.net)|172.xxx.xxx.xxx|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 153064 (149K) [application/json]
Saving to: ‘tree’

tree                            100%[======================================================>] 149.48K   329KB/s    in 0.5s

2024-04-18 19:34:40 (329 KB/s) - ‘tree’ saved [153064/153064]

The resulting file is ~150Kb.

Expected Results

I expected the request to return in several seconds.

Actual Results

The request returned in about 150 seconds.

wget   0.03s user 0.03s system 0% cpu 2:27.87 total

Anything else?

The pipeline returned has about 450 nodes.

Are you interested in contributing a fix?

No response

timja commented 4 months ago

can you please provide an example pipeline that this can be reproduced with?

smaduel-dn commented 3 months ago

Hi Tim,

I am Snir working with Efi (The reporter). We will try to create a simple pipeline that reproduce the issue and upload it. In our testing, We see a direct connection between pipeline log size and loading time. Every 10MB of logs takes about 1 sec of loading time. Since our pipeline produce 100-1200 MBs of logs, the loading time become too long (10-120 sec).

Can you try improve the plugin pipeline load time when it have a big log file?

Thanks, Snir

timja commented 3 months ago

if you can provide a sample pipeline that reproduces the issue then yes it can be looked at.

eplodn commented 3 months ago

Ok, so here is a small example I came down with that, like our actual pipeline, has lots of stages, lots of steps, and lots of logs.

In the bottom line this is not as bad as our actual 150 seconds but still 23 seconds:

time curl -sL https://<my-jenkins>/job/EfiTest/11/pipeline-graph/tree
real    0m23.736s
user    0m0.007s
sys 0m0.014s
String LABEL_WE_HAVE_A_LOT_OF = 'jenkins-test-tiny'

pipeline {
    agent { node { label 'master || built-in'} }

    stages{
        stage ('Run 1') {
            steps {
                script {
                    try { def builders = generateBuilders(); parallel builders
                    } catch (Exception e) { println "Going on" }
                }
            }
        }
        stage ('Run 2') {
            steps {
                script {
                    try { def builders = generateBuilders(); parallel builders
                    } catch (Exception e) { println "Going on" }
                }
            }
        }
        stage ('Run 3') {
            steps {
                script {
                    try { def builders = generateBuilders(); parallel builders
                    } catch (Exception e) { println "Going on" }
                }
            }
        }
        stage ('Run 4') {
            steps {
                script {
                    try { def builders = generateBuilders(); parallel builders
                    } catch (Exception e) { println "Going on" }
                }
            }
        }
        stage ('Run 5') {
            steps {
                script {
                    try { def builders = generateBuilders(); parallel builders
                    } catch (Exception e) { println "Going on" }
                }
            }
        }
    }

    options {
        timeout(time: 3, unit: 'HOURS')
        buildDiscarder(logRotator(numToKeepStr: '2', artifactNumToKeepStr: '2'))
        ansiColor('gnome-terminal')
        timestamps()
    }
}

def generateLotsOfLogs(String final_node_name) {
    sh "sudo find /"
}

def generateBuilders() {
    def builders = [:].asSynchronized()
    for (int i = 0; i < 50; i++) {
        def final_node_name = "node_${i}"
        builders[final_node_name] = {
            node('jenkins-test-tiny') {
                timeout(time: 5, unit: "MINUTES") {
                    echo "Triggering on " + final_node_name
                    def usage = 0;
                    for (int j = 0; j < 50; j++) {
                        sh """sudo find /tmp/ -type f -name "core"  -print """
                    }
                    generateLotsOfLogs(final_node_name)
                }
            }
        }
    }
    return builders
}

Here's a similar nodes request from BlueOcean:

time curl -sL  https://my-jenkins/blue/rest/organizations/jenkins/pipelines/EfiTest/runs/11/nodes/?limit=10000
real    0m0.032s
user    0m0.017s
sys 0m0.004s

The full log comes at about 7GB, naturally YMMV.

timja commented 3 months ago

I tried to reproduce yesterday, but the script produced too many logs and filled up my disk.

And when I managed to stop the job it wasn't able to build a tree anymore so returned instantly, it was definitely slower while it was running but ideally it would be stopped so I can profile it easier and see where the time is being taken.

I'll try again at some point its just a pain to setup with that reproducer.

eplodn commented 3 months ago

You can play with the amount of logs by replacing

 sh "sudo find /"

with

 sh "sudo find / | head -50000"

Of course you can vary as needed.

johannesacco commented 6 days ago

Hi, We have also noticed the slow performance which I think started after some update earlier this year. Now the Blue Ocean is much faster (it used to be the opposite). I have also noticed that the load on the server continues even if I close my browser (I have seen requests taking > 30 minutes) making the whole Jenkins server slow and almost unresponsive.

By taking a stack trace when this happens I noticed that it spends a lot of time for logging even if I haven't enabled debug logging. Example1:

at io.jenkins.plugins.pipelinegraphview.utils.FlowNodeWrapper.getNodeGraphviz(FlowNodeWrapper.java:432)
at io.jenkins.plugins.pipelinegraphview.treescanner.PipelineNodeGraphAdapter.dumpNodeGraphviz(PipelineNodeGraphAdapter.java:122)
at io.jenkins.plugins.pipelinegraphview.treescanner.PipelineNodeGraphAdapter.dumpNodeGraphviz(PipelineNodeGraphAdapter.java:126)
at io.jenkins.plugins.pipelinegraphview.treescanner.PipelineNodeGraphAdapter.remapStageParentage(PipelineNodeGraphAdapter.java:146)
at io.jenkins.plugins.pipelinegraphview.treescanner.PipelineNodeGraphAdapter.getPipelineNodes(PipelineNodeGraphAdapter.java:232)
at io.jenkins.plugins.pipelinegraphview.treescanner.PipelineNodeGraphAdapter.remapStepParentage(PipelineNodeGraphAdapter.java:209)
at io.jenkins.plugins.pipelinegraphview.treescanner.PipelineNodeGraphAdapter.getAllSteps(PipelineNodeGraphAdapter.java:239)
at io.jenkins.plugins.pipelinegraphview.utils.PipelineStepApi.getAllSteps(PipelineStepApi.java:85)

Example 2:

at io.jenkins.plugins.pipelinegraphview.utils.FlowNodeWrapper.getNodeGraphviz(FlowNodeWrapper.java:432)
at io.jenkins.plugins.pipelinegraphview.treescanner.PipelineNodeTreeScanner.build(PipelineNodeTreeScanner.java:84)
at io.jenkins.plugins.pipelinegraphview.treescanner.PipelineNodeTreeScanner.<init>(PipelineNodeTreeScanner.java:57)
at io.jenkins.plugins.pipelinegraphview.treescanner.PipelineNodeGraphAdapter.<init>(PipelineNodeGraphAdapter.java:34)
at io.jenkins.plugins.pipelinegraphview.utils.PipelineStepApi.getAllSteps(PipelineStepApi.java:100)
at io.jenkins.plugins.pipelinegraphview.consoleview.PipelineConsoleViewAction.getAllSteps(PipelineConsoleViewAction.java:101)
at io.jenkins.plugins.pipelinegraphview.consoleview.PipelineConsoleViewAction.getAllSteps(PipelineConsoleViewAction.java:96)
at java.lang.invoke.LambdaForm$DMH/0x00000008011d9040.invokeVirtual(java.base@11.0.24/LambdaForm$DMH)

In above examples the return string from FlowNodeWrapper.getNodeGraphviz(remappedNodes) is only used when debug logs are enabled, but the string is always created.

Loking around in the code I have noticed a couple of things that would probably speed things up a bit:

1) Use StringBuilder() in FlowNodeWrapper.getNodeGraphviz(remappedNodes) 2) Avoid calling FlowNodeWrapper.getNodeGraphviz(remappedNodes) (and similar functions) unless debug logging is activated. E.g. by replace the dump() function with if (isDebugEnabled) logger.debug(...) 3) Use log4j "{} {}" instead of String.format("%s %d") when logging

If you think above is a good idea I could create a couple of pull requests for this.

Let me know what you think! Thanks /Johannes