jenkinsci / azure-vm-agents-plugin

This repo is for the Azure VM agents plugin for Jenkins. The Azure DevOps CI/CD team owns it for now.
https://plugins.jenkins.io/azure-vm-agents/

Azure - offline suspended - thread blocks #376

Closed: ROSMCCANN closed this issue 1 year ago

ROSMCCANN commented 1 year ago

Jenkins and plugins versions report

Environment

```text
Jenkins: 2.303.3.3
OS: Linux - 3.10.0-1160.76.1.el7.x86_64
---
ace-editor:1.
analysis-model-api:10.3.1
android-signing:2.2.5
ansicolor:1.0.0
ant:1.12
antisamy-markup-formatter:2.1
apache-httpcomponents-client-4-api:4.5.13-1.0
artifact-diff-plugin:1.3
artifactdeployer:0.33
artifactory:3.10.5
async-http-client:1.7.24.3
audit-trail:3.10
authentication-tokens:1.4
aws-credentials:1.32
aws-device-farm:1.30
aws-java-sdk:1.11.995
aws-java-sdk-ec2:1.12.70
aws-java-sdk-elasticbeanstalk:1.12.70
aws-java-sdk-minimal:1.12.70
aws-lambda:0.5.10
awseb-deployment-plugin:0.3.21
azure-commons:1.1.3
azure-credentials:198.vf9c2fdfde55c
azure-sdk:106.v552de1e64d56
azure-vm-agents:789.va0c40e4d0070
badge:1.9
blackduck-detect:3.1.0
blackduck-hub:4.0.1
blueocean:1.25.0
blueocean-autofavorite:1.2.4
blueocean-bitbucket-pipeline:1.25.0
blueocean-commons:1.25.0
blueocean-config:1.25.0
blueocean-core-js:1.25.0
blueocean-dashboard:1.25.0
blueocean-display-url:2.4.1
blueocean-events:1.25.0
blueocean-git-pipeline:1.25.0
blueocean-github-pipeline:1.25.0
blueocean-i18n:1.25.0
blueocean-jira:1.25.0
blueocean-jwt:1.25.0
blueocean-personalization:1.25.0
blueocean-pipeline-api-impl:1.25.0
blueocean-pipeline-editor:1.25.0
blueocean-pipeline-scm-api:1.25.0
blueocean-rest:1.25.0
blueocean-rest-impl:1.25.0
blueocean-web:1.25.0
bootstrap4-api:4.6.0-3
bootstrap5-api:5.1.1-1
bouncycastle-api:2.25
branch-api:2.7.0
build-cause-run-condition:0.1
build-discarder:60.v1747b0eb632a
build-failure-analyzer:2.1.0
build-monitor-plugin:1.13+build.202110011223
build-name-setter:2.2.0
build-pipeline-plugin:1.5.8
build-timeout:1.20
build-user-vars-plugin:1.8
build-view-column:0.3
build-with-parameters:1.6
built-on-column:1.1
caffeine-api:2.9.2-29.v717aac953ff3
changes-since-last-success:0.6
checks-api:1.7.2
cloud-stats:0.27
cloudbees-aborted-builds:1.15
cloudbees-administrative-monitors:1.0.2
cloudbees-analytics:1.34
cloudbees-assurance:2.276.0.6
cloudbees-aws-cli:1.5.16
cloudbees-aws-credentials:1.8.5
cloudbees-aws-deployer:1.19
cloudbees-bitbucket-branch-source:2.9.11
cloudbees-blueocean-default-theme:0.8
cloudbees-consolidated-build-view:1.6.1
cloudbees-enterprise-plugins:15.05.1
cloudbees-even-scheduler:3.11
cloudbees-folder:6.16
cloudbees-folders-plus:3.22
cloudbees-github-pull-requests:1.1
cloudbees-groovy-view:1.13
cloudbees-ha:4.34
cloudbees-jenkins-advisor:3.3.2
cloudbees-jsync-archiver:5.19
cloudbees-label-throttling-plugin:3.8
cloudbees-license:9.60
cloudbees-long-running-build:1.17
cloudbees-monitoring:2.12
cloudbees-nodes-plus:1.22
cloudbees-platform-common:1.12
cloudbees-plugin-usage:2.11
cloudbees-quiet-start:1.7
cloudbees-request-filter:1.7
cloudbees-ssh-slaves:2.11
cloudbees-support:3.28
cloudbees-template:4.50
cloudbees-uc-data-api:4.44
cloudbees-unified-ui:1.13
cloudbees-view-creation-filter:1.7
cloudbees-wasted-minutes-tracker:3.8
cloudbees-workflow-aggregator:1.9.1
cloudbees-workflow-template:3.12
cloudbees-workflow-ui:2.6
cobertura:1.16
code-coverage-api:1.4.1
command-launcher:1.6
compress-artifacts:1.10
conditional-buildstep:1.4.1
config-file-provider:3.8.1
configurationslicing:1.52
confluence-publisher:2.0.6
copyartifact:1.46.2
credentials:2.6.1
credentials-binding:1.27
cucumber:0.0.2
cucumber-reports:5.6.1
cvs:2.19
dashboard-view:2.17
data-tables-api:1.11.3-1
datadog:4.0.0
delivery-pipeline-plugin:1.0.7
delphix:2.0.4
deploy:1.16
deployed-on-column:1.8
deployer-framework:1.3
description-setter:1.10
discard-old-build:1.05
display-url-api:2.3.5
docker-build-publish:1.3.3
docker-commons:1.17
docker-java-api:3.1.5.2
docker-plugin:1.2.3
docker-traceability:1.2
docker-workflow:1.26
dockerhub-notification:2.5.3
durable-task:1.39
ec2:1.65
echarts-api:5.2.1-2
email-ext:2.83
embeddable-build-status:2.0.3
emma:1.31
envinject:2.4.0
envinject-api:1.8
eros-registration-plugin:1.0.12
extended-choice-parameter:0.82
extended-read-permission:3.2
extended-security-settings:1.3
external-monitor-job:1.7
favorite:2.3.3
file-leak-detector:1.6
flyway-runner:1.9
font-awesome-api:5.15.4-1
forensics-api:1.3.1
fortify-on-demand-uploader:6.1.0
free-license:7.0
fxcop-runner:1.1
generic-webhook-trigger:1.77
ghprb:1.42.2
git:4.9.0
git-changelog:3.13
git-client:3.10.0
git-parameter:0.9.13
git-server:1.10
git-validated-merge:3.31
github:1.34.1
github-api:1.133
github-branch-source:2.11.3
github-organization-folder:1.6
github-pull-request-build:1.14
global-build-stats:1.5
global-variable-string-parameter:1.2
google-oauth-plugin:1.0.6
google-play-android-publisher:4.1
gradle:1.37.1
groovy:2.4
groovy-postbuild:2.5
handlebars:3.0.8
handy-uri-templates-2-api:2.1.8-1.0
hockeyapp:1.2.2
hp-application-automation-tools-plugin:7.2
htmlpublisher:1.25
http_request:1.11
hudsontrayapp:0.7.3
icon-shim:3.0.0
infradna-backup:3.38.41
ivy:2.1
jackson2-api:2.12.4
jacoco:3.3.0
javadoc:1.6
jaxb:2.3.0.1
jdk-tool:1.5
jenkins-design-language:1.25.0
jenkins-multijob-plugin:1.36
jira:3.6
jira-ext:0.9
jjwt-api:0.11.2-5.143e44951c52
jnr-posix-api:3.1.7-3
job-dsl:1.78.1
job-restrictions:0.8
jobConfigHistory:2.28.1
join:1.22-SNAPSHOT (private-682cfed6-mwaite)
jquery:1.12.4-1
jquery-detached:1.2.1
jquery3-api:3.6.0-2
jsch:0.1.55.2
junit:1.53
kpp-management-plugin:1.0.0
ldap:2.7
m2release:0.16.2
mail-watcher-plugin:1.16
mailer:1.34
mapdb-api:1.0.9.0
mask-passwords:3.0
matrix-auth:2.6.8
matrix-project:1.19
maven-plugin:3.15
mercurial:2.15
metrics:4.0.2.8
momentjs:1.1.1
monitoring:1.87.0
msbuild:1.30
mstest:1.0.0
mstestrunner:1.3
nectar-license:8.35
nectar-rbac:5.62
nectar-vmware:4.3.9
node-iterator-api:1.5.1
nodejs:1.4.0
nodelabelparameter:1.9.2
notification:1.15
nuget:1.1
oauth-credentials:0.4
okhttp-api:3.14.9
operations-center-agent:2.303.0.2
operations-center-analytics-config:2.222.0.1
operations-center-analytics-reporter:2.222.0.1
operations-center-client:2.303.0.1
operations-center-cloud:2.303.0.1
operations-center-context:2.303.0.9
pam-auth:1.6
parameterized-scheduler:1.0
parameterized-trigger:2.39
perfectomobile:2.41.0.1
performance:3.20
persistent-parameter:1.3
pipeline-aws:1.43
pipeline-build-step:2.15
pipeline-github-lib:1.0
pipeline-graph-analysis:1.11
pipeline-input-step:2.12
pipeline-milestone-step:1.3.2
pipeline-model-api:1.9.2
pipeline-model-declarative-agent:1.1.1
pipeline-model-definition:1.9.2
pipeline-model-extensions:1.9.2
pipeline-rest-api:2.19
pipeline-stage-step:2.5
pipeline-stage-tags-metadata:1.9.2
pipeline-stage-view:2.19
pipeline-utility-steps:2.10.0
plain-credentials:1.7
plugin-util-api:2.5.0
popper-api:1.16.1-2
popper2-api:2.10.1-1
powershell:1.7
promoted-builds:3.10
publish-over:0.22
publish-over-ftp:1.16
publish-over-ssh:1.22
pubsub-light:1.16
pyenv-pipeline:2.1.2
rebuild:1.32
regexemail:0.3
resource-disposer:0.16
rich-text-publisher-plugin:1.4
robot:3.0.1
ruby-runtime:0.13
run-condition:1.5
s3:0.11.10
saml:2.0.9
sauce-ondemand:1.197
sbt:1.5
scm-api:2.6.5
scoverage:1.4.0
script-security:1.78
scriptler:3.3
selenium:3.141.59
shared-workspace:1.0.2
shelve-project-plugin:3.2
skip-certificate-check:1.0
skip-plugin:4.10
snakeyaml-api:1.29.1
sonar:2.13.1
splunk-devops:1.9.9
splunk-devops-extend:1.9.7
sse-gateway:1.24
ssh-agent:1.23
ssh-credentials:1.19
ssh-slaves:1.33.0
ssh-steps:2.0.0
ssh2easy:1.4
sshd:3.1.0
structs:1.23
subversion:2.15.1
support-core:2.76
suppress-stack-trace:1.6
teamconcert:2.4.0
test-results-analyzer:0.3.5
testng-plugin:1.15
text-file-operations:1.3.2
text-finder-run-condition:0.1
thinBackup:1.10
throttle-concurrents:2.4
timestamper:1.13
token-macro:266.v44a80cf277fd
translation:1.15
trilead-api:1.0.13
unique-id:2.2.0
uno-choice:2.5.6
user-activity-monitoring:1.4
variant:1.4
veracode-jenkins-plugin:17.6.4.1
versionnumber:1.9
view-job-filters:2.3
vstestrunner:1.0.8
warnings-ng:9.5.0
wikitext:3.14
windows-slaves:1.8
wix:1.12
workflow-aggregator:2.5
workflow-api:2.47
workflow-basic-steps:2.24
workflow-cps:2.94
workflow-cps-checkpoint:2.10
workflow-cps-global-lib:2.21
workflow-durable-task-step:2.40
workflow-job:2.41
workflow-multibranch:2.26
workflow-scm-step:2.13
workflow-step-api:2.24
workflow-support:3.8
ws-cleanup:0.39
xvfb:1.2
zentimestamp:4.2
zephyr-enterprise-test-management:2.2
```

What Operating System are you using (both controller, and any agents involved in the problem)?

Red Hat 7 is used for the Jenkins controller.

Reproduction steps

Jenkins jobs are set on a timer to spin up Azure workers and execute builds.

Expected Results

Jenkins spins up Azure workers and executes builds successfully.

Actual Results

Sporadically, we see Azure workers go into an offline suspended status during deprovisioning.

No further Azure workers can be spun up, because Jenkins still sees those suspended workers in the environment.

The cleanup task then starts hitting its timeout, and we sometimes see CPU/load hit the maximum on the environment.

The only recovery I've found is a restart of the Jenkins instance.

Anything else?

If we try to manually delete the workers, it causes further thread blocks, which start affecting other plugins such as the EC2 plugin.
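For anyone trying to narrow down thread blocks like these, a thread dump taken while the workers are stuck usually shows which threads are blocked. The following is only a minimal sketch, assuming the dump is plain text in the usual HotSpot layout (a quoted thread name on a header line, followed by a `java.lang.Thread.State:` line), for example as produced by `jstack`. The function name `summarize_thread_dump` is hypothetical and not part of Jenkins or the plugin.

```python
import re
from collections import Counter

# Header line of a thread entry in a HotSpot-style dump, e.g.
# "some-thread-name" #123 daemon prio=5 os_prio=0 tid=... nid=... runnable
THREAD_HEADER = re.compile(r'^"(?P<name>.*)"\s')
# State line that follows the header, e.g.
#    java.lang.Thread.State: BLOCKED (on object monitor)
THREAD_STATE = re.compile(r'^\s+java\.lang\.Thread\.State:\s+(?P<state>[A-Z_]+)')


def summarize_thread_dump(dump_text):
    """Return (Counter of thread states, list of BLOCKED thread names).

    Hypothetical helper: it only understands the header/state line layout
    described above and ignores stack frames and lock details.
    """
    states = Counter()
    blocked = []
    current = None
    for line in dump_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = header.group("name")
            continue
        state = THREAD_STATE.match(line)
        if state and current is not None:
            states[state.group("state")] += 1
            if state.group("state") == "BLOCKED":
                blocked.append(current)
            # Only the first state line after a header belongs to that thread.
            current = None
    return states, blocked
```

Calling `summarize_thread_dump(open("dump.txt").read())` on a saved dump (the file name is an assumption) gives a quick count per state and the names of the blocked threads, which is easier to attach to a report than the full dump.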

timja commented 1 year ago

If this is still an issue, please provide your plugin configuration, ideally by exporting it with the Configuration as Code plugin.

I can't reproduce this in the current state, but if the plugin configuration is provided I will re-open.
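For readers who need to produce such an export without clicking through the UI, here is a rough sketch of one way to request it over HTTP. It rests on assumptions that should be checked against the installed plugin version: that the Configuration as Code plugin serves its YAML export on a POST to `<jenkins_url>/configuration-as-code/export`, and that basic authentication with a user API token is accepted. The function names, the example URL, and the credentials are hypothetical.

```python
import base64
import urllib.request


def build_casc_export_request(jenkins_url, user, api_token):
    """Build (but do not send) a POST request for the JCasC YAML export.

    Assumption: the export lives at <jenkins_url>/configuration-as-code/export.
    """
    url = jenkins_url.rstrip("/") + "/configuration-as-code/export"
    credentials = f"{user}:{api_token}".encode("utf-8")
    auth = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the request only needs to be a POST
        method="POST",
        headers={"Authorization": "Basic " + auth},
    )


def fetch_casc_export(jenkins_url, user, api_token, timeout=30):
    """Send the request and return the exported YAML as text."""
    request = build_casc_export_request(jenkins_url, user, api_token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

A call such as `fetch_casc_export("https://jenkins.example.com", "alice", "<api token>")` would return the whole controller configuration, so it is worth trimming the output to the Azure VM agents cloud section and removing any secrets before posting it to an issue.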