Plugin won't provision Azure VMs #354

Open SanderMachado opened 2 years ago

SanderMachado commented 2 years ago

Jenkins and plugins versions report

Environment ```text Jenkins: 2.319.1 OS: Linux - 4.4.0-210-generic --- PrioritySorter:4.0.1 TestComplete:2.8.1 ace-editor:1.1 ansicolor:1.0.1 ant:1.13 antisamy-markup-formatter:2.5 apache-httpcomponents-client-4-api:4.5.13-1.0 authentication-tokens:1.4 authorize-project:1.4.0 azure-credentials:216.ve0b_4a_485ffc2 azure-sdk:106.v552de1e64d56 azure-vm-agents:808.v9d1999587120 badge:1.9 blueocean:1.25.2 blueocean-autofavorite:1.2.4 blueocean-bitbucket-pipeline:1.25.2 blueocean-commons:1.25.2 blueocean-config:1.25.2 blueocean-core-js:1.25.2 blueocean-dashboard:1.25.2 blueocean-display-url:2.4.1 blueocean-events:1.25.2 blueocean-git-pipeline:1.25.2 blueocean-github-pipeline:1.25.2 blueocean-i18n:1.25.2 blueocean-jwt:1.25.2 blueocean-personalization:1.25.2 blueocean-pipeline-api-impl:1.25.2 blueocean-pipeline-editor:1.25.2 blueocean-pipeline-scm-api:1.25.2 blueocean-rest:1.25.2 blueocean-rest-impl:1.25.2 blueocean-web:1.25.2 bootstrap4-api:4.6.0-3 bootstrap5-api:5.1.3-3 bouncycastle-api:2.25 branch-api:2.7.0 build-timeout:1.20 build-user-vars-plugin:1.8 built-on-column:1.1 caffeine-api:2.9.2-29.v717aac953ff3 checks-api:1.7.2 cloud-stats:0.27 cloudbees-bitbucket-branch-source:734.v2f848c5e6ea2 cloudbees-folder:6.16 command-launcher:1.6 conditional-buildstep:1.4.1 copyartifact:1.46.2 credentials:2.6.2 credentials-binding:1.27.1 dark-theme:155.v497c78bbdbb3 display-url-api:2.3.5 docker-commons:1.17 docker-workflow:1.26 durable-task:493.v195aefbb0ff2 echarts-api:5.2.2-1 email-ext:2.86 envinject:2.4.0 envinject-api:1.8 extended-choice-parameter:0.82 extended-read-permission:3.2 external-monitor-job:1.7 favorite:2.3.3 flexible-publish:0.16.1 folder-properties:1.2.1 font-awesome-api:5.15.4-4 git:4.10.0 git-client:3.10.0 git-server:1.10 github:1.34.1 github-api:1.301-378.v9807bd746da5 github-branch-source:2.11.3 gitlab-plugin:1.5.24 gradle:1.37.1 groovy-label-assignment:1.2.0 groovy-postbuild:2.5 handlebars:3.0.8 handy-uri-templates-2-api:2.1.8-1.0 htmlpublisher:1.28 
http_request:1.12 jackson2-api:2.13.0-230.v59243c64b0a5 javadoc:1.6 jdk-tool:1.5 jenkins-design-language:1.25.2 jenkins-multijob-plugin:1.36 jira:3.6 jjwt-api:0.11.2-9.c8b45b8bb173 jobConfigHistory:2.28.1 jquery:1.12.4-1 jquery-detached:1.2.1 jquery3-api:3.6.0-2 jsch: junit:1.53 ldap:2.7 lockable-resources:2.12 log-parser:2.1 mailer:1.34 mapdb-api: material-theme:0.4.1 matrix-auth:2.6.11 matrix-combinations-parameter:1.3.1 matrix-groovy-execution-strategy:1.0.7 matrix-project:1.19 maven-plugin:3.15.1 mercurial:2.16 metrics: momentjs:1.1.1 monitoring:1.88.0 nodelabelparameter:1.10.1 okhttp-api:4.9.3-105.vb96869f8ac3a p4:1.11.6 pam-auth:1.6.1 parameterized-scheduler:1.0 parameterized-trigger:2.42 periodic-reincarnation:1.13 pipeline-build-step:2.15 pipeline-github-lib:1.0 pipeline-graph-analysis:1.12 pipeline-input-step:427.va6441fa17010 pipeline-milestone-step:1.3.2 pipeline-model-api:1.9.3 pipeline-model-definition:1.9.3 pipeline-model-extensions:1.9.3 pipeline-rest-api:2.19 pipeline-stage-step:2.5 pipeline-stage-tags-metadata:1.9.3 pipeline-stage-view:2.19 pipeline-utility-steps:2.11.0 plain-credentials:1.8 plot:2.1.9 plugin-util-api:2.6.0 popper-api:1.16.1-2 popper2-api:2.10.2-1 postbuildscript:0.17 powershell:1.7 preSCMbuildstep:0.3 pubsub-light:1.16 pvs-studio:7.15 resource-disposer:0.16 role-strategy:3.2.0 run-condition:1.5 saml:2.0.9 scm-api:2.6.5 scoring-load-balancer:1.0.1 script-security:1.78 simple-theme-plugin:0.7 slack:2.49 snakeyaml-api:1.29.1 solarized-theme:0.1 sse-gateway:1.24 ssh-credentials:1.19 ssh-slaves:1.29.4 sshd:3.1.0 structs:308.v852b473a2b8c subversion:2.15.1 support-core:2.79 theme-manager:0.6 timestamper:1.15 token-macro:267.vcdaea6462991 trilead-api:1.0.13 variant:1.4 versioncolumn:2.2 windows-slaves:1.8 workflow-aggregator:2.6 workflow-api:1105.v3de5e2efac97 workflow-basic-steps:2.24 workflow-cps:2640.v00e79c8113de workflow-cps-global-lib:552.vd9cc05b8a2e1 workflow-durable-task-step:1102.v9c8d2f466adb workflow-job:2.42 
workflow-multibranch:2.26 workflow-scm-step:2.13 workflow-step-api:613.v375732a042b1 workflow-support:3.8 ws-cleanup:0.39 ```

What Operating System are you using (both controller, and any agents involved in the problem)?

The Jenkins controller runs on Ubuntu and is NOT hosted on Azure. It is trying to provision Windows 10 VMs on Azure.

Reproduction steps

  1. Create an Azure service principal in the Azure Portal.
  2. Add a new Microsoft Azure Service Principal credential to Jenkins.
  3. Add an Azure Profile Configuration and verify it.
  4. In the General Configuration section, set all the values. I get an error for the region (I managed to set the region using a Groovy script).
  5. Set the label to 'azure'.
  6. Try to run a pipeline like this:
     ```groovy
     pipeline {
         agent any
         stages {
             stage('Hello') {
                 // Request an agent carrying the 'azure' label for this stage
                 agent {
                     node {
                         label 'azure'
                     }
                 }
                 steps {
                     echo 'Hello World'
                 }
             }
         }
     }
     ```

Expected Results

The plugin will create a new VM and run the job on it

Actual Results

The pipeline job gets stuck at the `[Pipeline] node` step with the message: "Still waiting to schedule task. There are no nodes with the label ‘[azure](http://jenkins/label/azure/)’".

Anything else?

In the plugin logs I can only see:

```text
Started Azure VM Agents Clean Task
May 11, 2022 3:24:58 PM FINE execute
May 11, 2022 3:24:58 PM FINE execute
Running clean with 15 minute timeout
May 11, 2022 3:24:58 PM FINE cleanVMs
May 11, 2022 3:24:58 PM FINE cleanVMs
May 11, 2022 3:24:58 PM FINE cleanDeployments
Cleaning deployments
May 11, 2022 3:24:58 PM FINE cleanDeployments
Done cleaning deployments
May 11, 2022 3:24:58 PM FINE cleanLeakedResources
May 11, 2022 3:24:59 PM FINE cleanLeakedResources
cleanLeakedResources: beginning to look at leaked resources in rg: RG-WEU-DEVOPS-PRD
May 11, 2022 3:24:59 PM FINE cleanLeakedResources
cleanLeakedResources: %d resources marked for deletion0
May 11, 2022 3:24:59 PM FINE cleanLeakedResources
May 11, 2022 3:24:59 PM FINE cleanCloudStatistics
May 11, 2022 3:24:59 PM FINE cleanCloudStatistics
May 11, 2022 3:24:59 PM FINE execute
May 11, 2022 3:24:59 PM FINE hudson.model.AsyncPeriodicWork lambda$doRun$1
Finished Azure VM Agents Clean Task. 799 ms
```

```text
May 11, 2022 3:26:19 PM FINE getAzureAgentTemplate
AzureVMCloud: getAzureAgentTemplate: Found agent template ubuntu
May 11, 2022 3:26:19 PM FINE getAzureAgentTemplate
AzureVMCloud: getAzureAgentTemplate: ubuntu matches!
May 11, 2022 3:26:19 PM FINE getAzureAgentTemplate
AzureVMCloud: getAzureAgentTemplate: Retrieving agent template with label ubuntu
May 11, 2022 3:26:19 PM FINE getAzureAgentTemplate
AzureVMCloud: getAzureAgentTemplate: Found agent template ubuntu
```

(This log is from when I tried an `ubuntu` label.)

timja commented 2 years ago

Any chance you're behind a proxy?

If you can't load the region list, setting it via a Groovy script won't help. A region list that fails to load means Jenkins can't contact the Azure API.

I would expect to see errors in the system log, though.
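
One rough way to narrow this down is to check, from the controller host, whether the Azure endpoints are reachable at all. The following is only a sketch under assumptions: the two host names are the usual public-cloud endpoints (sovereign clouds differ), `can_connect` and `report` are hypothetical helper names, and the check runs outside Jenkins, so it does not reflect any proxy configured inside the Jenkins JVM. A direct TCP connection also bypasses HTTP proxies, so a failure here on a proxied network is expected and still informative.

```python
import socket
import urllib.request

# Assumed endpoints for the Azure public cloud (not taken from the plugin).
AZURE_ENDPOINTS = [
    ("login.microsoftonline.com", 443),  # token endpoint
    ("management.azure.com", 443),       # Azure Resource Manager API
]


def can_connect(host, port, timeout=5.0):
    """Return (True, "") if a TCP connection succeeds, else (False, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as exc:  # covers DNS failures, refusals and timeouts
        return False, f"{type(exc).__name__}: {exc}"


def report(endpoints=AZURE_ENDPOINTS):
    """Build a list of human-readable result lines, one per endpoint."""
    proxies = urllib.request.getproxies()  # proxy env vars seen by this process
    lines = [f"proxy settings seen by this process: {proxies or 'none'}"]
    for host, port in endpoints:
        ok, reason = can_connect(host, port)
        status = "reachable" if ok else f"FAILED ({reason})"
        lines.append(f"{host}:{port} -> {status}")
    return lines
```

Running `print("\n".join(report()))` as the same user that runs Jenkins shows whether the firewall lets plain TCP through on port 443. It says nothing about TLS interception or request-level filtering.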

SanderMachado commented 2 years ago

It is behind a firewall, but I'm not seeing any denied requests. One time I got

May 11, 2022 4:19:58 PM INFO performLogging
Azure Identity => getToken() result for scopes []: SUCCESS
May 11, 2022 4:19:58 PM INFO info
Acquired a new access token at 299 seconds before expiry. Retry may be attempted after 30 seconds. The token currently cached will be used.

in the logs, and then it correctly loaded the VM sizes but not the regions. When I look at it now, it's back to the fallback list.

I made sure to check the logs but couldn't find any failed requests.

timja commented 2 years ago

Strange =/

SanderMachado commented 2 years ago

I managed to find an error in the access log:

`- - [11/May/2022:17:39:03 +0200] "POST /descriptorByName/ HTTP/1.1" 499 0 "https://jenkins/configureClouds/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.64 Safari/537.36"`

HTTP status 499 is a non-standard code (used by nginx, among others) meaning the client closed the connection before the server finished processing the request. I don't know if this helps in finding the issue.
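
To see how often this happens, the access log can be scanned for other 499 entries. This is a minimal sketch, assuming the log uses the common "combined" format shown above. The regular expression and the `client_aborts` helper are hypothetical names for illustration, not part of Jenkins or the plugin.

```python
import re

# Matches the quoted request, status code and byte count of a
# combined-format access log line (assumed format).
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def client_aborts(lines):
    """Return (method, path) for every request logged with status 499."""
    hits = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match.group("status") == "499":
            hits.append((match.group("method"), match.group("path")))
    return hits
```

If many of the hits point at `/descriptorByName/` requests made from the cloud configuration page, that would suggest the form validation calls take long enough for the browser or a proxy to give up on them.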