Open michalvavrik opened 1 year ago
Thanks for the report! I am not sure if this is a problem of openstack-cloud
plugin in failing to bring the node online, or cloud-stats-plugin
in failing to represent node status correctly.
@olivergondza thanks for getting back to me so quickly
Thanks for the report! I am not sure if this is a problem of
openstack-cloud
plugin in failing to bring the node online, orcloud-stats-plugin
in failing to represent node status correctly.
- Does the node report to be online in Jenkins, or not? Is the node accepting builds?
No, node is N/A (highlighted in red) and a job run is cancelled after timeout, but node is never terminated. I have a reason why I believe it's an openstack-cloud
issue - if you inspect logging above, you can see there is never Agent successfully connected and online
among other logging events.
Also I'd like to stress out this happen quite "randomly" (meaning I can't related it to the cause) as sometimes there is 20 nodes provisioned correctly in row, sometimes it's couple of this in row. On average, the "launch" phase is done within 14 seconds.
- Do you have enough samples to be confident this appears on Windows machines only?
I can't disclose details, but we have quite a volume of linux-based runs and I can't see it there.
- Can you extract Jenkins service logs and grep it for the node name?
I'll try that next time I see the problem and get back to you here. In short - ATM no, but I'll be watching it.
@olivergondza, me and @pematous have encountered this issue recently. We decided to workaround it by not running Jenkins Agent on the Windows machines. Instead, we spawn Windows machines using openstackMachine step, which works reliably, and then connect to the machine by its IP using SSH/Ansible, as needed.
It is less convenient than having the Jenkins agent available, but it works!
Jenkins and plugins versions report
Environment
```text Jenkins: 2.361.2 OS: Linux - 4.18.0-305.65.1.el8_4.x86_64 --- Parameterized-Remote-Trigger:3.1.6.3 ace-editor:1.1 ansicolor:1.0.2 antisamy-markup-formatter:2.7 apache-httpcomponents-client-4-api:4.5.13-138.v4e7d9a_7b_a_e61 authentication-tokens:1.4 authorize-project:1.4.0 aws-credentials:191.vcb_f183ce58b_9 aws-java-sdk-ec2:1.12.287-357.vf82d85a_6eefd aws-java-sdk-minimal:1.12.287-357.vf82d85a_6eefd badge:1.9.1 bootstrap4-api:4.6.0-5 bootstrap5-api:5.2.1-2 bouncycastle-api:2.26 branch-api:2.1046.v0ca_37783ecc5 build-timeout:1.24 caffeine-api:2.9.3-65.v6a_47d0f4d1fe checks-api:1.7.5 cloud-stats:0.27 cloudbees-disk-usage-simple:170.va_fd5b_4ee6858 cloudbees-folder:6.758.vfd75d09eea_a_1 command-launcher:90.v669d7ccb_7c31 conditional-buildstep:1.4.2 config-file-provider:3.11.1 configuration-as-code:1512.vb_79d418d5fc8 credentials:1143.vb_e8b_b_ceee347 credentials-binding:523.vd859a_4b_122e6 description-setter:1.10 disk-usage:0.28 display-url-api:2.3.6 dumpling:2.2 durable-task:500.v8927d9fd99d8 ec2:2.0.2 echarts-api:5.3.3-1 email-ext:2.91 envinject:2.875.v9b_9e962da_a_ec envinject-api:1.199.v3ce31253ed13 extended-read-permission:3.2 favorite:2.4.1 flexible-publish:0.16.1 font-awesome-api:6.2.0-2 ghprb:1.42.2 git:4.12.1 git-client:3.12.0 git-server:99.va_0826a_b_cdfa_d github:1.35.0 github-api:1.303-400.v35c2d8258028 gitlab-plugin:1.5.35 greenballs:1.15.1 groovy:442.v817e6d937d6c groovy-postbuild:2.5 h2-api:1.4.199 handlebars:3.0.8 htmlpublisher:1.31 instance-identity:116.vf8f487400980 ionicons-api:24.va_745a_b_a_97fc3 jackson2-api:2.13.3-285.vc03c0256d517 jakarta-activation-api:2.0.1-1 jakarta-mail-api:2.0.1-1 javadoc:226.v71211feb_e7e9 javax-activation-api:1.2.0-4 javax-mail-api:1.6.2-5 jaxb:2.3.6-1 jdk-tool:55.v1b_32b_6ca_f9ca jersey2-api:2.37-1 jms-messaging:1.1.27 job-dsl:1.81 jobConfigHistory:1176.v1b_4290db_41a_5 jquery3-api:3.6.1-1 jsch:0.1.55.61.va_e9ee26616e7 junit:1119.1121.vc43d0fc45561 kerberos-sso:1.10 kubernetes:3706.vdfb_d599579f3 kubernetes-client-api:5.12.2-193.v26a_6078f65a_9 kubernetes-credentials:0.9.0 ldap:2.12 lockable-resources:2.16 mailer:438.v02c7f0a_12fa_4 matrix-auth:3.1.5 matrix-project:785.v06b_7f47b_c631 matrix-reloaded:1.1.3 metrics:4.2.10-389.v93143621b_050 mina-sshd-api-common:2.9.1-44.v476733c11f82 mina-sshd-api-core:2.9.1-44.v476733c11f82 momentjs:1.1.1 monitoring:1.91.0 naginator:1.18.1 node-iterator-api:49.v58a_8b_35f8363 okhttp-api:4.9.3-108.v0feda04578cf openstack-cloud:2.63 parameterized-scheduler:1.1 parameterized-trigger:2.45 pipeline-build-step:2.18 pipeline-github-lib:38.v445716ea_edda_ pipeline-graph-analysis:195.v5812d95a_a_2f9 pipeline-groovy-lib:612.v84da_9c54906d pipeline-input-step:451.vf1a_a_4f405289 pipeline-maven:1205.vceea_7b_972817 pipeline-milestone-step:101.vd572fef9d926 pipeline-model-api:2.2114.v2654ca_721309 pipeline-model-definition:2.2114.v2654ca_721309 pipeline-model-extensions:2.2114.v2654ca_721309 pipeline-rest-api:2.24 pipeline-stage-step:296.v5f6908f017a_5 pipeline-stage-tags-metadata:2.2114.v2654ca_721309 pipeline-stage-view:2.24 pipeline-utility-steps:2.13.0 plain-credentials:139.ved2b_9cf7587b plugin-util-api:2.17.0 popper-api:1.16.1-3 popper2-api:2.11.6-1 postbuildscript:3.2.0-460.va_fda_0fa_26720 proc-cleaner-plugin:2.4 prometheus:2.0.11 rebuild:1.34 redhat-ci-plugin:1.5.16 resource-disposer:0.20 role-strategy:561.v9846c7351a_41 run-condition:1.5 scm-api:621.vda_a_b_055e58f7 script-security:1183.v774b_0b_0a_a_451 short-workspace-path:0.3 snakeyaml-api:1.31-84.ve43da_fb_49d0b ssh-credentials:305.v8f4381501156 ssh-slaves:2.846.v1b_70190624f5 sshd:3.249.v2dc2ea_416e33 structs:324.va_f5d6774f3a_d swarm:3.34 test-stability:2.3 testng-plugin:555.va0d5f66521e3 text-finder:1.20 timestamper:1.20 token-macro:308.v4f2b_ed62b_b_16 trilead-api:2.72.v2a_3236754f73 update-sites-manager:2.0.0 variant:59.vf075fe829ccb windows-slaves:1.8.1 workflow-aggregator:590.v6a_d052e5a_a_b_5 workflow-api:1200.v8005c684b_a_c6 workflow-basic-steps:994.vd57e3ca_46d24 workflow-cps:2759.v87459c4eea_ca_ workflow-cps-global-lib:588.v576c103a_ff86 workflow-durable-task-step:1199.v02b_9244f8064 workflow-job:1236.vc3a_d1602f439 workflow-multibranch:716.vc692a_e52371b_ workflow-scm-step:400.v6b_89a_1317c9a_ workflow-step-api:639.v6eca_cd8c04a_a_ workflow-support:838.va_3a_087b_4055b ws-cleanup:0.43 zmq-event-publisher:0.0.5 ```What Operating System are you using (both controller, and any agents involved in the problem)?
agent - Windows Server 2019 (CYGWIN_NT-10.0-17763)
Reproduction steps
Provision Windows nodes again and again until you hit the problem.
Expected Results
One of:
SSHLauncher
should automatically re-launchJCloudsLauncher
on Windows (re-try mechanism)Actual Results
Sometimes ("randomly") provisioning fails and node is stuck in state "launch", in cloud stats I can see 13 hr and counting (then I terminated node manually), however there is no issue with the agent. I can access node from my computer with
ssh
and there are no logs in/remoting
(the directory does not exist, but remoting.jar is there). Using "relaunch agent" button does not help.Here is edited log:
Anything else?
No response