yandex-qatools / teamcity-openstack-plugin

Teamcity plugin to add openstack integration

Agents are removed from pool after TeamCity server start #65

Open KabarukhinAlexey opened 3 years ago

KabarukhinAlexey commented 3 years ago

Hello. Discovered that after upgrading teamcity-openstack-plugin from v1.3 to v1.5 or to 1.6-SNAPSHOT agents are removed immediately after the start. Tested on TeamCity versions 2020.2 and 2020.2.4. Logs from teamcity-server:

[2021-06-17 19:48:14,691]   INFO [nio-8111-exec-2] - .instances.StartInstanceAction - Starting cloud instance: profile 'RHEL7'{id=NOVA-1, projectId=_Root}, jetbrains.buildServer.clouds.openstack.OpenstackCloudImage@446662af, hash=Ql9OJMS1N0jbVqjSkKCJ3ylx0QAhBneC, reason=User Administrator started agent from web UI
[2021-06-17 19:48:14,691]   INFO [nio-8111-exec-2] -  jetbrains.buildServer.clouds. - Starting cloud openstack instance RHEL7-1623948494691
[2021-06-17 19:48:14,691]   INFO [nio-8111-exec-2] - .server.impl.CloudEventsLogger - Cloud instance entered 'scheduled to start' state, profile 'RHEL7'{id=NOVA-1, projectId=_Root}, jetbrains.buildServer.clouds.openstack.OpenstackCloudInstance@532dd36f
[2021-06-17 19:48:14,692]   INFO [nio-8111-exec-2] - .server.impl.CloudEventsLogger - Cloud instance start succeeded: profile 'RHEL7'{id=NOVA-1, projectId=_Root}, jetbrains.buildServer.clouds.openstack.OpenstackCloudInstance@532dd36f
[2021-06-17 19:48:16,060]   INFO [uled executor 3] - .server.impl.CloudEventsLogger - Cloud instance entered 'starting' state, profile 'RHEL7'{id=NOVA-1, projectId=_Root}, jetbrains.buildServer.clouds.openstack.OpenstackCloudInstance@532dd36f
[2021-06-17 19:48:21,945]   INFO [enstack-RHEL7 1] -  jetbrains.buildServer.clouds. - Terminating cloud openstack instance RHEL7-1623948494691
[2021-06-17 19:48:22,281]  ERROR [enstack-RHEL7 1] -  jetbrains.buildServer.clouds. - Status cannot be found for instance (so terminated): RHEL7-1623948494691
jetbrains.buildServer.clouds.openstack.OpenstackException: Status cannot be found for instance (so terminated): RHEL7-1623948494691
        at jetbrains.buildServer.clouds.openstack.OpenstackCloudInstance.updateStatus(OpenstackCloudInstance.java:85)
        at jetbrains.buildServer.clouds.openstack.OpenstackCloudImage.lambda$new$1(OpenstackCloudImage.java:105)
        at com.jcabi.log.VerboseRunnable.run(VerboseRunnable.java:190)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
[2021-06-17 19:48:26,061]   INFO [uled executor 3] - .server.impl.CloudEventsLogger - Cloud instance has gone (is not reported by cloud profile): profileId=NOVA-1, imageId=RHEL7, instanceId=1623948494691

axel3rd commented 3 years ago

This problem is probably due to #53, introduced in v1.4, i.e. a single request for the global status replaces the individual requests for each VM. It seems to be the same problem as #62: if the status API cannot be reached, the VM is removed. You can test the cloud-openstack.zip, which introduces the "unknow" status ... but it will probably not fix the problem completely.

Are you sure that the account used in the cloud profile has permission to use the servers/detail API?
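
To illustrate the difference between "the status request failed" and "the VM is really gone", here is a minimal sketch in Java. It is not the plugin's code: the class `BulkStatusSketch`, the `Status` enum and the `resolve` method are hypothetical names, and the sketch assumes that the single global request yields a map from instance name to the raw OpenStack server status (or `null` when the request itself failed).

```java
import java.util.Map;

/** Hypothetical sketch; none of these names come from the plugin. */
final class BulkStatusSketch {

    enum Status { STARTING, RUNNING, STOPPED, UNKNOWN }

    /**
     * Resolves the status of one instance from the result of a single
     * "all servers" request.
     *
     * @param bulkStatuses instance name to raw server status, or null if
     *                     the request failed (assumption of this sketch)
     * @param instanceName name of the instance to look up
     */
    static Status resolve(Map<String, String> bulkStatuses, String instanceName) {
        if (bulkStatuses == null) {
            // The status API could not be reached: we know nothing about
            // the VM, so do not conclude that it was terminated.
            return Status.UNKNOWN;
        }
        String raw = bulkStatuses.get(instanceName);
        if (raw == null) {
            // The request succeeded but the VM is not listed.
            return Status.STOPPED;
        }
        switch (raw) {
            case "ACTIVE":
                return Status.RUNNING;
            case "BUILD":
                return Status.STARTING;
            case "SHUTOFF":
            case "DELETED":
            case "ERROR":
                return Status.STOPPED;
            default:
                // Any other server status is left undecided.
                return Status.UNKNOWN;
        }
    }
}
```

With this split, a caller could remove an instance only on `STOPPED` and keep it on `UNKNOWN` until a later request succeeds.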

axel3rd commented 3 years ago

A test to validate the current credentials on servers/detail was added (8cd251c), but it is hard to do better.
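
For anyone who wants to check their account by hand, a rough sketch of such a check could look like the following. It is not the code from the commit above: `ServersDetailCheckSketch`, `describe` and `check` are hypothetical names, and the sketch assumes you already have the compute endpoint URL and a valid token string for the account used in the cloud profile.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

/** Hypothetical sketch; not the test added to the plugin. */
final class ServersDetailCheckSketch {

    /** Turns the HTTP status of a servers/detail call into a short diagnosis. */
    static String describe(int httpStatus) {
        if (httpStatus == 200) {
            return "OK: the account can list servers/detail";
        }
        if (httpStatus == 401) {
            return "Unauthorized: the token is missing, invalid or expired";
        }
        if (httpStatus == 403) {
            return "Forbidden: the account is not allowed to call servers/detail";
        }
        return "Unexpected HTTP status " + httpStatus;
    }

    /**
     * Calls GET {computeEndpoint}/servers/detail with the given token.
     * Both parameters are assumptions of this sketch and must be supplied
     * by the caller.
     */
    static String check(String computeEndpoint, String token) throws IOException {
        URL url = new URL(computeEndpoint + "/servers/detail");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            conn.setRequestMethod("GET");
            conn.setRequestProperty("X-Auth-Token", token);
            conn.setRequestProperty("Accept", "application/json");
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            return describe(conn.getResponseCode());
        } finally {
            conn.disconnect();
        }
    }
}
```

A 403 here would point to a permission problem on the account rather than to the plugin itself.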

axel3rd commented 3 years ago

Snapshot version: cloud-openstack.zip (v1.6).

axel3rd commented 3 years ago

Release v1.6 done: cloud-openstack.zip

axel3rd commented 2 years ago

Discovered that after upgrading teamcity-openstack-plugin from v1.3 to v1.5 or to 1.6-SNAPSHOT (https://github.com/yandex-qatools/teamcity-openstack-plugin/issues/62#issuecomment-856884357) agents are removed immediately after the start.

Good catch. It occurs after a TeamCity server start, due to:

https://github.com/yandex-qatools/teamcity-openstack-plugin/blob/d39f8cec2616ccdf868bd0dab847c032b1b34549/cloud-openstack-server/src/main/java/jetbrains/buildServer/clouds/openstack/OpenstackCloudImage.java#L247-L250

The image is reassigned to the default pool.