yandex-qatools / teamcity-openstack-plugin

Teamcity plugin to add openstack integration
Other
25 stars 24 forks source link

Catch exception and forget instance when trying to update status failed #44

Closed undimmable closed 4 years ago

undimmable commented 4 years ago

There's an NPE inside the nova library, which in the long run leads to leaking instances. Leaked instances, however, are still getting counted in limits. This commit fixes this issue.

axel3rd commented 4 years ago

Hi @dimyriy, many thanks for this pull request !

Off chance, did you know a manual scenario to reproduce the problem on Openstack platform ? Having some unit test to verify the fix is always better, but perhaps tricky in this case.

axel3rd commented 4 years ago

@dimyriy : Did you have the complete stacktrace of this NPE ?

Maybe a nicer solution would be to manage this case of error directly (Using ERROR status) in:

https://github.com/yandex-qatools/teamcity-openstack-plugin/blob/c56f7989c640b950fd2de5e62663941f89c9cc1e/cloud-openstack-server/src/main/java/jetbrains/buildServer/clouds/openstack/OpenstackCloudInstance.java#L63

WDYT ?

promiteyka commented 4 years ago

Hello @axel3rd, We cannot reproduce this problem manually. We have two versions: openstack 10 and 15 and the problem is reproduced on both, but only from the Teamcity. The problem is not reproduced very often, about once every 1-2 days. To solve it, we have to restart the server. I installed an updated version to the production server, it's okay for now. Still observe.

axel3rd commented 4 years ago

The problem is not reproduced very often, about once every 1-2 days

@promiteyka: Did you have in the TeamCity historical logs the Java stacktrace of this NPE ? To see the problem in Noca client. Even if problem is not reproducible manually, a mocked unit test could verify this use case (I could write it, but I don't know what I should reproduce). Many thanks

promiteyka commented 4 years ago

Stacktrace:

jetbrains/buildServer/clouds/openstack/OpenstackCloudInstance.java:66
java.lang.NullPointerException: id
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:787)
at org.jclouds.openstack.v2_0.domain.Resource.<init>(Resource.java:110)
at sun.reflect.GeneratedConstructorAccessor192.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.google.common.reflect.Invokable$ConstructorInvokable.invokeInternal(Invokable.java:252)
at com.google.common.reflect.Invokable.invoke(Invokable.java:100)
at org.jclouds.json.internal.DeserializationConstructorAndReflectiveTypeAdapterFactory$DeserializeIntoParameterizedConstructor.newInstance(DeserializationConstructorAndReflectiveTypeAdapterFactory.java:225)
at org.jclouds.json.internal.DeserializationConstructorAndReflectiveTypeAdapterFactory$DeserializeIntoParameterizedConstructor.read(DeserializationConstructorAndReflectiveTypeAdapterFactory.java:205)
at org.jclouds.json.internal.DeserializationConstructorAndReflectiveTypeAdapterFactory$ParameterReader.read(DeserializationConstructorAndReflectiveTypeAdapterFactory.java:273)
at org.jclouds.json.internal.DeserializationConstructorAndReflectiveTypeAdapterFactory$DeserializeIntoParameterizedConstructor.read(DeserializationConstructorAndReflectiveTypeAdapterFactory.java:185)
at com.google.gson.Gson.fromJson(Gson.java:814)
at com.google.gson.Gson.fromJson(Gson.java:879)
at com.google.gson.Gson$1.deserialize(Gson.java:129)
at org.jclouds.openstack.nova.v2_0.config.NovaParserModule$ServerAdapter.deserialize(NovaParserModule.java:138)
at org.jclouds.openstack.nova.v2_0.config.NovaParserModule$ServerAdapter.deserialize(NovaParserModule.java:129)
at com.google.gson.TreeTypeAdapter.read(TreeTypeAdapter.java:58)
at com.google.gson.Gson.fromJson(Gson.java:814)
at org.jclouds.http.functions.ParseFirstJsonValueNamed.apply(ParseFirstJsonValueNamed.java:80)
at org.jclouds.http.functions.ParseFirstJsonValueNamed.apply(ParseFirstJsonValueNamed.java:44)
at org.jclouds.rest.internal.InvokeHttpMethod.invoke(InvokeHttpMethod.java:91)
at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:74)
at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:45)
at org.jclouds.reflect.FunctionalReflection$FunctionalInvocationHandler.handleInvocation(FunctionalReflection.java:117)
at com.google.common.reflect.AbstractInvocationHandler.invoke(AbstractInvocationHandler.java:84)
at com.sun.proxy.$Proxy329.get(Unknown Source)
at jetbrains.buildServer.clouds.openstack.OpenstackCloudInstance.updateStatus(OpenstackCloudInstance.java:66)
at jetbrains.buildServer.clouds.openstack.OpenstackCloudImage.lambda$new$0(OpenstackCloudImage.java:74)
at com.jcabi.log.VerboseRunnable.run(VerboseRunnable.java:190)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
undimmable commented 4 years ago

@dimyriy : Did you have the complete stacktrace of this NPE ?

Maybe a nicer solution would be to manage this case of error directly (Using ERROR status) in:

https://github.com/yandex-qatools/teamcity-openstack-plugin/blob/c56f7989c640b950fd2de5e62663941f89c9cc1e/cloud-openstack-server/src/main/java/jetbrains/buildServer/clouds/openstack/OpenstackCloudInstance.java#L63

WDYT ?

You're right. I've addressed this in the new commit https://github.com/yandex-qatools/teamcity-openstack-plugin/pull/44/commits/68bcd0873ecf16e3bc9f8e9106536af2d0e8e18b

axel3rd commented 4 years ago

In addition of PR review :

1) Doing a git squash of the N commits of this PR could be nice. 2) The NPE error can be reproduced by copy/paste of v2.1-nova-id-servers-server-id-run.json file in v2.1-nova-id-servers-server-id-not-defined.json and remove the id (line 28). In addition of this new unit test in OpenstackCloudClientMockedTest :

    @Test
    public void testErrorClientNovaNPEOnUpdateStatus() throws Exception {
        initVMStart();

        // OpenStack response with a content without 'id' => will throw NPE on update status
        stubFor(get("/v2.1/nova-id/servers/server-id").willReturn(aResponse().withBodyFile("v2.1-nova-id-servers-server-id-not-defined.json")));

        // Termination due to NPE + unit 'testSubSimple' termination
        stubFor(delete("/v2.1/nova-id/servers/server-id").willReturn(aResponse().withStatus(500)));

        // unit 'testDubSimple' termination
        stubFor(post("/v2.1/nova-id/servers/server-id/action").willReturn(aResponse().withStatus(500)));

        testSubSimple(wireMockServer.baseUrl() + "/v3", "default:my-tenant:ldap:foo", "bar", "region1", getTestYaml("Mock"), true, true);

        // Test termination call after NPE
        verify(deleteRequestedFor(urlMatching("/v2.1/nova-id/servers/server-id")));
    }

It could allow to reach a green quality gate 😁.

undimmable commented 4 years ago

@axel3rd, I squashed the commits and merged your pull request into the fork.

promiteyka commented 4 years ago

@axel3rd , hello. We have been using the snapshot version for more than a week, everything works as expected. Can you merge this pull request into master and create release version?

axel3rd commented 4 years ago

Many thanks for works & tests ! I will do merge & PR next week (fully busy until this time)

Le mar. 11 févr. 2020 à 20:29, promiteyka notifications@github.com a écrit :

@axel3rd https://github.com/axel3rd , hello. We have been using the snapshot version for more than a week, everything works as expected. Can you merge this pull request into master and create release version?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/yandex-qatools/teamcity-openstack-plugin/pull/44?email_source=notifications&email_token=AAWHBOL23IUEP2GGAGNFHQDRCL4CNA5CNFSM4KNIZVQ2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOELNX4CA#issuecomment-584809992, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAWHBOKUKRJLNRVRCCB7XY3RCL4CNANCNFSM4KNIZVQQ .

axel3rd commented 4 years ago

Can you merge this pull request into master and create release version?

@promiteyka / @dimyriy : v1.2 released

promiteyka commented 4 years ago

@axel3rd thank you so much

undimmable commented 4 years ago

Thank you!