jenkinsci / ec2-fleet-plugin

The EC2 Fleet plugin launches EC2 instances as worker nodes for Jenkins CI server, automatically scaling the capacity with the load.
https://plugins.jenkins.io/ec2-fleet/
Apache License 2.0

Support manually deleting nodes #99

Closed marklagendijk closed 4 years ago

marklagendijk commented 5 years ago

In Jenkins you can manually delete nodes by selecting a node and choosing 'Delete agent'. However, when you do this to a node created by this plugin, it only deletes the Jenkins node, not the underlying Amazon instance. Because of this, the Jenkins node gets recreated directly after the deletion.

Would it be possible to also delete the Amazon nodes when the Jenkins node is deleted manually? We sometimes need to manually delete nodes. It would be handy if this could be done directly from Jenkins.

terma commented 5 years ago

Interesting. Could you please explain a bit more why you need to delete the node on the AWS side, and clarify what you mean by delete: terminating the EC2 instance in the EC2 Spot Fleet, or reducing the capacity of the fleet?

marklagendijk commented 5 years ago

By delete I mean terminating the specific EC2 instance. Our current setup works as follows (if you have suggestions on how to improve that, those are also welcome):

  1. When a new AWS node is created, an Ansible playbook is run against the new node, via Rundeck. This playbook initializes the node by installing all required tools and setting up their configuration.
  2. At the end of the playbook the SSH credentials for Jenkins are configured, so Jenkins can connect to it.

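Our main reasons for manually deleting nodes are:

  1. An issue occurred during the execution of the playbook. This leaves the node in an inconsistent state, and Jenkins can't connect to it.
  2. We added a new feature to the playbook, and want to start using it straight away.
  3. I'm not sure if this is still an issue with the current plugin, but I believe it was in the past: the spot price went over our limit, and Amazon terminated our instances. The instances remain in Jenkins in a broken state.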

I think this feature would make sense, because the 'Delete agent' option is there. Since it is there, people will use it. And when they use it, I think they will expect the node to be terminated, instead of Jenkins re-adding the node straight away.

terma commented 5 years ago

I think we can split this up:

> An issue occurred during the execution of the playbook. This leaves the node in an inconsistent state, and Jenkins can't connect to it.

That's a good point to investigate; I don't have a clear answer. As far as I know, when a node is offline, Jenkins waits a short time (a few minutes) and then tries to provision more nodes from the plugin. So if you haven't reached your max capacity, the plugin will get a new node. However, it's unclear whether the offline node will be terminated as idle.

converted to #102

> We added a new feature to the playbook, and want to start using it straight away.

We can add a button to the plugin configuration page to reset/delete the existing capacity of the fleet, so you will be able to propagate updates to instances in one click without cancelling the fleet.

converted to #101

> I'm not sure if this is still an issue with the current plugin, but I believe it was in the past: the spot price went over our limit, and Amazon terminated our instances. The instances remain in Jenkins in a broken state.

This should no longer be the case: when the plugin describes the fleet, it finds terminated instances and removes them from Jenkins.

mihaiplesa commented 5 years ago

I also can't delete nodes by using kill(), doDoDelete(), or other Java methods that work for the Jenkins Amazon EC2 plugin.

terma commented 5 years ago

@mihaiplesa do you see an exception, or does nothing happen? Don't forget that deleting a node on the Jenkins side only removes the Jenkins representation, not the EC2 instance. So it could be that you successfully removed the node with those methods, but the plugin recreates it (every 10 seconds) because the instance is still present in the EC2 fleet.
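
The recreate-after-delete behaviour described above can be illustrated with a toy model. This is a minimal sketch in Python rather than the plugin's Java; the class and method names (`FleetSyncSketch`, `sync`, `delete_node`) and the instance IDs are hypothetical, and the plugin's real sync logic may differ. It only shows why a Jenkins-side delete does not stick while the instance is still in the fleet.

```python
from dataclasses import dataclass, field


@dataclass
class FleetSyncSketch:
    """Toy model of a periodic fleet-to-Jenkins sync (hypothetical names)."""

    # Instance IDs currently alive in the EC2 fleet.
    fleet_instances: set = field(default_factory=set)
    # Node names currently registered in Jenkins.
    jenkins_nodes: set = field(default_factory=set)

    def sync(self):
        """One sync pass: make the Jenkins node list mirror the fleet."""
        # Re-add a node for every fleet instance Jenkins does not know about.
        self.jenkins_nodes |= self.fleet_instances
        # Drop nodes whose instance no longer exists in the fleet.
        self.jenkins_nodes &= self.fleet_instances

    def delete_node(self, node):
        """Jenkins-side delete only: the EC2 instance keeps running."""
        self.jenkins_nodes.discard(node)


sketch = FleetSyncSketch(fleet_instances={"i-aaa", "i-bbb"})
sketch.sync()                             # initial pass registers both nodes
sketch.delete_node("i-aaa")               # manual 'Delete agent'
after_delete = set(sketch.jenkins_nodes)  # the node is gone for the moment
sketch.sync()                             # next periodic pass
after_sync = set(sketch.jenkins_nodes)    # the node has been re-added
print(sorted(after_delete), sorted(after_sync))
```

In this model the only way to keep the node from coming back is to remove the instance from `fleet_instances` as well, which corresponds to terminating the EC2 instance.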

mihaiplesa commented 5 years ago

@terma this is what the Jenkins AWS EC2 plugin does when a node is deleted manually via the Jenkins UI: https://github.com/jenkinsci/ec2-plugin/blob/b9f877fc936286b6f818ed55e2fbf00460dbb1fa/src/main/java/hudson/plugins/ec2/EC2Computer.java#L150

terma commented 5 years ago

@mihaiplesa, ok, I think we can add it to the ec2 fleet plugin. If you have a pull request for it, I will be happy to merge it, or I can find some time to do it myself.
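
The behaviour proposed here can be sketched as follows. This is a toy model in Python, not the plugin's Java and not the linked EC2 plugin code; `FakeFleet`, `FleetCloudSketch`, and the `terminate_instance` flag are hypothetical names, and the fake fleet stands in for real EC2 API calls. It assumes the requested behaviour is to terminate the backing instance as part of the manual delete, so that the next sync has nothing to re-add.

```python
class FakeFleet:
    """Stand-in for the EC2 fleet; not a real AWS client."""

    def __init__(self, instance_ids):
        self.instance_ids = set(instance_ids)

    def terminate(self, instance_id):
        # A real implementation would call the EC2 API here.
        self.instance_ids.discard(instance_id)


class FleetCloudSketch:
    """Toy model of the plugin side (hypothetical names)."""

    def __init__(self, fleet):
        self.fleet = fleet
        self.jenkins_nodes = set(fleet.instance_ids)

    def delete_node(self, node, terminate_instance=True):
        # Terminate the instance first so a later sync cannot re-add the node.
        if terminate_instance:
            self.fleet.terminate(node)
        self.jenkins_nodes.discard(node)

    def sync(self):
        # Periodic pass: the Jenkins node list mirrors the fleet.
        self.jenkins_nodes = set(self.fleet.instance_ids)


fleet = FakeFleet({"i-aaa", "i-bbb"})
cloud = FleetCloudSketch(fleet)
cloud.delete_node("i-aaa")  # terminates the instance, then drops the node
cloud.sync()                # the periodic pass has nothing to re-add
remaining = sorted(cloud.jenkins_nodes)
print(remaining)
```

Note that a fleet configured to maintain a target capacity may launch a replacement instance after a termination, which is the distinction raised earlier in the thread between terminating an instance and reducing the fleet's capacity.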