machulav / ec2-github-runner

On-demand self-hosted AWS EC2 runner for GitHub Actions
MIT License
747 stars 337 forks source link

feat: support for using spot instances #171

Open troinine opened 1 year ago

troinine commented 1 year ago

Closes #169

This PR adds an optional variable to request EC2 Spot instances which are significantly more cost-effective for running non-critical workloads compared on-demand pricing.

The default functionality remains the same. If you do not define market-type, you get on-demand instances. Tested to work on an actual AWS account:

Screenshot 2023-11-24 at 11 24 30

Start:

Run troinine/ec2-github-runner@feature/support-spot-instances
  with:
    mode: start
    github-token: -- redacted --
    ec2-image-id: -- redacted --
    ec2-instance-type: t3a.large
    subnet-id: -- redacted --
    security-group-id: -- redacted --
    market-type: spot
    aws-resource-tags: -- redacted --

GitHub Registration Token is received
AWS EC2 instance i-0b010... is started
AWS EC2 instance i-0b010... is up and running

Stop:

Run troinine/ec2-github-runner@feature/support-spot-instances
  with:
    mode: stop
    github-token: -- redacted --
    label: -- redacted --
    ec2-instance-id: i-0b010...
    aws-resource-tags: -- redacted --

AWS EC2 instance i-0b010... is terminated

P.S. I have some future improvement ideas like reverting to on-demand capacity if the spot request cannot be fulfilled but I wanted to keep it simple for now, especially since there are no tests in the project (which are another area of improvement).

tknisch commented 1 year ago

I've tested the new feature and it worked fine: image

troinine commented 1 year ago

@machulav, kindly requesting your opinion on this. Would it be possible to merge and release this feature?

Dmitry1987 commented 11 months ago

great work! spot requests are really missing for the one-off builds, thanks for working on this @troinine . I think actions like this one should be adopted by aws official or by github official repos 🧐 because it's kind of a must have functionality for either of these companies.

Dmitry1987 commented 11 months ago

What about the bid price though, is it automatic on aws side when that function is called, and uses the current market price? Has a bit more chance of termination for long builds then, isn't it? if ec2 gets quickly outbid by other requests.

troinine commented 11 months ago

What about the bid price though, is it automatic on aws side when that function is called, and uses the current market price? Has a bit more chance of termination for long builds then, isn't it? if ec2 gets quickly outbid by other requests.

Good point. Initially, I allowed for a configurable bid price, but after considering some experiences and reviewing the AWS documentation, I reverted to using defaults. Not defining a bid price gives the best chance of running the instance uninterrupted.

According to the RunInstances API documentation:

We do not recommend using this parameter because it can lead to increased interruptions.

Dmitry1987 commented 11 months ago

I wonder if the linter status is a blocker for it to get merged 🧐 can i help somehow push it through? 😅 i want to use the main release so waiting for it to get merged. btw there's this "cpu options" to make "threads_per_core=1", it seems not available in current options. maybe we should allow an array of all possible aws sdk options to be passed as a whole? for example instead of adding options and mapping them to a new yaml field, like the spot and market

      MarketType: config.input.marketType,
      SpotOptions: {
        SpotInstanceType: 'one-time',
      },

it could be a good option to let the user chose any 'input' fields to append, like 'extraOptions:' and whatever goes there. because there are so many options to the ec2 spawn operation. storage mapping, networking stuff, launch template to use, resource group, etc. what do you think?

Dmitry1987 commented 10 months ago

Hi all hope you're having a good new year 🎉 I merged this to my fork and noticed that runner won't connect. How do we troubleshoot when this happens? there are not much logs and I have no idea what processes are running... it's the same exact AMI that is working with vanilla release of the action, so one of the two PRs I merged are affecting something, but I don't know how to approach debugging this 🤔

Building data script for linux
AWS EC2 instance i-xxxxx is started
AWS EC2 instance i-xxxxx is up and running
Waiting [30]s for the AWS EC2 instance to be registered in GitHub as a new self-hosted runner
Checking every 10s if the GitHub self-hosted runner is registered
Checking...
Checking...
Checking...
Checking...
Error: GitHub self-hosted runner registration error
Checking...
Error: A timeout of 5 minutes is exceeded. Your AWS EC2 instance was not able to register itself in GitHub as a new self-hosted runner.
tknisch commented 10 months ago

Hi all hope you're having a good new year 🎉 I merged this to my fork and noticed that runner won't connect. How do we troubleshoot when this happens? there are not much logs and I have no idea what processes are running... it's the same exact AMI that is working with vanilla release of the action, so one of the two PRs I merged are affecting something, but I don't know how to approach debugging this 🤔

Building data script for linux
AWS EC2 instance i-xxxxx is started
AWS EC2 instance i-xxxxx is up and running
Waiting [30]s for the AWS EC2 instance to be registered in GitHub as a new self-hosted runner
Checking every 10s if the GitHub self-hosted runner is registered
Checking...
Checking...
Checking...
Checking...
Error: GitHub self-hosted runner registration error
Checking...
Error: A timeout of 5 minutes is exceeded. Your AWS EC2 instance was not able to register itself in GitHub as a new self-hosted runner.

You could activate the Instance Termination Protection, right after the instance has launched. This will prohibit the action from terminating the instance. Then you will have enough time to connect to the instance (e.g. via SSH oder SSM agent) and have a look if the GitHub Runner is active

Dmitry1987 commented 10 months ago

basically user-data script had syntax errors, because I merged a mix of windows support and spots, while making a mess in that js file 😁 works nicely. i hope this gets merged.

mission-jseverance commented 9 months ago

Any idea when this could get merged?

Preen commented 8 months ago

+1

hassanazharkhan commented 8 months ago

@machulav Could you please prioritise your review on this one pls? it will be a great addition! 🙏🏻

crohr commented 6 months ago

I'm not sure whether this project is still maintained, but for anyone interested I recently launched RunsOn, which supports spot instances, arm64, compatible images, etc. and is backed by many clients so it won't go away anytime soon.

This project was an inspiration for that work, so hopefully it continues to live since it fills a specific use-case well.

Preen commented 3 weeks ago

@troinine can you update the dist/index.js and src/aws.js, want to test this out asap so that we can get this feature in.