datamade / how-to

📚 Doing all sorts of things, the DataMade way

GitHub Actions Self-Hosted Runners #270

Closed fgregg closed 1 year ago

fgregg commented 2 years ago

Background

We've been having a great time using GitHub Actions as a scraping platform. But for intensive scrapes on private repos, the pricing is pretty unfavorable ($0.008/minute).

We might be able to get the best of both worlds by setting up a self-hosted runner.
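To put the $0.008/minute figure in perspective, here is a rough sketch of the arithmetic. The workload (a 3-hour scrape run nightly) is a hypothetical example, not a measurement from our repos, and the sketch ignores any free minutes included in a plan.

```python
# Per-minute rate for private repos quoted in this issue (USD).
GITHUB_PRIVATE_RATE_PER_MINUTE = 0.008


def hosted_cost(minutes_per_run: float, runs_per_month: int,
                rate_per_minute: float = GITHUB_PRIVATE_RATE_PER_MINUTE) -> float:
    """Monthly cost in USD of running a job on GitHub-hosted runners."""
    return minutes_per_run * runs_per_month * rate_per_minute


# Hypothetical workload: a 180-minute scrape, 30 runs a month.
monthly = hosted_cost(minutes_per_run=180, runs_per_month=30)
print(f"${monthly:.2f} per month")
```

Swapping in real run times from the Actions usage report would give the number a self-hosted runner needs to beat.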

Here are some resources for setting some up:

Proposal

Try setting up a self-hosted runner that has much better pricing.

Deliverables

An approved approach for setting up a self-hosted runner, or a decision that this is not worth doing.

Timeline

Two investment days.

aktech commented 2 years ago

2 cents on self-hosting runners:

I built a service for running self-hosted GitHub Actions runners in the cloud, with zero maintenance and free for open source: https://cirun.io/

It's on-demand, which means you only pay your cloud provider for the time you're using it. All you need is a simple .cirun.yml configuration file; here is a demo. It is also available on GitHub Marketplace: https://github.com/marketplace/cirun-io

fgregg commented 2 years ago

hi @aktech, what is the pricing for private repos?

aktech commented 2 years ago

Hey @fgregg, it is free at the moment for private repositories as well. We're in the process of defining the pricing. In a nutshell, it will be a flat price per month based on the number of private repositories using it.

I noticed you're using it with DigitalOcean, with the runner being up for 15-20 minutes. I would not recommend DigitalOcean for short jobs, because DigitalOcean doesn't have per-minute billing: whether you use it for 1 minute or one hour, you'll be charged the same. Other cloud providers have per-minute billing.
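As a rough illustration of why billing granularity matters for short jobs, the sketch below rounds a run up to whole billing increments. The hourly rate is a made-up placeholder, not a quote from any provider, and `billed_cost` is a hypothetical helper.

```python
import math


def billed_cost(runtime_minutes: float, hourly_rate: float,
                increment_minutes: int) -> float:
    """Cost in USD when usage is rounded up to whole billing increments."""
    increments = math.ceil(runtime_minutes / increment_minutes)
    return increments * increment_minutes * hourly_rate / 60


HOURLY_RATE = 0.06  # hypothetical USD/hour, for illustration only

# A 20-minute runner billed by the hour vs. by the minute.
per_hour = billed_cost(20, HOURLY_RATE, increment_minutes=60)
per_minute = billed_cost(20, HOURLY_RATE, increment_minutes=1)
print(f"hourly billing: ${per_hour:.3f}, per-minute billing: ${per_minute:.3f}")
```

Under these assumptions the hourly-billed run costs three times as much as the per-minute one, and the gap grows as the job gets shorter.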

fgregg commented 2 years ago

So i tried a few of the solutions on https://github.com/jonico/awesome-runners

the other runners listed were much less developed.

right now, github's own action runners are looking like the more economical path.

i think the last thing to try is to see if @aktech might be willing to consult on our needs and see if cirun.io might be a good fit.

aktech commented 2 years ago

> i think the last thing to try is to see if @aktech might be willing to consult on our needs and see if cirun.io might be a good fit.

Hey @fgregg, I would be happy to help. You can also use GCP or Azure with Cirun.io (if EC2 is blocked). Feel free to drop me a mail at amit@cirun.io to schedule a call.

fgregg commented 1 year ago

cirun + azure spot instances is looking very promising.

fgregg commented 1 year ago

with cirun + azure spot instances, this is worth doing. when i write the doc for #212, i will give directions on using these runners.