haiku/infrastructure: Haiku infrastructure as code

Choose object storage provider for haikuports #141

[Open] kallisti5 opened this issue 1 month ago

kallisti5 commented 1 month ago

With haikuporter's support for S3, we need to choose an object storage provider. For context, this will replace our Digital Ocean block storage volume attachment, which costs $25/month for 250GiB.

Working assumptions: ~400GiB stored, 2TiB of egress a month (which gives us a lot of headroom), and 35 million API ops a month (17M Class A, 17M Class B).
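To make the comparisons concrete, here's a rough back-of-the-envelope cost model under those assumptions. Every rate in it is a placeholder, not any particular provider's published pricing:

```sh
# Rough monthly-cost model for comparing providers. All rates below are
# placeholders -- substitute each provider's actual published pricing.
awk 'BEGIN {
    stored_gib = 400; egress_gib = 2048            # assumptions above (2TiB)
    ops_a = 17e6; ops_b = 17e6                     # Class A / Class B requests
    storage_rate = 0.005; egress_rate = 0.01       # $/GiB (placeholders)
    rate_a = 0.004 / 1000; rate_b = 0.0004 / 1000  # $/request (placeholders)
    total = stored_gib * storage_rate + egress_gib * egress_rate
    total += ops_a * rate_a + ops_b * rate_b
    printf "~$%.2f/month\n", total
}'
```

At 35 million requests a month, even tiny per-request fees add up, so a provider's API pricing deserves as much scrutiny as its storage rate.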

Notes: We don't have to go all-in on a single S3 provider. Haiku can remain at Wasabi; haikuports can be "wherever". We can run one deployment of hpkgbouncer per repo.

kallisti5 commented 1 month ago

My preference:

  1. Storj.io - Cheap, and does some cool things being a decentralized network. Would need a CDN for egress.
  2. Backblaze B2 - Not the cheapest, but if we grow we can put a reasonably priced CDN in front of it and get free egress (minus whatever the CDN costs... and bunny.net is cheap).
  3. Wasabi - Continue with Wasabi; they have served us well. However, we should likely put a CDN in front of them to reduce the risk of getting cut off.
  4. Digital Ocean - Risks being expensive, but you can't beat "local" object storage access (which will hopefully cut down on bandwidth usage).
  5. Telnyx - Cheapest storage at scale; we just have to be mindful of API transaction limits from hpkgbouncer and haikuporter. New provider, though.

kallisti5 commented 1 month ago

I've reached out to Wasabi to try and get "actual" bandwidth utilization numbers. They don't publish them in our portal (but I sure as hell know they look at them, since they have cut us off before due to egress).

waddlesplash commented 1 month ago

Why do you have Backblaze as "20-25 a month"? If we factor in the CDN with free egress then shouldn't it be storage costs only, and thus be equivalent to Wasabi + CDN?

kallisti5 commented 1 month ago

> Why do you have Backblaze as "20-25 a month"? If we factor in the CDN with free egress then shouldn't it be storage costs only, and thus be equivalent to Wasabi + CDN?

"Free egress up to 3x their average monthly storage amount. Egress over average stored is $0.01/GiB. ~2TiB - 400GiB = 1600GiB * 0.01 = $16 / month" 16 + 6 = $21~

EDIT: I did that math wrong. Let's re-run the cost numbers, assuming 6TiB of egress and 400GiB of storage.

Backblaze:

Storj:

Wasabi:

Telnyx:
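For reference, re-running just the Backblaze egress rule quoted earlier with the corrected numbers: the free allowance is 3 × 400GiB = 1200GiB, and 6TiB is 6144GiB, so the overage is 6144 − 1200 = 4944GiB × $0.01/GiB ≈ $49/month before any CDN caching. That overage is exactly why fronting it with a CDN matters.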

Backblaze + bunny.net CDN seems like the best deal tbh, with controlled risk. The Bunny.net CDN could cut that 6TiB way down to "a few TiB or less" on all providers, but it's unknown how efficient their caching will be in our use case.

kallisti5 commented 1 month ago

EDIT - Actual worst-case egress bandwidth numbers:

nielx commented 1 month ago

For me, while I think reliability is important, it is not the end of the world if we get cut off and need to relocate. However, how do we keep control of our packages? I.e., is there going to be a backup or a primary source for them?

Another factor is the odds of hidden surprises: I do not want to be surprised by a sudden change of rates if we cross some sort of threshold, so any provider with a 'flat' rate that scales linearly is preferred over one that requires us to closely monitor some threshold.

Finally, I would keep things as easy as possible, so Digital Ocean Spaces, where we would need to do additional data design, is off the table for me.

kallisti5 commented 1 month ago

> For me, while I think reliability is important, it is not the end of the world if we get cut off and need to relocate. However, how do we keep control of our packages? I.e., is there going to be a backup or a primary source for them?

The nice thing about S3 is that it actually gets easier to back things up. Today we have an automatic "compress all the artifacts, encrypt them, and upload to an S3 bucket" backup system. That doesn't work for huge things though, since I really don't want to work with 300GiB tar deltas :sweat_smile:

In the model where some object storage provider is the source of truth, we really just need to rclone the bucket somewhere else. Historically I've just rcloned to a dedicated bit of local storage at my house as a cold backup (you could do the same). rclone works off of deltas like rsync, so it's bandwidth-friendly after the initial clone.
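As a sketch, that cold-backup pull is a one-liner (the remote name and paths here are hypothetical; the remote itself would be defined with `rclone config`):

```sh
# One-way sync of the bucket to local cold storage. Like rsync, rclone only
# transfers changed objects after the initial clone.
rclone sync b2-haikuports:haikuports /srv/backups/haikuports \
    --fast-list --transfers 8 --progress
```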

rclone also lets you sync between storage providers... and it supports a ton of them.

We actually have an rclone container ready to go today that will do that to storj. We can make some fixes, though, to make it more generic.
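Provider-to-provider sync is the same command shape, e.g. (again, hypothetical remote names):

```sh
# Mirror the bucket from one provider to another. rclone streams the data
# through the machine it runs on, so run it somewhere with cheap bandwidth.
rclone sync b2-haikuports:haikuports storj-backup:haikuports --fast-list
```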

I also have rclonefs, which will (theoretically) let us mount S3 buckets as FUSE mounts on each k8s node, so we can offer the buckets over rsync to mirrors from pods running on any node. (FUSE in k8s is weird though, and we need an elevated security context.)
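A sketch of what that mount could look like (hypothetical names again; in k8s the pod would additionally need access to /dev/fuse plus the elevated security context mentioned above):

```sh
# Read-only FUSE mount of the bucket, suitable for serving to mirrors via
# an rsync daemon pointed at the mountpoint.
rclone mount b2-haikuports:haikuports /mnt/haikuports \
    --read-only --allow-other --vfs-cache-mode minimal
```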

> Another factor is the odds of hidden surprises: I do not want to be surprised by a sudden change of rates if we cross some sort of threshold, so any provider with a 'flat' rate that scales linearly is preferred over one that requires us to closely monitor some threshold.

Agree. Definitely the biggest pain point of object storage. I really like Telnyx's pricing, but the whole "per million API hits" thing makes me nervous for something as complex and large as haikuports.

> Finally, I would keep things as easy as possible, so Digital Ocean Spaces, where we would need to do additional data design, is off the table for me.

Agree. Let's strike DO off the list. They had some appealing qualities, but needing a whole gaggle of buckets to groom to get reasonable pricing is too much lift. I'm tired of forming infrastructure around providers' weird limitations.

kallisti5 commented 1 month ago

I updated https://github.com/haiku/infrastructure/issues/141#issuecomment-2383633257 with pricing based on the actual worst-case bandwidth numbers I saw on Digital Ocean.

kallisti5 commented 1 month ago

Oh, and I just looked at the Wasabi bill... it does list "908.40 API requests" for the month. I'm guessing that's in thousands though, given the decimal point, so 908,400 makes more sense.

kallisti5 commented 1 month ago

Here's some data on the bunny.net CDN. It definitely cuts our bandwidth usage down by ~50% on a single Haiku nightly repo.

[Screenshot: bunny.net CDN traffic statistics]

I'm sure the savings will be less for haikuports (more random packages, etc)

nielx commented 1 month ago

Looks like the preferred option is Backblaze + Bunny, then?

kallisti5 commented 1 month ago

Agree. I think Backblaze + Bunny is going to be the cheapest combo. Bunny will cut the transfer down ~50%, so that $30/month should be the "worst case".

kallisti5 commented 1 month ago

Ryan entered our billing info, and I deployed a temporary VM at Digital Ocean to use to shovel artifacts over to Backblaze.

I'm going to start with the Haiku repos themselves, since they're an easier (smaller) set of data to test with before moving on to haikuports.
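The shoveling itself is again just rclone from that VM (hypothetical paths and remote names):

```sh
# First migration pass from the attached DO volume into the new bucket.
# `rclone copy` never deletes on the destination; re-run it to catch stragglers.
rclone copy /mnt/haiku-repos b2-haiku:haiku-repos --transfers 8 --progress
```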

kallisti5 commented 1 month ago

~Aaaand.. Backblaze just crapped the bed.~

[Screenshot from 2024-10-06 showing Backblaze API call charges]

API calls are NOT free.

EDIT: I guess +$4-8 a month extra for API calls isn't horrible... however it adds risk to Backblaze.

nielx commented 1 month ago

> API calls are NOT free.
>
> EDIT: I guess +$4-8 a month extra for API calls isn't horrible... however it adds risk to Backblaze.

That's not great and definitely false advertising...

kallisti5 commented 1 month ago

I went ahead and put the Haiku repo over onto Backblaze. We already blew past the "free tier" of Class C API calls during the last sync. :face_exhaling:

I'm about to head out of town and will be back Sunday, so here are the important facts:

If the :hankey: hits the fan, you can take the following actions to undo the migration to backblaze: