OpenBeta / open-tacos

Rock climbing route catalog (openbeta.io)
https://openbeta.io
GNU Affero General Public License v3.0

Create a cloud function to resize photos on the fly #944

Closed vnugent closed 10 months ago

vnugent commented 1 year ago

What would you like to be able to do? Please describe. We're currently serving photos on openbeta.io from Google storage in full resolution. Create a Google cloud run function that can resize photos on the fly.

How important is this to you (Please pick one)

See

ccabanero commented 1 year ago

Hello @vnugent - just following up on some requirements:

vnugent commented 1 year ago

What is the preferred runtime/language? For example, Python vs. JavaScript/NodeJS

I don't have a strong preference. It'd be great in Node (less context switching for the volunteers to maintain), but if there's a more performant way in another language we could go with that.

Should the cloud function also be invoked on events other than HTTP requests from the API Gateway? For example, when an image object is successfully created in Google Cloud Storage (i.e. the google.cloud.storage.object.v1.finalized event), should this cloud function be automatically invoked?

Is it something we can add later? For now I think we can focus on resizing on-demand. What do you think?

ccabanero commented 1 year ago

All sounds good. I will work through the contributing documentation and pursue any further questions through the team Discord. Thanks for the reply.

vnugent commented 1 year ago

@ccabanero whenever you're ready, email me (viet @ openbeta dot io). I'll need to give you access to our project on Google Cloud.

Ugzuzg commented 1 year ago

Can use sharp for image processing: https://sharp.pixelplumbing.com/

vnugent commented 1 year ago

Can use sharp for image processing: https://sharp.pixelplumbing.com/

As long as it performs well for our application I don't have a strong preference.

enapupe commented 12 months ago

FWIW, I've done something very similar in the past. We "abused" the Cloudflare cache on this endpoint so the cloud function never had to reprocess the same image. This saved a lot of processing and made things quicker. sharp does seem to be the best option, and WebP seems a good image format choice nowadays, BTW.
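To make the caching idea concrete, here is a minimal sketch of the request handling around the resize step. It is not the deployed media server: the query parameter names (w, q, format), the width buckets, and the defaults are all assumptions for illustration. It snaps requested widths to a few sizes so fewer distinct variants exist per photo, and returns long-lived cache headers so a CDN in front of the function can serve repeat requests without invoking it again. The actual pixel work (for example with sharp) is left out.

```typescript
// Illustrative only: parameter names, buckets and defaults are assumptions.
type OutputFormat = "webp" | "jpeg";

interface ResizeRequest {
  width: number;
  quality: number;
  format: OutputFormat;
}

// Assumed width buckets; snapping to them limits the number of variants per photo.
const MAX_WIDTH = 2560;
const DEFAULT_WIDTH = 640;
const WIDTH_BUCKETS: number[] = [320, DEFAULT_WIDTH, 1280, MAX_WIDTH];

// Round a requested width up to the nearest bucket, capped at MAX_WIDTH.
function snapWidth(requested: number): number {
  for (const bucket of WIDTH_BUCKETS) {
    if (requested <= bucket) return bucket;
  }
  return MAX_WIDTH;
}

// Turn raw query values into a normalized request, falling back to defaults
// when a value is missing or not a positive number.
function normalizeRequest(query: { w?: string; q?: string; format?: string }): ResizeRequest {
  const w = Number(query.w);
  const q = Number(query.q);
  return {
    width: isFinite(w) && w > 0 ? snapWidth(w) : DEFAULT_WIDTH,
    quality: isFinite(q) && q > 0 ? Math.min(Math.round(q), 100) : 80,
    format: query.format === "jpeg" ? "jpeg" : "webp",
  };
}

// Headers that allow a shared cache (such as a CDN) to keep the response for a year.
function cacheHeaders(req: ResizeRequest): { [name: string]: string } {
  return {
    "Content-Type": "image/" + req.format,
    "Cache-Control": "public, max-age=31536000, immutable",
  };
}
```

If the frontend builds its image URLs from the same buckets, the CDN sees only a handful of distinct URLs per photo, which is what keeps the function from reprocessing the same image.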

vnugent commented 11 months ago

A few people have expressed interest in working on this, but it's still up for grabs. @enapupe can you help?

enapupe commented 11 months ago

I don't have experience with Google Cloud Functions, only AWS Lambda (which I use in my everyday job). However, I think this would not be an issue if I used a framework like the Serverless Framework.

I'd just need access to Google Cloud to set it up, plus whatever else is needed to set up an actual endpoint (API gateway, DNS, etc.).

How is this service gonna access the Google storage content? edit: also, what about CI and CD? do we have it or just deploy from local machine for now?

vnugent commented 11 months ago

Send me an email viet at openbeta dot io and I will give you access to GCloud. I think GCloud functions is similar to AWS Lambda.

How is this service gonna access the Google storage content?

We build the back end with GH action but deployment is manual. Frontend CI/CD is automated using Vercel.com.

vnugent commented 10 months ago

Thanks @enapupe for working on this.

enapupe commented 4 months ago

For the record, if in the future our google cloud bill for the media server gets too high or the service is perceived as slow:

It seems Cloudflare now has a dedicated service for this sort of thing. Before we implemented the new media server, any image transformation required at least the Pro plan ($20/mo); it's now an independently billed service with no upfront cost.

You mentioned our Google Cloud compute bill for the media server is currently around $16/mo. There's a good chance this new CF pricing model is cheaper and faster than what we have implemented today. The main differences are the processing speed (CF seems to work about 2x faster) and the caching: CF's cache is warmer than ours on the Free plan, which invalidates quite often and generates more Google Cloud expenses.

To test it, we would have to:

1) enable it in CF (it costs $0 upfront)
2) change the frontend to point to the new URL
3) let it run for a month and check the price difference

Then, if we want:

4) do a bit more setup, use the same media.openbeta.io CNAME (subdomain), and code a tiny worker to keep the exact same API as we have today
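As a rough illustration of the URL mapping behind steps 2 and 4, the sketch below rewrites a media URL into Cloudflare's URL-based transformation format (/cdn-cgi/image/<options>/<source>). The w and q query parameters, the example path, and the zone host are assumptions for illustration; the thread does not spell out the current media server API.

```typescript
// Assumed input shape: https://media.openbeta.io/<path>?w=<width>&q=<quality>
// The zone passed in is a placeholder, not a real OpenBeta host.
function toCloudflareImageUrl(mediaUrl: string, zone: string): string {
  const url = new URL(mediaUrl);
  const options: string[] = [];

  // Number(null) is 0, so missing parameters are simply skipped.
  const width = Number(url.searchParams.get("w"));
  if (isFinite(width) && width > 0) options.push("width=" + Math.round(width));

  const quality = Number(url.searchParams.get("q"));
  if (isFinite(quality) && quality > 0) options.push("quality=" + Math.round(quality));

  // Let Cloudflare pick an output format the requesting browser supports.
  options.push("format=auto");

  // url.pathname already starts with "/", giving /cdn-cgi/image/<options>/<path>.
  return zone + "/cdn-cgi/image/" + options.join(",") + url.pathname;
}

// Hypothetical usage; the photo path and zone are made up for the example.
const rewritten = toCloudflareImageUrl(
  "https://media.openbeta.io/u/abc/photo.jpg?w=400&q=75",
  "https://example.com"
);
```

A worker for step 4 could apply the same mapping to incoming requests so existing frontend URLs keep working unchanged.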

vnugent commented 4 months ago

@enapupe is this something you could help with? FYI we're currently using CF R2 for the crag map tiles (cost is more competitive than s3)

enapupe commented 4 months ago

Yeah, I could definitely help with that. Do you think now is the right time to make this move, in terms of actually making a noticeable difference right now?


vnugent commented 4 months ago

Let's go for it. The cost saving (if it's indeed cheaper in the long run) is worth it. I think you still have CF access? If not, please ping me on Discord.

enapupe commented 4 months ago

Ideally we should check the access log for the media server and calculate how many "variations" we have per month, so we can safely invest time in this change. However, I'm not sure such logs exist. Maybe Cloudflare has some?

It seems the pricing is $1 per 2k transformations. This means we could "generate" 32k image variations and still be at the same ~$16/mo. Do you know how many images we currently have, and/or have a feel for how many are accessed in a given month? To be clear, the image access itself is free; generating a variation is the only thing that costs.
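The break-even arithmetic can be written out as a small sketch. The rate ($1 per 2k transformations) and the ~$16/mo budget are the figures quoted in this thread; the function names are made up, and any free allowance or other pricing detail is ignored here.

```typescript
// Figures quoted in the thread: $1 buys 2,000 transformations.
const USD_PER_BATCH = 1;
const TRANSFORMATIONS_PER_BATCH = 2000;

// Monthly cost in USD for a given number of generated variations.
function transformationCost(variations: number): number {
  return (variations / TRANSFORMATIONS_PER_BATCH) * USD_PER_BATCH;
}

// How many variations fit within a monthly budget in USD.
function breakEvenVariations(monthlyBudgetUsd: number): number {
  return Math.floor((monthlyBudgetUsd / USD_PER_BATCH) * TRANSFORMATIONS_PER_BATCH);
}

// With the ~$16/mo media server bill as the budget, this gives 32000.
const variationsAtCurrentBill = breakEvenVariations(16);
```

Once access logs give a real count of monthly variations, transformationCost can be compared directly against the current bill.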

I have very limited access to CF right now.