owid / owid-grapher

A platform for creating interactive data visualizations
https://ourworldindata.org
MIT License

Investigate Cloudflare Images as an alternative to our `bakeDriveImages` baking step #2485

Open ikesau opened 1 year ago

ikesau commented 1 year ago

We currently bake gdocs images by pulling down the original versions from S3, resizing them multiple times, and uploading all the images to Netlify.

This process is inefficient: we redo the resizing on every bake instead of caching the results anywhere, which lengthens our deploy times.
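For illustration, one way to avoid repeating the resize work would be to key each resized output on a hash of the original file plus the target width, and only resize when that key is missing. This is a rough sketch, not what `bakeDriveImages` does today: the widths, the function names, and the in-memory `cache` set are all hypothetical stand-ins.

```python
import hashlib

# Hypothetical target widths; the real baking step may use different sizes.
WIDTHS = (350, 850, 1350)


def resize_cache_key(original_bytes: bytes, width: int) -> str:
    """Build a key that changes whenever the source image or the width changes."""
    digest = hashlib.sha256(original_bytes).hexdigest()
    return f"{digest}_{width}"


def widths_to_bake(original_bytes: bytes, widths, cache) -> list:
    """Return only the widths that have no cached resize for this exact image."""
    return [
        width
        for width in widths
        if resize_cache_key(original_bytes, width) not in cache
    ]
```

With something like this, an unchanged image would need no resizing on later deploys, while an edited image would produce a new hash and be resized again.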

But before we improve that, we'd like to check and see if Cloudflare Images would be a better way to solve the entire problem.

This issue is an invitation for a team member to mess around with Cloudflare Images and try to answer the following questions:

If I'm not the one to do it, feel free to reach out to me to ask questions about the current baking process 👍

Update 2024-03-12: At this point this is mostly a nice-to-have for us devs, because the image workflow is pretty complicated and occasionally breaks in annoying ways. #3199 was a motivation to reopen this issue, and we still think that this and the simplification of the codebase would be worth a few days of dev time.

ikesau commented 11 months ago

Update: Marcel has done some tentative exploration into the service.

Results are promising, though one caveat is that SVGs get all metadata stripped. Given that we want all our new static charts to be SVGs with embedded webfonts (#2518), that rules out CFI as a single solution for all our image hosting.

We would still want to use it for legacy images, the team page, etc.

If we discover that we can't use SVGs at all, then we could use CFI for everything.

larsyencken commented 8 months ago

Closing this for now; we can still investigate further later.

ikesau commented 2 months ago

We could do some nice things like content graph integration and GPT alt text if we started uploading images via the admin instead.

Technically these features could be implemented with Google Drive, but it would be more complex and annoying. Google Drive doesn't allow us to configure an event handler for file uploads (only per-minute polling), so the flows would become more complex for little gain.

We should evaluate all the important user stories for the image workflow to make sure the admin implementation handles them nicely.

Here are some that I can think of:

ikesau commented 1 month ago

Here's a Datasette query showing all our posts whose featured images have aspect ratios that are more than 0.1 off our target of 1.9.

Not required for this feature, but it would be nice if we could design it in such a way that bulk updating all these images to ones with correct aspect ratios would be possible.
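To make the check concrete, here is a small sketch of the rule described above (ratio = width / height, flagged when it differs from 1.9 by more than 0.1), plus a helper that computes the largest crop with the target ratio, which is the kind of calculation a bulk update might need. The function names are hypothetical, and this is not the actual query.

```python
TARGET_RATIO = 1.9  # target aspect ratio (width / height) mentioned above
TOLERANCE = 0.1     # allowed deviation before an image is flagged


def is_off_target(width: int, height: int,
                  target: float = TARGET_RATIO,
                  tolerance: float = TOLERANCE) -> bool:
    """True when the image's aspect ratio is more than `tolerance` off `target`."""
    return abs(width / height - target) > tolerance


def crop_size_for_target(width: int, height: int,
                         target: float = TARGET_RATIO) -> tuple:
    """Largest (width, height) at roughly the target ratio that fits in the image."""
    if width / height > target:
        # Image is too wide: keep the full height and trim the width.
        return round(height * target), height
    # Image is too tall (or already on target): keep the width, trim the height.
    return width, round(width / target)
```

For example, a 1600x900 image has a ratio of about 1.78, so it would be flagged, whereas a 1200x630 image (about 1.90) would not.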