WordPress / openverse

Openverse is a search engine for openly-licensed media. This monorepo includes all application code.
https://openverse.org
MIT License
237 stars 187 forks source link

Cache translations from GlotPress for 24 hours in CI instead of pulling on every single CI run #4614

Open sarayourfriend opened 2 months ago

sarayourfriend commented 2 months ago

Current Situation

It's possible for our CI/CD suite to run enough times for us to get rate limited when downloading translations from GlotPress. See this run, for example:

https://github.com/WordPress/openverse/actions/runs/9937426887/job/27447855989?pr=4585#step:4:1007

We pull translations from GlotPress every time we build the frontend image, which happens any time frontend changes occur. We also pull it whenever storybook is built (because storybook build runs the build script, and prebuild runs i18n which fetches translations). Storybook is built for documentation site changes as well as relevant frontend changes.

For certain change sets, we could conceivably fetch translations from GlotPress four times:

Suggested Improvement

The easiest change to make would be to remove translation getting from storybook. We do not currently have any means of changing the language used to render components in storybook, as far as I can tell. The RTL/LTR toggle just changes the direction, but leaves the English language text—see this story, for example, which uses text from the translation files rather than hard-coded text from Storybook variables; there is no way to change the language! Refer to https://github.com/WordPress/openverse/issues/4621 the implementation of this "first step" which should unquestionably be implemented. This issue (4614), however, is for talking about caching the translations to reduce the overall necessary GlotPress interactions to a baseline of 1-per-day. Stopping Storybook unnecessarily pulling translations still pulls translations every single time the frontend builds in CI.

Realistically, there's no reason to pull translations so often. If we pulled translations in CI once every 24 hours (or a shorter set interval) and cached it in a shared artefact instead, it would speed up CI (even if only marginally), and prevent this problem from happening altogether, regardless of how active our repository is. A 24 hours SLA of translation changes is not unreasonable, and if there was a problem where we needed to pull translations right away and push a new build, we could easily manually dispatch the translation pull workflow and build a new frontend image.

Benefit

Prevent rate limiting to GlotPress and slightly speed up frontend (and storybook, so therefore also documentation site) build times in CI.

obulat commented 2 months ago

We have several stories with -rtl snapshots, but some only have it in their name, not even changing the HTML direction (VCollectionHeader). We do have stories with full RTL support (we need to add the language switcher to the story to do that): https://docs.openverse.org/storybook/?path=/story/components-vmediainfo-vmediareuse--default&globals=languageDirection:rtl

As for the issue of getting translations, I think we don't download the translations from GlotPress for Playwright tests. To run the tests we use the prod:playwright command that copies the test locales and then runs build:only.

https://github.com/WordPress/openverse/blob/90f87b8712b50cf05401de91eb5805b067efeb85/frontend/package.json#L21

I think that our CI setup makes us download the locales twice per a single build run because of this line:

https://github.com/WordPress/openverse/blob/90f87b8712b50cf05401de91eb5805b067efeb85/.github/workflows/ci_cd.yml#L179-L182

We download the translations, and then the build-img step runs docker build, which runs the pnpm build (doesn't it?) that re-downloads the translations.

sarayourfriend commented 2 months ago

@obulat I'm talking about the storybook tests, not the playwright tests. The playwright tests use the test locales and do not require the live translations at all.

obulat commented 2 months ago

I was also thinking about storybook tests, but didn't think about the different commands that are used in Playwright and Storybook (Playwright) tests.

Storybook tests re-use the Playwright configs, but they run the storybook build command, which under the hood uses the pnpm build to build the app. I don't know if it's possible to change what storybook build does (whether it uses the build or build:only command). Maybe we could set an env variable when running the Storybook Playwright tests, and then, the i18n scripts would check that variable and skip download if it's set?

sarayourfriend commented 2 months ago

Right, that would address the stop gap I described in the issue, just reducing the number of places we query for translations.

But really, I'd say that using pre/post scripts for this is a dark pattern that will always eventually lead to greater complexity and needing to fight them. Our build scripts should explicitly pull translations when they need them, not rely on a pre-script to do that. If we were going to do it right, we should remove these implicit steps and make them explicit when they are actually needed. It's a big change but certainly simplifies things. On the other hand, it will create conflicts and complexity for the Nuxt 3 PR.

I don't think and of that is a replacement for caching it altogether though. It's still conceivable that we would have many builds happening close together that we make a bunch of requests to GlotPress and get rate limited. Do you disagree that caching is a good idea? Just want to make sure I'm clear what you're getting at and why, in case the caching is something you don't think we should do and instead just go with what I've described in the issue as an easy first-step/stop-gap.

dhruvkb commented 2 months ago

Another way that doesn't involve caching is to download the translations in a job (that only runs if the frontend has any changes), and then export them as an artifact that other jobs can use. That way one CI run only downloads translations once, at max, and zero times if the frontend has not changed.

We can also request GlotPress to not increase the limits or not rate limit calls from GitHub Actions worker IPs but that would be a bit time-consuming and difficult.

sarayourfriend commented 2 months ago

Another way that doesn't involve caching is to download the translations in a job (that only runs if the frontend has any changes), and then export them as an artifact that other jobs can use. That way one CI run only downloads translations once, at max, and zero times if the frontend has not changed.

That's not really necessary or an overall improvement if taken in consideration with the changes to remove downloading translations from storybook, it's the only place other than the frontend build that pulls from GlotPress.

As mentioned in the issue body, removing them from storybook is at least cutting the total possible number to 1 rather than 4 for a single CI run.

Is caching the translations a bad idea, something we shouldn't do, for some reason? We could stop pulling translations for builds not on main (branch and PR builds rather than merge builds), if we absolutely need published builds to always pull the latest translations. That would remove any complexity here. It's not like the POT/JSON files are a core part of testing the build, and we can still pull them if the changeset includes changes to the translation scripts, so we can verify their functionality.

If caching is specifically something @dhruvkb and @obulat y'all don't think is a good idea, please say so directly and we can close it in favour of https://github.com/WordPress/openverse/issues/4621.

dhruvkb commented 2 months ago

I haven't actually given adequate serious thought to know if caching would cause any problems with VRT or deployments. I'll do that before commenting on that topic.