mozilla / bedrock

Making mozilla.org awesome, one pebble at a time
https://www.mozilla.org
Mozilla Public License 2.0
1.17k stars 912 forks source link

Add `cms_url` helper to provide lang-prefixed relative path to a CMS page from any page #14871

Open stevejalim opened 1 month ago

stevejalim commented 1 month ago

With www.m.o becoming a hybrid of CMS-rendered pages and hard-coded Django views, it would be useful to have a jinja helper that can be used in Django view to link to other CMS pages.

We can already use our regular url() helper to link from CMS pages' templates to Django-based views, so this new helper would be for going the other way - we'll call it cms_url.

The challenge here is finding a sensible way to link to a CMS page. Each CMS page will have a "slug" eg ('products', or 'firefox') but there's nothing that enforces a slug being unique. As such, if cms_url took one single parameter of a single slug, it might return a link to the wrong page (if we let it choose one) or have to fail and return no link - or blow up (and the latter we definitely don't want, given we don't have full control over the slugs in use).

Suggestion 1 - we use the full path of the page, which the developer would have to find/work out

cms_url("/full/cms/path/to/page/") # but without the lang-code prefix, which Bedrock will append accordingly

Upsides: explicit; speciifc Downside: a bit of a chore to work out; a developer might put the lang prefix in there too

Suggestion 2 - go with one or more slugs and return a single match if one is found, else return NOTHING and log an exception, which we'll catch in Sentry and fix

cms_url("some_page_slug") or cms_url("some_page_direct_parent_slug", "some_page_slug"), etc

Upsides: could suit 80% of pages with 20% of the effort Downsides: the failure mode is basically a broken link that we'd have to race to fix

stevejalim commented 1 month ago

@alexgibson What do you think of the two suggestions in the description for how to handle this? Am def open to other ideas, too

alexgibson commented 1 month ago

I'm not sure suggestion 2 makes much sense from a front-end point of view. When we create a link, we want to to be to a specific destination, and not one of multiple possible options?

I would say that suggestion 1 sounds much more predictable. But what happens when we add a link to cms_url("/about/leadership/") for example, but then someone changes the slug in the CMS? It seems unlikely that people would do this with a published URL (unless we're restructuring things intentionally at some point), but it does feel a bit cat and mouse.

Would another option be to add a unique page ID to all page models perhaps, and then cms_url('leadership_page') could then use that as a lookup table of sorts to find the right slug?

Edit: i guess if the above 2 options are the only practical choice, then suggestion 1 would be the more preferable.

stevejalim commented 1 month ago

re: parameters to the argument

I get your point @alexgibson, yeah. We do already have a unique ID for each Page - the URL has the primary key / ID in it.

So, I wondered maybe whether we could use that in the tag (cms_url(23)), but the probem there is that each localised version of a page has a different ID (e.g. en-US of a page might be 23, then the fr version of it might be 24, etc) - and by default that would return the relevant lang code in the url path. We could work around that like this:

cms_url(24, preserve_locale=False) would look up a path for the page with the ID 23 but then drop the lang-code, letting Bedrock pick the most appropriate language for the user. This would mean that if someone links to any localised version of a particular page, it doesn't matter because bedrock will still adjust the locale to suit the user.

Meanwhile cms_url(24, preserve_locale=True) would keep the lang code, allowing us to deep link to that specific localised version of a page.

stevejalim commented 1 month ago

re: broken links

To paraphrase a point you made above:

But what happens when we add a link .... but then someone changes the slug in the CMS?

This is definitely a pain and a real risk, given the appetite for teams to self-manage the content they are allowed to. We can't write unit tests, because those tests won't necessarily have the pages in the DB when they're run.

We could though, try some kind of headless test that runs when playwright does (or even part of playwright) where we scan the codebase for cms_url() tags and works out where it should link to, etc. However, this feels quite messy.

We could instead upgrade www-site-checker to report a list of 404s found while scanning the site twice-daily, which would generally help us spot any broken links regardless of whether they are CMS related or not. I think that's a better fit.

@alexgibson What do you think?

alexgibson commented 1 month ago

cms_url(24, preserve_locale=False) would look up a path for the page with the ID 23 but then drop the lang-code, letting Bedrock pick the most appropriate language for the user. This would mean that if someone links to any localised version of a particular page, it doesn't matter because bedrock will still adjust the locale to suit the user.

Meanwhile cms_url(24, preserve_locale=True) would keep the lang code, allowing us to deep link to that specific localised version of a page.

I think there's something workable here yeah, although 23, 24 etc is not the most human readable page ID. Is that just an example, or are the actual IDs just incremental numbers?

We could instead upgrade www-site-checker to report a list of 404s found while scanning the site twice-daily, which would generally help us spot any broken links regardless of whether they are CMS related or not. I think that's a better fit.

This sounds like a good idea in general +1

stevejalim commented 1 month ago

I think there's something workable here yeah, although 23, 24 etc is not the most human readable page ID. Is that just an example, or are the actual IDs just incremental numbers?

Yeah, they're literally incremented ids, visible in the URL when editing pages. The rendered site wouldn't leak these IDs, but even if it did, they're all public pages anyway

alexgibson commented 1 month ago

Yeah, they're literally incremented ids, visible in the URL when editing pages. The rendered site wouldn't leak these IDs, but even if it did, they're all public pages anyway

Hmm, I gotta say I think that these IDs would be cumbersome to try and maintain in templates like this. I'd almost prefer the slug option over this if there's nothing more readable we could wrangle :/

alexgibson commented 1 month ago

@stevejalim I wonder, is there an option where we can make slugs read-only for core pages? (e.g. about page, leadership page etc). It feel like things would be more maintainable/predictable through code that way, and then open ended hub pages (like VPN resource center) could have editable slugs as those pages are very unlikely to move parent.

stevejalim commented 1 month ago

We could (surely) support locking slugs yeah, either via config (like we do re adding certain pages types) or via workflow rules

alexgibson commented 1 month ago

We could (surely) support locking slugs yeah, either via config (like we do re adding certain pages types) or via workflow rules

In that case, I'd be tempted to explore the "slug" route (suggestion 1) for cms_url(), as well as the ability to solidify slugs in code for key pages that we don't want moving around unexpectedly. Think that sounds reasonable?

stevejalim commented 1 month ago

Looking at making slugs editable would be best done by:

1) moving all 'sensitive' fields to a custom editing tab - normally slug lives in Promote, but we could move it to new Administration tab 2) add administrators permission group 3) Configure the Administration tab to only be editable by administrators - see handy SO question

So that'd work....

However, it still doesn't solve our problem of either single slugs needing to be unique to be looked up, or for developers to add a handful of slugs to make it absolutely clear which updates page they mean (eg firefox, updates or otherproduct, updates)

So, just wondering again about your idea of a custom identifier for each page -- one that strictly isn't unique, but IS unique across all localized versions of that page... So when a page is first saved, we generate it, and that gets copied to all translated versions of the page. So then we'd have cms_url('leadership_page') which would return a link in the user's most approriate language, or cms_url('leadership_page', lang_code='de') if we had to force a specific lang code

How's that sound?

alexgibson commented 1 month ago

So, just wondering again about your idea of a custom identifier for each page -- one that strictly isn't unique, but IS unique across all localized versions of that page... So when a page is first saved, we generate it, and that gets copied to all translated versions of the page. So then we'd have cms_url('leadership_page') which would return a link in the user's most approriate language, or cms_url('leadership_page', lang_code='de') if we had to force a specific lang code

This sounds both manageable and readable to me, yeah 👍