mdn / yari

The platform code behind MDN Web Docs
Mozilla Public License 2.0
1.19k stars 509 forks source link

Slug should *just* be the name #973

Open peterbe opened 4 years ago

peterbe commented 4 years ago

At the moment, in the front-matter we have that the slug is a whole path. For example Web/HTML/Element/video when it just needs to, for example:

slug: video

One disadvantage with having the whole path in there is that if you need to rename a slug, you now need to git edit every single index.html that exists below. Another disadvantage is that it's a foot-gun. You might accidentally edit from slug: Web/HTML/Element/video to slug: We/HTML/Element/video without really noticing.

Lets make the importer just put slug: ${basename(document.slug)) in there. And in the Yari code, we can no longer make the url be ${locale}/docs/${slug} but it needs to become based on looping over the path and doing a Document.findByUrl(...) for each forward-slash.

Gregoor commented 4 years ago

I'm all for removing redundancy in our core source of truth, for the reasons you have given. findByURL wouldn't have to change fortunately, so retrieval is just as cheap. Constructing the URL on the other hand (which Document.read also does), would involve reading all the ancestors, or I guess reading the first one's URL and letting things recurse.

I'm also still wondering if we could fully move the truth to the FS. I.e. can we make slugToFoldername a two-way function?

peterbe commented 4 years ago

I'm also still wondering if we could fully move the truth to the FS. I.e. can we make slugToFoldername a two-way function?

The complication with that is that the cAsE would be lost. Yes, you can figure out that function_star_ should become function* when it's presented in sitemap.xml etc. But how do you deduce that generatorfunction should become GeneratorFunction

It might be as simple as this: If you can't figure out the slug based on the foldername, then you have to have a slug: ... in the front-matter.

In fact, that's how the UI for displaying flaws in the toolbar works. This way you get the best of both words. You don't break any existing URLs and you have minimal typing of things that are automatically implied from the file system.

peterbe commented 3 years ago

Chatted to @fiji-flo about this and we concluded that A) it's a good idea B) shame we didn't get it done earlier before the final migration.

We should go ahead and do this. It's useful not just for any unslug/unfreeze effort but it'll help the regular main en-US work too. It'll have to be a two-part change. 1) the Yari code 2) mass-edit on the content. What makes it tricky is that there are actually 4 repos (mdn/content, mdn/translated-content, mdn/translated-content-rendered, mdn/archived-content) so the timing has to either be perfect or we write the Yari code in a way that it's easier to do it incrementally/partially.

ghost commented 3 years ago

But what exactly is the slug for?

peterbe commented 3 years ago

But what exactly is the slug for?

The URL. For example, in https://developer.mozilla.org/de/docs/Web/JavaScript the slug is Web/JavaScript.

ghost commented 3 years ago

But what exactly is the slug for?

The URL. For example, in https://developer.mozilla.org/de/docs/Web/JavaScript the slug is Web/JavaScript.

Ok but I was wondering, is the folder in the repository not enough?

peterbe commented 3 years ago

But what exactly is the slug for?

The URL. For example, in https://developer.mozilla.org/de/docs/Web/JavaScript the slug is Web/JavaScript.

Ok but I was wondering, is the folder in the repository not enough?

The folders aren't always the same. Sometimes the casing is different. Sometimes the characters used are different.

ghost commented 3 years ago

Ok. Anyway i agree with You and this issue of let the slug be the name. 👍

escattone commented 3 years ago

@peterbe @fiji-flo I like this, and I think it's the way forward. In other words, let's make the slug be the properly-cased name (e.g., the slug for /en-US/docs/Web/JavaScript becomes just JavaScript). @peterbe and I discussed that perhaps the slug is only required if the casing is different from the folder name, so for example, /en-US/docs/Web/HTML/Element/video wouldn't need a slug (since video is the same as the folder name), but /en-US/docs/Web/HTML/Element would (since Element is different from the element folder name).

We've all discussed the option of dropping the slug altogether, making all of the URL's lowercase, and ensuring a one-to-one mapping between folders and URL's, but after speaking with one of our SEO experts at Mozilla, I felt like it was too much (too risky in SEO terms) right now. We'd have to ensure that our canonical URL's are all lowercase, and also redirect from the old URL's to the new lower-cased ones (e.g., /en-US/docs/Web/JavaScript --> /en-us/docs/web/javascript) or else we'd face SEO consequences.

ghost commented 3 years ago

https://seositecheckup.com/seo-audit/developer.mozilla.org/en-US/docs/Web/HTML/Element/video

Yes URL Canonicalization Test fails ❌

Ryuno-Ki commented 3 years ago

Okay, so what would be needed in code and content next?

peterbe commented 3 years ago

My recommendation is that we start gently. You start by writing a piece of code where we're currently extracting the slug from the front-matter and make it smart. Something like this:

-const slug = metadata.slug;
+let slug = metadata.slug || getThisFolderName()
+if (!slug.contains('/')) {
+   // It's one of those fancy new ones that just contain a single word
+   slug = `${getParentSlug()}/${slug}`;
+}

That way, at your convenience you can pick an index.html in mdn/content and either remove the slug or if it needs casing you rename it...

-slug: Web/HTML/Element
+slug: Element