docsifyjs / docsify

🃏 A magical documentation site generator.
https://docsify.js.org
MIT License
27.76k stars 5.68k forks

Improved SEO (meta tags support) #1235

Open dialex opened 4 years ago

dialex commented 4 years ago

Feature request

What problem does this feature solve?

It improves SEO of pages generated by Docsify. It makes it easier for search engines to find the relevant content, and when a user shares a link to a Docsify page, the rendered preview is more accurate (title, description and image).

Example (different pages, same SEO 😒):

[Image: Screenshot 2020-06-20 at 22 12 05]

What does the proposed API look like?

Marp allows you to write slides using Markdown, and each file has a small header like this:

---
marp: true
title: Marp CLI example
description: Hosting Marp slide deck on the web
theme: uncover
paginate: true
---

Docsify could have something similar that would allow us to define the meta tags of each page. Example:

---
<!-- Primary Meta Tags -->
meta-title: Site title: Page title
meta-description: This is the summary that appears on search engine results or preview links
<!-- Open Graph -->
meta-og-type: 
meta-og-url: 
meta-og-title: 
meta-og-description: 
meta-og-image: https://yourdomain.com/path/to/image.jpg
<!-- Twitter -->
meta-twitter-card: 
meta-twitter-url: 
meta-twitter-title: 
meta-twitter-description: 
meta-twitter-image: https://yourdomain.com/path/to/image.jpg
---

# Page title

And the _rest_ of the page.

How should this be implemented in your opinion?

I checked if it was possible to hack it with the current Docsify version by hardcoding HTML tags in the .md file.

<!-- markdownlint-disable MD033 -->
<!-- SEO tags -->
<title>(TITLE) (SEPARATOR) Title of this page</title>
<meta name="title" content="(TITLE) (SEPARATOR) Title of this page">
<meta name="description" content="A description that is specific to this page.">

# TITLE_HERE

[Image: failed attempt]

It didn't work. Those tags were added inside <body> ... <article>, but we need them to exist in the <head> section.

I propose this flow:

Example:

Docsify detects a header. One key is meta-dummy. Docsify doesn't know how to handle this key, so it skips it. The next key is meta-title. Docsify knows how to handle this, so it goes to head.meta.title and replaces the default value with the value in the file header.
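The flow described above could be sketched roughly as follows. This is only an illustration: the key table `KNOWN_META_KEYS` and the function `resolveMetaTags` are hypothetical names, not part of Docsify, and the header is assumed to be already parsed into a plain object.

```javascript
// Hypothetical table of header keys that the sketch knows how to handle.
// Each entry names the attribute/value pair identifying the <meta> tag.
const KNOWN_META_KEYS = new Map([
  ['meta-title', { attr: 'name', value: 'title' }],
  ['meta-description', { attr: 'name', value: 'description' }],
  ['meta-og-image', { attr: 'property', value: 'og:image' }],
  ['meta-twitter-card', { attr: 'name', value: 'twitter:card' }],
]);

// Turn a parsed page header into a list of tag descriptors.
// Unknown keys (for example "meta-dummy") and empty values are skipped.
function resolveMetaTags(header) {
  const tags = [];
  for (const [key, content] of Object.entries(header)) {
    const known = KNOWN_META_KEYS.get(key);
    if (!known || !content) continue;
    tags.push({ attr: known.attr, value: known.value, content: String(content) });
  }
  return tags;
}

const tags = resolveMetaTags({
  'meta-dummy': 'ignored',
  'meta-description': 'Summary shown in search results',
  'meta-og-image': 'https://yourdomain.com/path/to/image.jpg',
});
console.log(tags.map((tag) => `${tag.attr}="${tag.value}"`));
```

Each descriptor would then replace the default value of the matching tag in `<head>`; that DOM step is left out here.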

Are you willing to work on this yourself?

I don't think I have enough JS and Docsify knowledge to develop this feature. But I can (beta) test it!

anikethsaha commented 4 years ago

Interesting thought. Thanks for the detailed issue.

Yeah, we can do this, though I would implement it like this:

$docsify= {
 ...
 meta: [
   {type: '',  content: '' }
]
 ...
}

These will create new meta tags and append them at the end of the head, or something like:

$docsify= {
 ...
 meta: `<meta  name="" content="" >
   <meta  name="" content="" >
`
 ...
}

and this will simply be appended at the end.

Would love to hear others' suggestions.

dialex commented 4 years ago

And would we be able to add that $docsify = { ... } snippet to .md files, and thus customise it per page? Or does that apply only to the index.html file, where we declare and configure Docsify?

I would prefer the former :)

anikethsaha commented 4 years ago

Umm, yeah, I missed this; that would apply to only a single page.

We need some discussion about the approach, or perhaps a PoC.

cc @docsifyjs/core

sy-records commented 4 years ago

It should be supported by every md file: index.html gives a default configuration, and each file can be customized separately.
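A minimal sketch of that "defaults plus per-file override" idea, assuming both sides are available as plain objects. The names `siteDefaults` and `mergeMeta` are made up for illustration; Docsify does not define a `meta` option today.

```javascript
// Site-wide defaults, as they might be declared once in index.html.
// (Hypothetical shape; not an existing docsify option.)
const siteDefaults = {
  description: 'A magical documentation site generator.',
  keywords: 'docs, markdown',
};

// Per-page values win; keys the page omits or leaves empty fall back
// to the site-wide default.
function mergeMeta(defaults, pageMeta = {}) {
  const merged = { ...defaults };
  for (const [key, value] of Object.entries(pageMeta)) {
    if (value !== undefined && value !== null && value !== '') {
      merged[key] = value;
    }
  }
  return merged;
}

const merged = mergeMeta(siteDefaults, {
  description: 'How to configure plugins',
  keywords: '',
});
console.log(merged);
```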

anikethsaha commented 4 years ago

We need to read the meta data from the markdown file then.

---
meta1: 
meta2:
---

But this has an issue: I guess our markdown parser doesn't explicitly tell us whether these are markdown metadata or not. Also, this needs to be sent to the template generation method, and I think the template is generated first and then the markdown is compiled (not sure about this). If that is the case, we need to tweak the code.

sy-records commented 4 years ago

Yes, we may need to use a regular expression to match it, similar to the emoji matching :100:.

See what others think.
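For discussion, here is a rough sketch of what regex-based matching could look like. It is not Docsify's parser: `extractFrontMatter` is a hypothetical helper that only understands flat `key: value` lines, not full YAML.

```javascript
// Matches a leading block delimited by lines of three dashes.
const FRONT_MATTER = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/;

// Split a markdown string into header attributes and the remaining body.
function extractFrontMatter(markdown) {
  const match = FRONT_MATTER.exec(markdown);
  if (!match) return { attributes: {}, body: markdown };

  const attributes = {};
  for (const line of match[1].split(/\r?\n/)) {
    const separator = line.indexOf(':');
    if (separator === -1) continue; // not a "key: value" line
    const key = line.slice(0, separator).trim();
    const value = line.slice(separator + 1).trim();
    if (key) attributes[key] = value;
  }
  return { attributes, body: markdown.slice(match[0].length) };
}

const page = extractFrontMatter('---\nmeta1: first value\nmeta2:\n---\n\n# Page title\n');
console.log(page.attributes, JSON.stringify(page.body));
```

Because only the first colon on a line is treated as the separator, a value such as `Site title: Page title` stays intact.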

Koooooo-7 commented 4 years ago

I think we need to give docsify configuration inside content a specific prefix or symbol, such as {docsify- xxx}, to build those features. If we keep using a regex or similar to match simple characters, it will conflict with some other syntax one day. It is unpredictable.

---
meta1: 
meta2:
---

This way seems incompatible with the YAML front matter (#1129) for docsify. BTW, Jekyll handles SEO this way, and its jekyll-seo-tag plugin keeps it in an independent file.

trusktr commented 4 years ago

Does changing the meta tags at runtime (each time we change pages) have any effect on the browser? If not, then this is good only with SSR (or the upcoming static site generation).

Basic spiders/crawlers only read meta tags on the initial page load. If we modify them later, it has no effect.

Do we need this at runtime? Might be good for anyone with SSR though.

jhildenbiddle commented 4 years ago

Does changing the meta tags at runtime (each time we change pages) have any effect on the browser?

I think this gets into "how does Google's search algorithm work?" territory. I'm not enough of an SEO expert to know which tool(s) we could leverage to test client-side generated <meta> tags, but I assume they exist.

Maybe Google Structured Data can help? It looks like it can be generated with JavaScript and we can use the Rich Results Test tool to test our implementation. The downside to this approach (assuming it works as we expect) is that this is Google-specific. Ideally, we'd address SEO issues for all search engines, not just Google.

sy-records commented 4 years ago

This is suitable for doing when generating static html in v5.

trusktr commented 4 years ago

We don't need new syntax for this. Just <meta> tags. If the SSR or static generator sees them, it simply takes them and adds them to the <head>.

If we want new syntax, it can be an additional feature for later, after this one.

trusktr commented 4 years ago

Ideally, we'd address SEO issues for all search engines, not just Google.

The only way we can be 100% sure we address all search engines is having this functionality for SSR and static site generation and encouraging people to use those instead of the live markdown rendering.

We can also just add/remove the meta tags at runtime which is simple (and assume some crawler might be smart enough to trigger route changes and read the dynamically-updated meta tags), but I don't think we can make any guarantees on whether that's actually useful (I've never heard of that being useful, but who knows).

If the dynamic generation is part of the normal markdown processing, then it should just work with SSR and static generation too.

Maybe Google Structured Data can help?

I think that's a totally different feature, that we can tackle apart from (or before/after) this one, if we wish to.

trusktr commented 4 years ago

suitable for doing when generating static html in v5.

I think static generation can be a 4.x non-breaking feature, if we get to it before we get to v5.

jhildenbiddle commented 4 years ago

We're talking about different things.

Q: Can we improve SEO for docsify sites? A1: For server side rendering? Yes, using <meta> tags, possibly sooner than later. A2: For static site generation? Yes, using <meta> tags, but not until we have SSG. A3: For client-side rendering? Possibly. We aren't sure if/how <meta> tags created and injected on the client affect SEO.

Google Structured Data was offered as a way to address the uncertainty of A3 and address @trusktr's "Does changing the meta tags at runtime (each time we change pages) have any effect on the browser?" question. Google's documentation states specifically that this SEO-related information can be generated on the client, which would allow us to improve SEO for client-side only docsify sites. It's not about a new syntax; it's about improving SEO for the sites that need it most (client-side rendering) that also (I assume) make up the vast majority of docsify users. For the record, I'm not proposing we should go this route, only that we could explore it if client-side <meta> tags won't work.
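To make the structured-data option concrete, a sketch of generating JSON-LD on the client might look like the following. The helper names `buildJsonLd` and `injectJsonLd` are hypothetical, the choice of the schema.org `WebPage` type is an assumption, and whether any crawler acts on data injected this way is exactly the open question in this thread.

```javascript
// Build a JSON-LD string describing the current page.
function buildJsonLd({ title, description, url }) {
  const data = {
    '@context': 'https://schema.org',
    '@type': 'WebPage', // assumed type; other schema.org types may fit better
    name: title,
    description,
    url,
  };
  // Escape "<" so a value can never close the surrounding <script> element.
  return JSON.stringify(data).replace(/</g, '\\u003c');
}

// Browser-only step: place the JSON-LD in <head>. Not called here, so the
// sketch also runs outside a browser.
function injectJsonLd(json) {
  const script = document.createElement('script');
  script.type = 'application/ld+json';
  script.textContent = json;
  document.head.appendChild(script);
}

const json = buildJsonLd({
  title: 'Quick start </script>',
  description: 'Getting started with the docs',
  url: 'https://yourdomain.com/#/quickstart',
});
console.log(json);
```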

After a little digging, it looks like client-side generated <meta> tags might actually work (which would be awesome). I think I can do some tests using a custom plugin to mimic client-side YAML-to-Meta conversion, then check the results using Google's URL Inspection Tool to see how Google indexes the page. If we can inject <meta> tags on the client instead of using Google Structured Data, that would be great (and preferable).

trusktr commented 4 years ago

Q: Can we improve SEO for docsify sites? A1: For server side rendering? Yes, using <meta> tags, possibly sooner than later. A2: For static site generation? Yes, using <meta> tags, but not until we have SSG. A3: For client-side rendering? Possibly. We aren't sure if/how <meta> tags created and injected on the client affect SEO.

That's a good summary.

Note, SSR works already in my PR with a simple fix, but I'm trying to also add tests for it.

In all cases, the markdown output will have meta tags. In case of A3, a post render step moves the meta tags to the head with DOM methods. In cases of A1 and A2, a post render step moves the meta tags with string methods.
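As an illustration of the string-based variant (A1 and A2), a post-render step could look something like this. `hoistMetaTags` is a hypothetical name, and the regex is a simplification that assumes well-formed `<meta>` tags in the rendered output.

```javascript
// Matches <meta ...> tags plus any trailing whitespace.
const META_TAG = /<meta\b[^>]*>\s*/gi;

// Move every <meta> tag found after <body> to just before </head>.
function hoistMetaTags(html) {
  const bodyStart = html.search(/<body\b/i);
  if (bodyStart === -1) return html;

  const head = html.slice(0, bodyStart);
  const body = html.slice(bodyStart);
  if (!/<\/head>/i.test(head)) return html; // nowhere to put the tags

  const found = body.match(META_TAG) || [];
  if (found.length === 0) return html;

  const cleanedBody = body.replace(META_TAG, '');
  const metaBlock = found.map((tag) => tag.trim()).join('\n');
  // A replacer function keeps "$" in tag content from being interpreted.
  const newHead = head.replace(/<\/head>/i, () => `${metaBlock}\n</head>`);
  return newHead + cleanedBody;
}

const input =
  '<html><head><title>Docs</title></head><body><article>' +
  '<meta name="description" content="Page summary"><h1>Page</h1>' +
  '</article></body></html>';
const output = hoistMetaTags(input);
console.log(output);
```

Tags shown inside code samples are normally HTML-escaped by the markdown renderer, so they would not match the pattern. The client-side variant (A3) would do the same move with DOM methods instead.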

Google's documentation states specifically that this SEO-related information can be generated on the client

Is there any limitation? For example, does it need to be generated with document.write or similar, so that the crawler will see it as the page loads? Or does it provide for a way to signal to Google when it is ready?

If there is some way for Google to wait for the data to be ready, maybe we can re-write meta tags into Google Structured Data so the Docsify user can just use the cross-browser tags?

jhildenbiddle commented 4 years ago

Is there any limitation? For example, does it need to be generated with document.write or similar, so that the crawler will see it as the page loads? Or does it provide for a way to signal to Google when it is ready?

Dunno. I own about 10 minutes worth of knowledge on Google Structured Data. :)

If there is some way for Google to wait for the data to be ready, maybe we can re-write meta tags into Google Structured Data so the Docsify user can just use the cross-browser tags?

That's what I was thinking with the original suggestion. If we have page-specific meta data provided via YAML, there's nothing preventing us from generating <meta> tags for SSR and SSG scenarios but Google Structured Data for client-side rendered sites. Then again, from what I read it sounds like dynamic <meta> tags might work, too. Either way, we've got some work to do.

One important note: apparently the time of injection matters, so waiting until after a request for content has completed so we can get meta information from each markdown file may not work. Although I don't love this idea, injecting <meta> tags client-side may only work if meta data is stored in JavaScript, loaded in index.html, and injected on page change before asynchronous functions like content requests.
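A sketch of the "meta data stored in JavaScript" idea: a table shipped with index.html that can be consulted synchronously on a route change, before any content request. The table `PAGE_META` and the function `metaForPath` are hypothetical, and the longest-prefix fallback is a design assumption, not something proposed above.

```javascript
// Hypothetical route-to-metadata table loaded together with index.html,
// so no markdown request is needed before the tags can be written.
const PAGE_META = {
  '/': { description: 'Project overview' },
  '/guide/': { description: 'Guides and tutorials' },
  '/guide/plugins': { description: 'Writing plugins' },
};

// Return the metadata of the longest table entry that matches the path:
// either an exact match or a directory-style prefix ending in "/".
function metaForPath(table, path) {
  let best = null;
  for (const prefix of Object.keys(table)) {
    const matches =
      path === prefix || (prefix.endsWith('/') && path.startsWith(prefix));
    if (matches && (best === null || prefix.length > best.length)) {
      best = prefix;
    }
  }
  return best === null ? {} : table[best];
}

console.log(metaForPath(PAGE_META, '/guide/plugins').description);
console.log(metaForPath(PAGE_META, '/guide/setup').description);
```

The obvious cost, as noted above, is that the metadata lives apart from the markdown files it describes.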

ghost commented 4 years ago

I made a Jekyll theme from docsify: https://rundocs.github.io/jekyll-theme-docsify/

SEO is supported. Still need your help!

jcayzac commented 3 years ago

You can trivially inject meta like this:

index.html

  <title>Foo</title>
+ <meta name="description" content>
+ <meta name="keywords" content>

SEO plugin

'use strict';

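/*
 * Update the page info based on frontmatter data or on the configured
 * generator.
 */
(function (window) {
  // Set the content of an existing <meta name="..."> tag. Names without a
  // matching tag in index.html are skipped instead of throwing.
  function meta(name, content) {
    const element = document.querySelector(`meta[name="${name}"]`)
    if (element) element.content = content || ''
  }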

  function plugin(hook, vm) {
    const refreshInfo = () => {
      const { config, route, frontmatter } = vm
      const { description, keywords } = frontmatter || {}
      const entries = { description, keywords }

      for (const key in entries) {
        let value = entries[key]

        if (value === undefined) {
          const defaultValue = config.seo?.[key]
          if (typeof defaultValue === 'function') value = defaultValue(route, frontmatter || {})
          else value = defaultValue
        }

        meta(key, value)
      }
    }

    hook.init(refreshInfo)
    hook.doneEach(refreshInfo)
  }

  window.$docsify = window.$docsify || {}
  window.$docsify.plugins = window.$docsify.plugins || []
  window.$docsify.plugins.push(plugin)
})(this)

Config

Optional: For each meta, provide either a string or a function that returns the value.

 window.$docsify = Object.assign(window.$docsify || {}, {
+  seo: {
+    description: (route, frontmatter) => {
+      if (route.path.startsWith('/ja/')) return `一般的な説明`
+      else return `Some default description`
+    },
+    keywords: `foo, bar, baz`,
+  },
   pagination: {

dialex commented 2 years ago

Hey! Is there any update? Any blocker preventing this feature from being part of Docsify?

jhildenbiddle commented 2 years ago

Hey @dialex --

I believe there are two reasons why this hasn't been worked on:

  1. This falls into the larger "client-side SEO" bucket of work that we've discussed previously but nobody has taken the reins on. One reason for this is because...
  2. This is not an easy thing to test because it requires waiting for search engines like Google and Bing to crawl the website and (maybe?) detect the changes dynamically generated on the client. Google and Bing are a black box to anyone outside of those companies, so there's no way of knowing for certain when or if things will work.

Injecting <meta> tags on page loads is the easy part and is trivial to implement via a plugin. The bigger question is how search engines will respond to this. Years ago, there was no chance that they would detect these types of client-side changes. Today, search engines are smarter about crawling sites rendered on the client but how they do it is uncertain and continues to change over time. This is why most people prefer static websites when SEO is critical.

I created a quick "proof-of-concept" to demonstrate how this might be implemented via a plugin:

The easiest way to see what the plugin is doing is by using the browser's dev tools to see the updates happening in the <head> element as you navigate between pages. Messages are also sent to the console as <meta> tags are appended and updated.

When I say "proof of concept", I mean it. The POC doesn't handle a number of common meta tag use cases such as multiple attributes on a single tag and intelligently removing meta tags injected from previous pages. Those things could be added easily enough as well though.
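One of the gaps mentioned above, removing tags injected for a previous page, could be approached by diffing the previous and next sets of tags. The sketch below is not taken from the POC; `planMetaUpdate` is a hypothetical helper that works on plain objects keyed by meta name and leaves the DOM work to the caller.

```javascript
const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Compare the tags injected for the previous page with those wanted for the
// next one and report which names to remove, update, or add.
function planMetaUpdate(previous, next) {
  const plan = { remove: [], update: [], add: [] };
  for (const name of Object.keys(previous)) {
    if (!has(next, name)) plan.remove.push(name);
    else if (previous[name] !== next[name]) plan.update.push(name);
  }
  for (const name of Object.keys(next)) {
    if (!has(previous, name)) plan.add.push(name);
  }
  return plan;
}

const plan = planMetaUpdate(
  { description: 'Page A', keywords: 'docs', 'og:image': 'a.png' },
  { description: 'Page B', keywords: 'docs', author: 'me' }
);
console.log(plan);
```

Tags that are unchanged between pages appear in none of the three lists, so they would not be touched.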