gohugoio / hugo

The world’s fastest framework for building websites.
https://gohugo.io
Apache License 2.0

pagination: Include pager number in pager's Permalink/RelPermalink values #4507

Open MunifTanjim opened 6 years ago

MunifTanjim commented 6 years ago

Example:

In the https://example.com/blog/page/2/ page, the .Permalink is set to https://example.com/blog/.

Shouldn't .Permalink be set to https://example.com/blog/page/2?

Or is this by design?

Use case:

Quoted from Pagination & SEO: best practices:

Google is very clear now: each page within a paginated series should canonicalize to itself, so /page/2/ has a canonical pointing to /page/2/

If the .Permalink is set to https://example.com/blog/page/:number on paginated list templates, using the following code for including link(rel=canonical) tag to the HTML head would follow the best practice by default.

<link rel='canonical' href='{{ .Permalink }}'>

Otherwise, with the current .Permalink implementation, we have to come up with hacks that include checking if it's a list template and then extract the page number from the paginator pages.

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. The resources of the Hugo team are limited, and so we are asking for your help. If this is a bug and you can still reproduce this error on the master branch, please reply with all of the information you have about it in order to keep the issue open. If this is a feature request, and you feel that it is still relevant and valuable, please tell us why. This issue will automatically be closed in the near future if no further activity occurs. Thank you for all your contributions.

MunifTanjim commented 5 years ago

Still applicable for Hugo v0.44

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. The resources of the Hugo team are limited, and so we are asking for your help. If this is a bug and you can still reproduce this error on the master branch, please reply with all of the information you have about it in order to keep the issue open. If this is a feature request, and you feel that it is still relevant and valuable, please tell us why. This issue will automatically be closed in the near future if no further activity occurs. Thank you for all your contributions.

MunifTanjim commented 5 years ago

Still applicable for Hugo v0.51

krispkrisp commented 3 years ago

This is a very important issue for larger websites, since a wrong canonical URL on paginated pages prevents proper crawling and indexing, which leads to lower organic traffic.

somethingSTRANGE commented 2 years ago

I was looking for a way to access the permalink for paginated pages when I came across this issue. It would certainly be great to have a .Permalink that matched the generated page URL.

There was a useful article that helped with manually constructing paginated permalinks, but it only worked on a site's homepage and it failed if uglyURLs and/or paginatePath were overridden in the site's config.

The following may be useful to others looking to access a permalink that should mirror the paginated page's URL. I've tested it on paginated taxonomy lists, post/article lists, the home page, and elsewhere, and it seems to always produce the correct permalink.

If you're not using ugly URLs and a modified paginate path, you can use the following:

Tip: You only need one reference to .Paginate per template. If your template already has one, update the $paginator := .Paginate ... line below to reuse it.

{{ $paginator := .Paginate (where .Site.RegularPages ".Params.post" "!=" false) }}

{{ $permalink := .Permalink }}
{{ with $paginator }}
    {{ if and (or $.IsHome $.IsNode) (ne .PageNumber 1) }}
        {{ $permalink = print $permalink "page/" .PageNumber "/" }}
    {{ end }}
{{ end }}

When uglyURLs and/or paginatePath are overridden, it gets a bit more complex, but it's still pretty straightforward.

Warning: The config settings uglyURLs and paginatePath don't appear to be accessible in templates. If you've overridden them in your site config, duplicate them in your params so that templates can read them.
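A minimal sketch of that duplication, assuming a TOML site config. The values `true` and `"p"` are placeholders chosen for illustration, not recommendations; the keys under `[params]` simply mirror the top-level settings so that `$.Param "uglyURLs"` and `$.Param "paginatePath"` can find them.

```toml
# Site config (e.g. config.toml); values are illustrative placeholders.
uglyURLs = true
paginatePath = "p"

# Mirror of the two settings above so templates can read them via $.Param.
# The two copies must be kept in sync by hand.
[params]
  uglyURLs = true
  paginatePath = "p"
```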

{{ $uglyURLs := $.Param "uglyURLs" | default false }}
{{ $paginatePath := $.Param "paginatePath" | default "page" }}
{{ $paginator := .Paginate (where .Site.RegularPages ".Params.post" "!=" false) }}

{{ $permalink := .Permalink }}
{{ with $paginator }}
    {{ if and (or $.IsHome $.IsNode) (ne .PageNumber 1) }}
        {{ $permalink = print $permalink $paginatePath "/" .PageNumber (cond $uglyURLs ".html" "/") }}
    {{ end }}
{{ end }}

jmooring commented 2 years ago

From Google's documentation, last updated 2021-11-22, emphasis in original:

Don't use the first page of a paginated sequence as the canonical page. Instead, give each page its own canonical URL.

arif254 commented 2 years ago

@bep web.dev has started to issue a warning for this and penalizes it in the SEO score. (None of the suggested workarounds I found here and online work.)

Document does not have a valid `rel=canonical`

Points to the domain's root URL (the homepage), instead of an equivalent page of content

tyytytytyiigo commented 1 year ago

This issue has had me scratching my head for days! This, coupled with the fact that Hugo caches the first instance of Paginator or Paginate, has really tested my limits. Is there any plan to solve this?

soul-ride commented 7 months ago

Hopefully this will be resolved in the near future. According to all the SEO guides I found, paginated pages should have self-referencing canonicals nowadays.

While there are several articles on workarounds in the Hugo forum, they are quite hard to implement in some situations.

I ended up disabling the canonical link element in my theme and using JS for this, as described here - https://developers.google.com/search/docs/crawling-indexing/javascript/javascript-seo-basics#properly-inject-canonical-links

It's also possible to use HTTP headers, which may be even better than JS - https://developers.google.com/search/docs/crawling-indexing/consolidate-duplicate-urls#rel-canonical-header-method

wu0407 commented 6 months ago

It also affects checking whether the current page is the homepage: https://discourse.gohugo.io/t/how-to-check-if-current-page-is-homepage/23301

The current solution is to control pagination from baseof.html:

https://github.com/jmooring/hugo-testing/blob/hugo-forum-topic-37643/layouts/_default/baseof.html
https://discourse.gohugo.io/t/control-pagination-and-page-collections-from-baseof-html/37643/8
https://discourse.gohugo.io/t/determine-if-current-page-is-result-of-pagination/37494/4

wu0407 commented 6 months ago

It also affects the tag og:url https://github.com/gohugoio/hugo/blob/6f13430d4a3b0d8b196f13958fbfb6478be1f3aa/tpl/tplimpl/embedded/templates/opengraph.html#L4

skrysmanski commented 4 months ago

Just for reference, I used to use this to determine the permalink:

{{- strings.TrimSuffix "/" site.BaseURL -}}{{- .Paginator.URL -}}

Unfortunately, for me, this has unintended side effects like resetting the sorting on the paginator and creating paginated 404 pages. So it's not a real solution. (I am still looking for a solution.)

tyler-copilot commented 3 months ago

This is still an issue, and it needs to be fixed.

robrich commented 3 months ago

I found a decent workaround in https://github.com/calintat/minimal/pull/122/files

saikadaramakaisosjupita commented 3 months ago

Here's the best solution I found in the discourse forum...

    {{- $canonicalURL := .Permalink -}}
    {{- with $paginator -}}
      {{- if gt $paginator.PageNumber 1 }}
        {{- $canonicalURL = .URL | absLangURL -}}
      {{- end }}
    {{- end -}}

But you must ensure that the $paginator value is the same as the one in your templates (home or list), because Hugo caches the first instance of .Paginator or .Paginate. An if/else can help here too.

jmooring commented 2 months ago

Simple example:

git clone --single-branch -b hugo-github-issue-4507 https://github.com/jmooring/hugo-testing hugo-github-issue-4507
cd hugo-github-issue-4507
hugo server

Although .Paginator.URL gives us the value we want, we can't use this at the top of a template (e.g., baseof.html) because it invokes pagination and the result is cached. The only workaround that I know of isn't pretty:

https://discourse.gohugo.io/t/control-pagination-and-page-collections-from-baseof-html/37643/8

In the above, you have to control all pagination from the top of your baseof.html template. That way you can access the .Paginator values later on.
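A minimal sketch of that idea, not the exact code from the linked forum post. The choice of page collections (all regular pages on the home page, `.Pages` elsewhere), the `.Kind` check, and the `$canonicalURL` variable name are assumptions for illustration. The `absLangURL` conversion mirrors the snippet posted earlier in this thread and should be verified on sites whose baseURL includes a subdirectory.

```go-html-template
{{/* layouts/_default/baseof.html (sketch; adjust the page collections to your site) */}}
{{ $canonicalURL := .Permalink }}

{{/* Paginate only list-like kinds, so other pages (e.g. the 404 page) are left alone. */}}
{{ if in (slice "home" "section" "taxonomy" "term") .Kind }}
  {{/* Assumption: the home page lists all regular pages; other lists use their own .Pages. */}}
  {{ $pages := .Pages }}
  {{ if .IsHome }}
    {{ $pages = site.RegularPages }}
  {{ end }}

  {{/* This is the first pagination call for the page, so it is the one Hugo caches. */}}
  {{ $paginator := .Paginate $pages }}
  {{ if gt $paginator.PageNumber 1 }}
    {{ $canonicalURL = $paginator.URL | absLangURL }}
  {{ end }}
{{ end }}

<!DOCTYPE html>
<html>
<head>
  <link rel="canonical" href="{{ $canonicalURL }}">
</head>
<body>
  {{ block "main" . }}{{ end }}
</body>
</html>
```

Home and list templates would then read the cached result through `.Paginator` (for example `{{ range .Paginator.Pages }}`) rather than calling `.Paginate` again with a different collection.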

andreashaerter commented 1 month ago

@somethingSTRANGE This won't work in every case as https://github.com/gohugoio/hugo/issues/2449 introduced multilang support for site.paginatePath (so setting it in a language file is possible):

{{ $uglyURLs := $.Param "uglyURLs" | default false }}
{{ $paginatePath := $.Param "paginatePath" | default "page" }}
{{ $paginator := .Paginate (where .Site.RegularPages ".Params.post" "!=" false) }}

{{ $permalink := .Permalink }}
{{ with $paginator }}
    {{ if and (or $.IsHome $.IsNode) (ne .PageNumber 1) }}
        {{ $permalink = print $permalink $paginatePath "/" .PageNumber (cond $uglyURLs ".html" "/") }}
    {{ end }}
{{ end }}

I therefore quickly hacked a partial which uses regular expressions to get the paginatePath:

{{ $canonicalUrl := "" }}

[...]
  {{ $canonicalUrl = .Permalink }}
  {{ if and .IsNode .Paginator }}
    {{ if gt .Paginator.PageNumber 1 }}
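      {{/* \d+ also matches page numbers of 10 and above; \. matches a literal dot; ^.* also allows pagers directly under the site root (e.g. /page/2/). */}}
      {{ $paginatePath := (replaceRE `^.*/(.+)/\d+(/|\.html)$` "$1" .Paginator.URL) }}
      {{ $urlEnding := (replaceRE `^.*/(.+)/\d+(/|\.html)$` "$2" .Paginator.URL) }}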
      {{ if or (and (ne $urlEnding "/") (ne $urlEnding ".html"))
               (eq $paginatePath "")
               (eq $paginatePath .Paginator.URL)
               (eq $paginatePath $canonicalUrl) }}
        {{ errorf "[theme] function/getCanonicalUrl.html: invalid detection of paginatePath (result: %q) or URL ending (result: %q). Please check your Hugo config (paginatePath and/or uglyURLs)." $paginatePath $urlEnding }}
      {{ end }}
      {{ $canonicalUrl = (printf "%s%s/%d%s" $canonicalUrl $paginatePath .Paginator.PageNumber $urlEnding) }}
    {{ end }}
  {{ end }}

[...]

{{ return $canonicalUrl }}

Not really nice, but works...

jmooring commented 2 weeks ago

This is a much simpler approach: https://discourse.gohugo.io/t/pagination-using-pager-urls-within-link-elements/50340