ampproject / amphtml

The AMP web component framework.
https://amp.dev
Apache License 2.0

Template tags parsed by search crawlers causing 404 links etc. #23896

Closed: radzhome closed this issue 3 years ago

radzhome commented 5 years ago


What's the issue?

If a link inside an amp-mustache template takes its href from a template variable, as in the following (see {{ get_amp_long_url }}):

<template type="amp-mustache">
                            <div class="article-page-list-item">
                                <div class="article-page-list-item-top">
                                    <div class="article-page-list-item-top-left">
                                        <!-- List category -->
                                        {{ #category.name }}<p class="article-category">{{{ category.name }}}</p>{{ /category.name }}

                                        <!-- List title -->
                                        <a class="no-underline article-text-a" href="{{ get_amp_long_url }}" title="Link to {{ title }}">
                                            <amp-fit-text class="article-title" width="374" height="117" layout="responsive" min-font-size="18" max-font-size="18"><h3>{{ title }}</h3></amp-fit-text>
                                        </a>
                                    </div>

                                    {{ #image_url }}<a class="no-underline article-page-list-item-top-right article-a-image" href="{{ get_amp_long_url }}" title="Link to {{ title }}">
                                        <amp-img class="article-image" alt="" src="{{ image_url }}?quality=90&strip=all&w=150" width="{{ #image_width }}{{ image_width }}{{ /image_width }}{{ ^image_width }}1.33{{ /image_width }}" height="{{ #image_height }}{{ image_height }}{{ /image_height }}{{ ^image_height }}1{{ /image_height }}" layout="responsive"></amp-img>
                                    </a>{{ /image_url }}
                                    {{ ^image_url }}{{ /image_url }}
                                </div>

                                <div class="article-page-list-item-bottom">
                                    <a class="no-underline article-text-a" href="{{ get_amp_long_url }}" title="Link to {{ title }}">
                                        {{ #author }}
                                        <amp-fit-text class="published-by" width="374" height="40" layout="responsive" min-font-size="18" max-font-size="18"><p>By {{ author }}</p></amp-fit-text>
                                        {{ /author }}

                                        <p class="published-date"><span class="published-time">{{ published_since }}</span><!--<span class="published-dot"> &middot; </span><span class="read-time">1 minute read</span>  To-Do - Add word count to article lists --></p>
                                    </a>
                                </div>
                            </div>
                        </template>

The rendered URL is fine, but the raw template markup stays in the DOM.

The <template type="amp-mustache"> element is not displayed, but it remains in the page source and is crawlable by bots. As a result, Googlebot tries to index the unrendered link as:

https://beta.canada.com/travel/international-travel/a-taste-of-australias-hunter-valley/wcm/a540a92b-95df-41b0-9f3d-e87a78b2f5d8/amp/{{ get_amp_long_url }}
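To illustrate how a crawler could arrive at that URL, here is a minimal sketch of a naive link extractor. It is an assumption about crawler behaviour, not a description of how Googlebot actually works: it uses Python's standard `html.parser`, which treats an anchor inside `<template>` like any other anchor, and it resolves the unrendered placeholder as a relative URL. `LinkCollector`, `PAGE_URL` and `SOURCE` are hypothetical names, and `PAGE_URL` is assumed to be the reported URL minus the placeholder.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects every <a href>, including anchors nested inside <template>."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)


# Assumed page URL: the URL from this report without the placeholder.
PAGE_URL = (
    "https://beta.canada.com/travel/international-travel/"
    "a-taste-of-australias-hunter-valley/wcm/"
    "a540a92b-95df-41b0-9f3d-e87a78b2f5d8/amp/"
)

# Reduced version of the template from this report.
SOURCE = """
<template type="amp-mustache">
  <a href="{{ get_amp_long_url }}" title="Link to {{ title }}">{{ title }}</a>
</template>
"""

collector = LinkCollector()
collector.feed(SOURCE)

# The placeholder has no scheme or leading slash, so it is resolved
# relative to the page URL, which yields a link that does not exist.
resolved = [urljoin(PAGE_URL, href) for href in collector.hrefs]
print(resolved)
```

Any parser that does not special-case `<template>` content would pick up the placeholder href in the same way.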

Putting rel="nofollow" on the links within the template does not help, because the rendered links carry the same attribute and will not be indexable either.

How do we reproduce the issue?


  1. Within your template, add a link
  2. View source of your mustache amp list and find the