WP: Why do AMP frequently fail to load images in Safari?

tunetheweb commented 4 years ago

What's the issue?

AMP pages often have missing images or other content in iOS.

How do we reproduce the issue?

Buy someone on the team an iPhone and let them use it as their primary device. They’ll soon spot broken pages.

Here’s the latest broken page which has caught my ire and finally made me come here to post this.

https://rigor.com/blog/web-performance-bookshelf/amp?__twitter_impression=true - AMP validation passes but none of the main content loads (initial paragraph loads but scroll down and the actual books don’t load).

And here’s a handful more just by doing a Google search for “AMPhtml” and checking out the lightning bolt results to see which are broken:

https://www.ampproject.in/docs/ads/a4a_spec.html (AMP validation passes but large white space just above Boilerplate heading).
https://amp.dev/documentation/guides-and-tutorials/learn/intro-to-amphtml-ads/ - AMP validation passes but gap above menu bar when opening from Google SERPS and scrolling down the page as the “real URL” bar vanished and page menu bar doesn’t move up to fill the gap.
https://www.google.ie/amp/s/www.fenj.nl/seo/wat-is-amp-html/amp/ - Passes AMP Validation but headline is all broken - maybe due to a bad font?
https://www.google.ie/amp/s/resultadosdigitais.com.br/blog/o-que-e-google-amp-html/amp - Passes AMP Validation but scrolling up and down leads to long pauses of white space - in an iPhone XS (so a pretty decent phone!) on Home WIFI (so pretty decent connection).
https://www.google.ie/amp/s/swapps.com/blog/what-its-amp-and-why-you-should-use-it/amp/ - Passes AMP validation but headline is unreadable - black on dark blue. Real page has white font.

Will add more as I spot them.

I suspect in a lot of cases they are poorly coded sites. But then why do they pass AMP validation? And if AMP is so brittle and HTML is so robust (all of the above sites load fine with non-AMP versions) then why is AMP better?

What browsers are affected?

Safari - latest IOS 13.3.1 and every single version of AMP on Safari on iOS since AMP was launched has had these issues.

Which AMP version is affected?

Every single version of AMP on Safari on iOS since AMP was launched has had these issues.

AMP is a terrible experience in iOS. I presume it’s better in Chrome but maybe it’s the same there? If not can you browser detect iOS and automatically redirect those users to the real site? I for one would massively appreciate that as clicking the little bar, or opening in Safari from in app browser (e.g. Twitter) and then editing the URL to guess the non-AMP URL is massively frustrating and ruining the web for me.

tunetheweb commented 4 years ago

For reference a link to the last time I complained about this three years ago: https://github.com/ampproject/amphtml/issues/10267

I had hopes it would get better after that fix landed but alas it did not.

wassgha commented 4 years ago

On the first link I'm seeing

:root:not(#_):not(#_):not(#_):not(#_):not(#_) .amp-wp-68d3e6a {
    opacity: 0;
}

which seems to be why the content is hidden. Tagging @westonruter if that one is a wordpress related bug

westonruter commented 4 years ago

Hello! I can see the same exact styling on the non-AMP version: https://rigor.com/blog/web-performance-bookshelf/

Importantly, the opacity:0 gets replaced with opacity: 1 by JavaScript once the page has loaded:

So JavaScript is currently required for the table to display.

Since custom JavaScript is not allowed in AMP (in this way), the AMP plugin is stripping it out to make the page valid AMP.

However, the issue is not limited to AMP pages. Any user has JS turned off in the browser will also get the same broken experience. My first screenshot above shows what the page looks like with JS turned off. If the table is made to work properly when JS is turned off, then this will by default become the AMP-experience since custom scripts are removed.

So this appears to be working exactly as expected.

tunetheweb commented 4 years ago

So this appears to be working exactly as expected.

Technically yes. To the end user (me), no.

As I said above:

I suspect in a lot of cases they are poorly coded sites. But then why do they pass AMP validation? And if AMP is so brittle and HTML is so robust (all of the above sites load fine with non-AMP versions) then why is AMP better?

Should the AMP validator fail here when large parts of the page show as blank (95.07% of the page in this case) so it's not recognised as valid AMP and so it doesn't get used in SERPs and falls back to HTML version?

I know Google Search Console already flags some such issues as warnings and errors so should this be extended to this sort of thing?

Do other tools that serve AMP where available (e.g. Twitter) also only use AMP if they pass the validator?

kristoferbaxter commented 4 years ago

@westonruter

Can we open a separate issue on amp-wp regarding the highlighted issue with opacity and plugins that generate suboptimal output?

westonruter commented 4 years ago

@kristoferbaxter I'm not sure what can be done in this case? If there is functionality that doesn't have fallbacks for when JS is disabled, it seems unavoidable for it to also be broken in AMP. There's seems an endless number of ways that content can be broken which depends on scripts to initialize. There's the example here of an opacity:0 inline style, but it could be any other styles like visibility:hidden or display:none. Such styles could be conditionally applied as well based some class being present, for example:

...
<head>
    <style>
        body.loading #interactive { visibility:hidden }
    </style>
</head>
<body class="loading">
    <div id="interactive">...</div>
    <script>document.body.classList.remove('loading')</script>
</body>
...

When such a page is processed by the AMP plugin, the script would be removed and the div#interactive would never be made visible.

Ultimately if content does not have no-JS fallbacks it will be broken in both AMP and also in non-AMP when users have scripting turned off. The grid example is broken not only in terms of the opacity:0 but also in that the filtering tabs at the top of the grid also fail to work without scripting:

When scripting is not available, the grid should not have opacity:0 and the tabs should be hidden entirely.

I think the fix here is for the underlying page to implement graceful degradation for when scripting is not available. The AMP plugin does all it can do to support this graceful degradation, including the unwrapping of noscript elements so that their contents automatically become part of the AMP version.

kristoferbaxter commented 4 years ago

Could the Wordpress integration flag Wordpress plugins that create this kind of invalid output?

Additionally, when amp-wp removes script is there analysis done on the nature of the script?

On the root issue, @Gregable, could the Validator be improved in a meaningful way to classify these kinds of document issues?

westonruter commented 4 years ago

Could the Wordpress integration flag Wordpress plugins that create this kind of invalid output?

Yes, the WordPress plugin does flag the inclusion invalid scripts, with the caveat being that this is not currently enabled in Reader mode (which is being used here). The Reader mode involves using simplified mobile-friendly templates that have that blue header on the top. In Transitional mode and Standard mode, the active theme's templates and stylesheets are used to create AMP pages. (Transitional mode retains paired AMP model, whereas Standard mode is AMP-first.)

In Standard/Transitional mode, if you activated some plugin called “Filterable Table” which added a script to the page, you'd then see this in the WordPress admin bar:

Clicking the “Review 1 validation issue” link takes you to the Validated URL screen:

Notice the “Removed” status means that the invalid script is being removed, and thus any functionality it adds to the page is omitted. If the status is changed to “Kept” then the script is not removed and the page then becomes non-AMP (in Transitional mode it redirects to non-AMP, whereas in Standard mode the amp attribute is removed from the html element).

Expanding the validation error reveals details about which theme/plugin is responsible for adding it, as well as the URL to the script to inspect what behavior it is adding to the page:

Additionally, when amp-wp removes script is there analysis done on the nature of the script?

An analysis on the logic contained in the script is not performed. I'm not sure what could feasible be done in that regard. The validation results in the AMP plugin are intended to provide tools for users to debug why their pages may appear partially broken in AMP. The “Paired Browsing” mode provides an interface to view the AMP and non-AMP side-by-side to compare the difference in behavior (if any).

The only clear action item I can see here for the AMP plugin is to incorporate the validation error reporting in Reader mode. This would make sense to work on as part of https://github.com/ampproject/amp-wp/issues/2044.

kristoferbaxter commented 4 years ago

Fantastic analysis Weston, thank you.

Gregable commented 4 years ago

On the root issue, @Gregable, could the Validator be improved in a meaningful way to classify these kinds of document issues?

This is not a validator issue. The page as delivered to the validator includes CSS that hides most of the document. The fundamental problem is that we assume all documents can be converted from non-amp to amp in a completely automated way, which in the presence of a turing-complete language (javascript) is simply not a true assumption.

I see no complete solution that does not require human intervention to identify when the tooling fails.

Additional automated steps could be proposed which may solve this specific case, or even a class of cases. For example, a rendering step in Google Search which detects if a document is entirely a single color. However, it's entirely possible that the render is fine, but no links on the document now work, or there is an overlayed image that can't be dismissed, etc. In the end, this is an unsolvable problem without involving a human.

tunetheweb commented 4 years ago

The fundamental problem is that we assume all documents can be converted from non-amp to amp in a completely automated way, which in the presence of a turing-complete language (javascript) is simply not a true assumption.

I agree. But to me this is a failure of Google over-pushing AMP with enhanced SERP rewards, CMS's turning on AMP solutions by default and also site owners thinking this can be converted with no effort just by turning on a plugin.

It is also a failure of sites like Google and Twitter serving AMP by default, without giving the user the choice of opting out of AMP permanently. Google SERPS at least allow the quick option to eject out of AMP, but other clients (e.g. Twitter) do not.

AMP has made this problem - again the real HTML pages in these examples work - so I think it is poor to say there is no solution.

But I'm not here to (just!) moan, I've a suggested solution: the AMP validator apparently currently highlights issues like this:

So it already compares pages and reports on differences.

Why can a similar check not be added to the AMP validator which compares the two pages and if the AMP page has a large area of white space (say of at least 200px x 200px) and the real HTML page does not have any large areas of white space, then the AMP page is considered broken and fails validation and then should be ignored in preference to the real HTML version by all clients that prefer AMP (Google SERPs, Twitter...etc.)?

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

ampproject / amphtml