ampproject / amp-wp

Enable AMP on your WordPress site, the WordPress way.
https://wordpress.org/plugins/amp/
GNU General Public License v2.0
1.79k stars 382 forks source link

No CSS compatibility: characters allowed per spec are rejected #7238

Closed anrghg closed 1 year ago

anrghg commented 2 years ago

Bug Description

includes/sanitizers/class-amp-style-sanitizer.php:2150 has this regex to extract IDs: `/#([a-zA-Z0-9_-]+)/', resulting in not handling IDs containing (allowed) “ISO 10646 characters U+00A0 and higher”, so CSS targeting these is removed when AMP is on.

Since the page slug may be a body class, the same holds true for class names.

On the other hand, class names with a leading digit are kept, while the browser discards them per CSS.

This bug also occurs in WordPress Core and has now been reported.

I thougt the issue was only with the regexes fetching the classes and IDs, in includes/sanitizers/class-amp-style-sanitizer.php:2140..2158, but making these less specific did not fix it.

Expected Behaviour

Spec-conformant CSS classes and IDs are processed so that the “Page slug as a body class” feature works as intended also with Latin-1 and non-Latin slugs and with emoji in the slug, documented in https://anrghg.sunsite.fr/publishing-helper/features/helpers/#127-the-second-way-rules-are-added-to-custom-css-in-the-theme-customizer-or-in-a

Eventually, “insane” classes i.e. with a leading digit could be sanitized with a prepended underscore. However, that would require editing the rest of HTML as well, which might not be what the plugin is designed for. So ultimately, leading digits in selectors should cause the affected rules to be discarded as not actionable and thus useless.

Example pages

  1. In Twenty Twenty-Two: 1.1. With AMP: https://anrghg.sunsite.fr/test-amp-compat/64-characters-%e2%96%b6-css-allows-all-non-ascii-%f0%9f%98%91/ 1.2. Without AMP: https://anrghg.sunsite.fr/test-css-compat/64-characters-%e2%96%b6-css-allows-all-non-ascii-%f0%9f%98%91/
  2. In GeneratePress: 2.1. With AMP: https://anrghg.sunsite.fr/test-amp-gp-compat/64-characters-%e2%96%b6-css-allows-all-non-ascii-%f0%9f%98%91/ 2.2. Without AMP: https://anrghg.sunsite.fr/test-gp-compat/64-characters-%e2%96%b6-css-allows-all-non-ascii-%f0%9f%98%91/

Steps to reproduce

I wouldn’t suggest adding emoji in a CSS class on purpose, that would be really far-fetched. Where emoji in a CSS class name come into play is when the title contains any, the page slug supports them, and the slug is a class of the body. That is convenient as it makes customizing a page’s CSS more intuitive than using the post ID: https://www.wpbeginner.com/wp-themes/how-to-add-page-slug-in-body-class-of-your-wordpress-themes/

We do not need to edit functions.php when using the plugin https://wordpress.org/plugins/anrghg that comes with options to enhance slugs and add slugs to the body classes.

To test the non-ASCII character issue

  1. Follow the instructions on https://anrghg.sunsite.fr/publishing-helper/features/helpers/#127-the-second-way-rules-are-added-to-custom-css-in-the-theme-customizer-or-in-a
  2. Write a new page title with a leading digit and emoji or other non-ASCII characters in it.
  3. Add custom style rules targeting the page by using the page slug selected in the URL bar, with a prefixed underscore.
  4. Test with AMP and without AMP.

To test the leading-digit issue

Most easily the prefix may be configured to an empty string for test purposes in the setting documented on https://anrghg.sunsite.fr/publishing-helper/features/helpers/#127-slugs-starting-with-a-digit-0-9-or-with-a-hyphen-followed-by-a-digit-0-9-will

Then the test works with the page set up above, since v1.6.13 of the plugin.

Else using WordPress core features:

  1. Use a class with a leading digit in an HTML block;
  2. Style that class in Custom CSS;
  3. The style is in AMP output but does not apply because it is not sanitized.

PHP Version

8.1

Plugin Version

5.3.0

AMP plugin template mode

Standard

WordPress Version

n/a

Site Health

n/a

Gutenberg Version

n/a

OS(s) Affected

n/a

Browser(s) Affected

n/a

Device(s) Affected

n/a

Acceptance Criteria

No response

Implementation Brief

No response

QA Testing Instructions

No response

Demo

No response

Changelog Entry

No response

anrghg commented 2 years ago

The example pages worked yesterday, ,but today as I reverted the emoji in the title to the originally intended 😑, the pages — even newly created for the purpose — started throwing a 404 error.

While leaving the issue open since it is valid, I need to investigate and fix the bug causing the test failure.

anrghg commented 2 years ago

Example pages are up again, the bug is found and fixed. It is not about Unicode characters in slugs but Latin uppercase causing hard-to-predict 404 errors.