WordPress / gutenberg

The Block Editor project for WordPress and beyond. Plugin is available from the official repository.
https://wordpress.org/gutenberg/

The `&` symbol becomes `amp` in the draft URL #62543

Open marcarmengou opened 1 month ago

marcarmengou commented 1 month ago

Description

When you type Ampersand & And as the title of a Post and save the Post as a draft, the URL becomes /ampersand-amp-and/. If you then publish the Post, the URL becomes /ampersand-and/.

I was originally going to write this issue asking if it was necessary to add amp to the URL, because from an SEO point of view it makes no sense and just unnecessarily lengthens the URL. However, during initial testing, when publishing the Post, the URL became /ampersand-and/.

Now I wonder whether this may confuse other people, whether it is a bug, and whether it makes any sense for amp to appear in the URL, even if only in drafts. The same does not happen with %, $, #, @, or other symbols. For example, if you write Ampersand & And % One # Two $ Three @ Four, the URL of the draft will be /ampersand-amp-and-one-two-three-four/.

Step-by-step reproduction instructions

  1. Go to Settings > Permalinks and set Post name as a permalink structure.
  2. Create a Post and type Ampersand & And as the Post title.
  3. Save the Post as a Draft.
  4. In the editor sidebar click on the URL to view the generated URL. It is this: /ampersand-amp-and/
  5. Publish the Post.
  6. Check the URL of the Post. It is this: /ampersand-and/

Screenshots, screen recording, code snippet

[Screenshot: ampersand-and]

Environment info

No response

Please confirm that you have searched existing issues in the repo.

Yes

Please confirm that you have tested with all plugins deactivated except Gutenberg.

Yes

Soean commented 1 month ago

If I call wp.url.cleanForSlug('Ampersand & And'), it returns the correct value ampersand-and. But somewhere it is missing.

t-hamano commented 1 month ago

I'm not an expert on this kind of processing, but this slug is generated by the getEditedPostSlug selector:

https://github.com/WordPress/gutenberg/blob/ee30a87abd733d00338730eed831cfc02d27edf0/packages/editor/src/store/selectors.js#L977-L983

During this process, the string obtained by getEditedPostAttribute( state, 'title' ) seems to have its HTML entities encoded.

Therefore, this can be resolved by making the following changes, but I'm not sure how this will affect other processing.

diff --git a/packages/editor/src/store/selectors.js b/packages/editor/src/store/selectors.js
index 8b0dfd4b37..f7a990bd8b 100644
--- a/packages/editor/src/store/selectors.js
+++ b/packages/editor/src/store/selectors.js
@@ -16,6 +16,7 @@ import { layout } from '@wordpress/icons';
 import { store as blockEditorStore } from '@wordpress/block-editor';
 import { store as coreStore } from '@wordpress/core-data';
 import { store as preferencesStore } from '@wordpress/preferences';
+import { decodeEntities } from '@wordpress/html-entities';

 /**
  * Internal dependencies
@@ -977,7 +978,9 @@ export function getPermalink( state ) {
 export function getEditedPostSlug( state ) {
        return (
                getEditedPostAttribute( state, 'slug' ) ||
-               cleanForSlug( getEditedPostAttribute( state, 'title' ) ) ||
+               cleanForSlug(
+                       decodeEntities( getEditedPostAttribute( state, 'title' ) )
+               ) ||
                getCurrentPostId( state )
        );
 }
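
To illustrate why decoding first would change the result, here is a minimal, self-contained sketch. It assumes the edited title reaches the selector as `Ampersand &amp; And`, as the observation above suggests. `simpleCleanForSlug` and `simpleDecodeEntities` are hypothetical, simplified stand-ins written for this example; they are not the real `cleanForSlug` from `@wordpress/url` or `decodeEntities` from `@wordpress/html-entities`, which handle many more cases.

```javascript
// Hypothetical, simplified stand-in for cleanForSlug (assumption: the real
// function also handles accents and a wider range of punctuation).
function simpleCleanForSlug( text ) {
	return text
		.toLowerCase()
		.replace( /[^a-z0-9\s-]/g, '' ) // drop punctuation such as "&" and ";"
		.trim()
		.replace( /[\s-]+/g, '-' ); // collapse whitespace/hyphen runs into one hyphen
}

// Hypothetical, simplified stand-in for decodeEntities (assumption: only a
// handful of named entities are handled here).
function simpleDecodeEntities( text ) {
	const named = { '&amp;': '&', '&lt;': '<', '&gt;': '>', '&quot;': '"' };
	return text.replace( /&(amp|lt|gt|quot);/g, ( match ) => named[ match ] );
}

// Assumed shape of the title as seen by the selector: entity-encoded.
const encodedTitle = 'Ampersand &amp; And';

// Without decoding, "&" and ";" are stripped but the letters "amp" survive.
const draftSlug = simpleCleanForSlug( encodedTitle );

// With decoding first, the bare "&" is stripped and nothing is left behind.
const fixedSlug = simpleCleanForSlug( simpleDecodeEntities( encodedTitle ) );

console.log( draftSlug ); // ampersand-amp-and
console.log( fixedSlug ); // ampersand-and
```

This would also be consistent with `%`, `$`, `#`, and `@` not leaving residue in the slug, assuming they reach the selector as literal characters rather than as named entities.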