WordPress / gutenberg

The Block Editor project for WordPress and beyond. Plugin is available from the official repository.
https://wordpress.org/gutenberg/

&nbsp; in Post Title is invisible and can break content #66653

Open MadtownLems opened 1 month ago

MadtownLems commented 1 month ago

Description

One of our content providers just reported that their long post title wasn't wrapping lines on mobile. Looking at it, we saw &nbsp; between each word of the title. She had unknowingly pasted in something like "My&nbsp;Post&nbsp;Title". As a result, the title wouldn't wrap (among other problems).

I'm not sure how this should best be handled, but the current behavior is very confusing for users: there was really no way for her to know that the problem existed. Perhaps the editor should render the entity in the title?
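To illustrate the idea of rendering the entity in the title, here is a minimal TypeScript sketch. The `revealNbsp` helper is hypothetical and not part of Gutenberg. It only shows how a non-breaking space (U+00A0) in a title string could be made visible to an author.

```typescript
// Hypothetical helper (not a Gutenberg API): replaces each non-breaking
// space (U+00A0) with the literal text "&nbsp;" so an author can see it.
const NBSP = "\u00A0";

function revealNbsp(title: string): string {
  // split/join replaces every occurrence without needing a regex.
  return title.split(NBSP).join("&nbsp;");
}

// A title as it might look after pasting text that contains non-breaking spaces.
const pastedTitle = `My${NBSP}Post${NBSP}Title`;

console.log(revealNbsp(pastedTitle)); // My&nbsp;Post&nbsp;Title
```

Titles that contain only regular spaces pass through unchanged, so a display layer could apply this unconditionally.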

Step-by-step reproduction instructions

  1. Copy My&nbsp;Post&nbsp;Title and paste it into a Post Title
  2. Note how &nbsp; is invisible in the editor, even if you switch to code view
  3. Publish the post, and view it. Notice how it looks like normal spaces.
  4. View the source and see how it actually contains &nbsp; in between each word.
  5. Edit the post, and see that you still cannot tell &nbsp; is there
  6. Go to your All Posts screen, and see that you STILL can't tell that &nbsp; is there.

Screenshots, screen recording, code snippet

No response

Environment info

WordPress 6.7-RC-2, no Gutenberg Plugin. (Also confirmed on 6.6.2, no Gutenberg)

im3dabasia commented 2 weeks ago

I was able to reproduce this issue, and it only occurs in the title (Heading) block.

If we paste 'My&nbsp;Post&nbsp;Title' into the Paragraph, List, or Button blocks (I tested these three), it pastes as 'My&nbsp;Post&nbsp;Title' and keeps the &nbsp; entities in both the editor and frontend.

I also tried the string 'My&nbsp;Post&nbsp;Title &copy; &euro; &reg;', which contains other HTML entities, to check how it behaves on the frontend. The final output is attached below. Apart from &nbsp;, the other HTML entities were converted to their respective symbols.

im3dabasia commented 2 weeks ago

@MadtownLems, after debugging, I found that the title converts HTML entities when pasted into the editor, but what gets stored in the database is a non-breaking space character (U+00A0, which is \xc2\xa0 in UTF-8). This character appears to be causing the issue, since it is treated differently from a regular space.
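As a quick check of the byte sequence mentioned above, the following TypeScript sketch shows that U+00A0 percent-encodes to the UTF-8 bytes C2 A0, while a regular space encodes to 20. The `countNbsp` helper is a hypothetical name used only for illustration.

```typescript
// encodeURIComponent percent-encodes a string using its UTF-8 bytes,
// which makes the otherwise invisible difference easy to see.
const nbspEncoded = encodeURIComponent("\u00A0"); // "%C2%A0"
const spaceEncoded = encodeURIComponent(" ");     // "%20"

// Hypothetical helper: counts non-breaking spaces in a title.
function countNbsp(title: string): number {
  return title.split("\u00A0").length - 1;
}

console.log(nbspEncoded, spaceEncoded);
console.log(countNbsp("My\u00A0Post\u00A0Title")); // 2
```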

A potential fix could be to save the title in the database without the \xc2\xa0, or to handle it when rendering the frontend output.

I tried multiple methods to handle this, including strip_tags, html_entity_decode, htmlspecialchars_decode, mb_convert_encoding, and str_replace.

Among these, the following worked: $title = str_replace( "\xc2\xa0", ' ', $title );

However, I don't believe this is the best long-term solution for the issue.
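For comparison, the same replacement as the str_replace call above could be sketched on the editor side in TypeScript. This is only an illustration under the assumption that normalizing to regular spaces is acceptable. `normalizeTitleSpaces` is a hypothetical name, not an existing Gutenberg or WordPress function.

```typescript
// Hypothetical helper: replaces every non-breaking space (U+00A0)
// with a regular space, mirroring the PHP str_replace workaround.
function normalizeTitleSpaces(title: string): string {
  return title.replace(/\u00A0/g, " ");
}

console.log(normalizeTitleSpaces("My\u00A0Post\u00A0Title")); // My Post Title
```

Note that this also removes non-breaking spaces an author inserted on purpose, which is one reason such a blanket replacement may not be the right long-term approach.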

Thank you. Hopefully this provides some insight into the issue.