WordPress / gutenberg

The Block Editor project for WordPress and beyond. Plugin is available from the official repository.
https://wordpress.org/gutenberg/

Inline HTML elements containing only whitespace are stripped from pasted content #50898

Open: gregsullivan opened this issue 1 year ago

gregsullivan commented 1 year ago

Description

When pasting content into the block editor, elements containing only whitespace are removed. For example, the following pasted line:

The following space:<em> </em>will be <em>removed</em>.

is modified as follows on paste:

The following space:will be <em>removed</em>.

Stripping the tags makes sense to me, but I believe the whitespace should be retained.

I am encountering this on a magazine website where the content is pasted from Microsoft Word. The document is full of <i> elements containing only spaces, remnants of the editing process. These are removed on paste, merging words throughout the document.
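To make the requested behavior concrete, here is a minimal sketch of what "strip the tags but keep the whitespace" could look like. It is not Gutenberg's paste-handling code: the function name `unwrapWhitespaceOnlyInline` and the `INLINE_TAGS` list are hypothetical, and it works on plain strings with a regular expression rather than on a parsed DOM, so treat it as an illustration of the expected output only.

```javascript
// Hypothetical helper, NOT part of Gutenberg: replaces inline elements whose
// content is only whitespace with that whitespace, dropping just the tags.
// Assumption: this tag list is illustrative, not the editor's actual list.
const INLINE_TAGS = [ 'em', 'i', 'strong', 'b', 'span', 'u' ];

function unwrapWhitespaceOnlyInline( html ) {
	const tags = INLINE_TAGS.join( '|' );
	// Matches <tag ...>whitespace</tag>, where the closing tag name must
	// equal the opening tag name (backreference \1).
	const pattern = new RegExp(
		`<(${ tags })(?:\\s[^>]*)?>((?:\\s|&nbsp;)+)</\\1>`,
		'gi'
	);
	let previous;
	let current = html;
	// Repeat until nothing changes so nested cases like <b><i> </i></b>
	// are unwrapped from the inside out.
	do {
		previous = current;
		current = current.replace( pattern, '$2' );
	} while ( current !== previous );
	return current;
}

console.log(
	unwrapWhitespaceOnlyInline(
		'The following space:<em> </em>will be <em>removed</em>.'
	)
);
// → The following space: will be <em>removed</em>.
```

Elements that contain real text, such as `<em>removed</em>`, are left untouched. Only the whitespace-only wrappers are dropped, which is what keeps the words in the Word documents from merging.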

Step-by-step reproduction instructions

  1. Go to the block editor.
  2. Paste the following: The following space:<em> </em>will be <em>removed</em>.
  3. Note that the space was removed.

Screenshots, screen recording, code snippet

No response

Environment info

Please confirm that you have searched existing issues in the repo.

Yes

Please confirm that you have tested with all plugins deactivated except Gutenberg.

Yes

gregsullivan commented 1 year ago

You may need to paste the example sentence into a text editor and then into Gutenberg to ensure the HTML tags are parsed and not simply displayed.

kathrynwp commented 1 year ago

Hey there @gregsullivan - would it be feasible to clean up your Word documents and strip out the empty tags before pasting them into the editor?

gregsullivan commented 1 year ago

Hello @kathrynwp!

It would be feasible for me to clean up my client's Word documents, but not feasible for my client without adding meaningful time to the process of publishing articles on their website. They will be maintaining the site themselves beginning in July.

I don't expect this to be fixed by then, so I'm planning to write up instructions for how to paste from Microsoft Word to WordPress, covering:

They are non-technical users, so this is asking a fair bit of them.

ellatrix commented 1 year ago

Hi! Could you share what is logged in the console when you paste the content? https://balsamiq.com/support/faqs/browserconsole/

gregsullivan commented 1 year ago

Hi Ella! Here you go:

From use-paste-handler.js:103:

Received HTML:

From use-paste-handler.js:104:

Received plain text:

 The following space:<em> </em>will be <em>removed</em>.

From paste-handler.js:64:

Processed inline HTML:

 The following space:will be <em>removed</em>.

My environment has updated to:

Thanks for looking at this, and for all your work on WordPress 6.3 (and beyond)!