WordPress / gutenberg

The Block Editor project for WordPress and beyond. Plugin is available from the official repository.
https://wordpress.org/gutenberg/
Other
10.29k stars 4.11k forks source link

no_texturize_shortcodes WP filter broken with block themes #37754

Open casaschi opened 2 years ago

casaschi commented 2 years ago

Description

Instructed to open a bug report here from https://core.trac.wordpress.org/ticket/54614#comment:13

Hello, my wordpress plugin embed-chessboard uses the code below to avoid texturization of the text within a shortcode. In wordpess 5.9 with the Twenty Twenty-Two theme the filter does not seems to have effect with texturization applied to the text within the shortcode (particularly straight quotes " get changed into opening/closing quotes, breaking things apart).

function embedchessboard_no_texturize( $shortcodes ) {
  $shortcodes[] = 'pgn';
  $shortcodes[] = 'pgn4web';
  return $shortcodes;
}

add_filter( 'no_texturize_shortcodes', 'embedchessboard_no_texturize' );

More technical notes at the original WP bug report https://core.trac.wordpress.org/ticket/54614

Step-by-step reproduction instructions

Use 5.9 RC1.

  1. e4 c5 2. Nf3 e6 3. d4 cxd4 4. Nxd4 Nc6 5. Nb5 d6 6. c4 Nf6 7. N1c3 a6 8. Na3 d5 9. cxd5 exd5 10. exd5 Nb4
  2. Be2 Bc5 12. O-O O-O 13. Bf3 Bf5 14. Bg5 Re8 15. Qd2 b5 16. Rad1 Nd3 17. Nab1 h6 18. Bh4 b4 19. Na4 Bd6
  3. Bg3 Rc8 21. b3 g5 22. Bxd6 Qxd6 23. g3 Nd7 24. Bg2 Qf6 25. a3 a5 26. axb4 axb4 27. Qa2 Bg6 28. d6 g4 29. Qd2 Kg7 30. f3 Qxd6 31. fxg4 Qd4+ 32. Kh1 Nf6 33. Rf4 Ne4 34. Qxd3 Nf2+ 35. Rxf2 Bxd3 36. Rfd2 Qe3 37. Rxd3 Rc1 38. Nb2 Qf2 39. Nd2 Rxd1+ 40. Nxd1 Re1+ 0-1[/pgn]
    
    * Step 6: Publish the post.
    * Step 7: View the post. Notice how the game renders and is able to interact with the pieces.
    * Step 8: Activate the TT2 theme.
    * Step 9: Open the post in another browser tab for comparison. Notice that the interact and rendering are broken.

Screenshots, screen recording, code snippet

with TT1 theme:

test-report-59rc1-browsers

with TT2 theme: Screenshot from 2022-01-06 15-20-19

Environment info

WordPress: 5.9 RC1 (though happened on Beta versions too)

Please confirm that you have searched existing issues in the repo.

Yes

Please confirm that you have tested with all plugins deactivated except Gutenberg.

Plugin activated: Embed Chessboard. Occurs with and without Gutenberg plugin.

hellofromtonya commented 2 years ago

I can reproduce the issue on TT2. Works well in TT1.

Some context from the research and testing I did in the trac ticket (bringing it here)....

Thanks for the update. Yes, I can reproduce in TT2 which is a block theme.

What's known:

Some research:

I asked @casaschi to report it here as I couldn't find anything in Core or TT2 that is causing the issue. I'm wondering if something in the HTML sanitizing or RichText component might be contributing. Though I didn't do a deep dive, I didn't discover how the JS sanitizing is wired to wptexturize.

Mamaduka commented 2 years ago

It looks like wptexturize call in gutenberg_get_the_template_html function (function name in core get_the_block_template_html) is causing the issue.

At this point, the shortcode is already parsed by content block, and no_texturize_shortcodes has no effect.

https://github.com/WordPress/gutenberg/blob/c9425ae1236c0eefe9ba37c94e82aab8ca373338/lib/compat/wordpress-5.9/block-template.php#L217-L218

If I remove the wptexturize call or move it before the do_blocks shortcode, it works as expected.

I think moving it before do_blocks is the safest fix. The post content block texturizes its content, so I assume this call is needed for template markup.

Cc @felixarntz.

casaschi commented 2 years ago

Thanks for debugging this issue.

I tried on my test site with WP 5.9RC and I can confirm that moving wptexturize() before do_blocks() solves the issue. Good catch!

Thanks again.

noisysocks commented 2 years ago

@youknowriad @ockham: Is it harmless to move wptexturize to be before do_blocks?

carlomanf commented 2 years ago

Is it harmless to move wptexturize to be before do_blocks?

The filters the_content and widget_block_content have do_blocks at priority 9 before any other filters, so any blocks with "texturizable" characters could potentially behave differently in a template versus in post content or widget area.

I like the idea of a solution that applies wptexturize, do_shortcode and similar filters to specific blocks that need them (paragraph block, classic block, shortcode block, heading block, etc) rather than applying them across all blocks as a blanket. I am reminded of the approach taken by #28405:

In contrast with the aforementioned pull requests, this builds on the opinion that any block type capable of recursion should be responsible for its own infinite loops. In other words, it isn't up to the block editor framework to detect and prevent block loops.

casaschi commented 2 years ago

Any progress on this? WP 5.9.1 is out but this issue has not been addressed yet.

casaschi commented 2 years ago

WP 5.9.2 out but this issue has not been fixed yet :-(

casaschi commented 2 years ago

WP 5.9.3 did not fix this either.

casaschi commented 2 years ago

WP 6.0 still did not fix this.

kylemilloy commented 2 years ago

Can confirm...still waiting for a patch here.

casaschi commented 2 years ago

This bug has been reported for several months; since then, a couple of WP releases did not address it despite the root cause for the issue being quickly identified within the wptexturize call.

I could easily bypass the issue by manually reversing the unwanted changes affecting my plugin; I have been hesitant doing so because the issue should rather be addressed where it occurrs.

Could anyone from the WP and/or Gutenberg developers teams please advise when this is going to be fixed? If no suitable fix is expected anytime soon I would reluctantly implement an ugly workaround for my plugin rather than advising to avoid block based themes as I'm recommending at the moment to the users of my plugin.

Thanks for the feedback.

carlomanf commented 1 year ago

@casaschi Sorry for the wait and thank you for following up. There is a new patch for this at #44995 if you are interested to test it. No idea when or if it will be released.

RensTillmann commented 1 year ago

Since it's still not patched, I am currently forced to use this due to the nature of how the shortcode uses JSON code inside it's content. WP decided to start replacing straight quotes with smart quotes for some themes, and some others not, no clue. All tests where done with the dedicated shortcode block element provided by WP.

The below just disables it globally, which works for me, nothing else worked. I tried to remove the filter specifically for the content itself, but it didn't work. I guess it won't harm anything, except that "smart" quotes are better from a readability point of view. No idea if disabling it globally has any other impacts.

add_filter( 'run_wptexturize', '__return_false', 9999999 );

casaschi commented 1 year ago

@RensTillmann quite a shame ignoring this issue and leaving the no_texturize_shortcodes filter broken.

I found a different workaround, it works for me but it might not be suitable for everyone: it seems wordpress does not texturize text within certain html tags, for example anything in a PRE tag is not texturized. It works for my plugin to enclose certain text in an otherwise unnecessary PRE tag and make sure no texturization happens.

Hopefully this bug will be fixed soon

eliot-akira commented 1 year ago

This is a larger problem than the original specific issue (shortcode result being texturized with block themes, even though it's added in no_texturize_shortcodes).

The root cause that should be addressed is that there's a whole stack of content filters, including wptexturize and do_shortcode, that is always applied to the entire HTML result after all blocks have been rendered.

It introduces subtle bugs, for example, the one that led me to this GitHub issue discussion (and similar to what @RensTillmann mentioned): some blocks or shortcodes render a JSON string, a piece of HTML, or any other kind of text that must not be modified, which can get corrupted unexpectedly by one of the filters.

The way it currently works, there's no way for block authors to opt-out of these filters; the most they can do is to work around or reverse their effects. A better design would be to not process the resulting HTML of a block by default, except for certain blocks where the_content filter makes sense. As @carlomanf said:

I like the idea of a solution that applies wptexturize, do_shortcode and similar filters to specific blocks that need them (paragraph block, classic block, shortcode block, heading block, etc) rather than applying them across all blocks

I understand that would be a backward-incompatible change, and unlikely to be implemented in an ideal way. In that case, perhaps there can be an "escape hatch" for block authors to wrap their HTML somehow, or an option in register_block_type(), to protect it from getting processed by these unwanted content filters.

asjl commented 2 months ago

Still not fixed in 6.5.4 using the TwentyTwentyThree theme.

I'm hitting a problem with ABC music data in a shortcode that makes use of a