humanmade / WordPress-Importer

In-development rewrite of the WordPress (WXR) Importer

Prevent backslashes from being stripped on post insertion #172

Closed gtuser10 closed 1 year ago

gtuser10 commented 4 years ago

In content created with the Gutenberg editor, some characters in block attributes are encoded as Unicode escape sequences that start with a backslash, e.g.

<!-- wp:plugin/custom-block {"data":"Some text and \u003ca href=\u0022https://example.com/\u0022\u003esome-link\u003c/a\u003e."} /-->

Before inserting into the database, wp_insert_post() runs the post data through wp_unslash(), which strips the backslashes. This breaks the Unicode escape sequences (and with them the content). To prevent that, we should run the post data through wp_slash() before passing it to wp_insert_post(). The original WordPress Importer does this.
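To illustrate the mechanism, here is a minimal sketch in plain PHP that runs without WordPress. It assumes only string values are involved, where wp_unslash() and wp_slash() come down to PHP's stripslashes() and addslashes(), so those are called directly. The variable names are illustrative, not taken from the importer.

```php
<?php
// Block markup as Gutenberg serializes it; single quotes keep the backslashes literal.
$content = '<!-- wp:plugin/custom-block {"data":"\u003ca href=\u0022https://example.com/\u0022\u003e"} /-->';

// Unslashing un-slashed content drops every lone backslash: \u003c becomes u003c.
$broken = stripslashes( $content );

// Slashing first turns the later unslash into a lossless round trip.
$preserved = stripslashes( addslashes( $content ) );

var_dump( strpos( $broken, '\\' ) === false ); // bool(true): no backslashes survive
var_dump( $preserved === $content );           // bool(true): content is unchanged
```

The second case is what slashing the data before wp_insert_post() achieves: the unslash inside the function only removes the escaping that was just added.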

JiveDig commented 3 years ago

I've been having failing imports on some themes lately and finally tracked it down to this same issue. I can confirm that adding the following fixed it for me.

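add_filter( 'wp_import_post_data_processed', function( $postdata ) {
    // Re-slash the data so the wp_unslash() call in wp_insert_post() keeps the backslashes.
    // Only one parameter is declared because add_filter() passes one argument by default.
    return wp_slash( $postdata );
});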
JiveDig commented 3 years ago

Actually, this seems to be a duplicate of https://github.com/humanmade/WordPress-Importer/pull/161

rmccue commented 1 year ago

Thanks! Closing in favour of #174 for a best practice approach.