Open pjeby opened 5 years ago
I just tried the image URL you provided. Indeed '&' is encoded as '\&' when the URL is saved as GUID in database. But seems that the image can still be correctly displayed when the media is inserted in post, even with Elementor.
Can you provide some more detailed steps for me to reproduce the issue?
That's odd - try this URL instead: https://pbs.twimg.com/media/D_NKa3yWkAYwZwn?name=900x900&format=png -- I just noticed that for some reason the unsplash one is now working without the filter, but this one is not. I wonder if unsplash deliberately changed their URL handling to support miscoded URLs?
You are right. The URL from pbs.twimg.com does not work. It is possible that the web server of the external image deliberately handles encoded URLs. WordPress plugins should not rely on that.
Also, I just figured out why the unsplash one is "working" -- it is simply ignoring all the parameters that begin with amp;
and displaying the default full-size image. So it only appears to work.
After some investigation, I found that &
is not converted to &
by htmlentities
, i.e. the opposite of html_entity_decode
. Instead, it is converted by wp_filter_kses
which is a WordPress function acting as a pre_post_guid
filter to sanitize post GUIDs on both save and display.
wp_filter_kses
eventually calls wp_kses_normalize_entities
, which converts AT&T
to AT&T
, :
to :
, &#XYZZY;
to &#XYZZY;
and so on.
Therefore, I think using html_entity_decode
to fix the issue might introduce other problems, as it is not the exact opposite of wp_kses_normalize_entities
.
The post GUID sanitization was introduced in a commit in 2011. It is so long ago that seems that no one knows the reason why it is needed, even if I've asked the WordPress core team in their Slack channel.
I think a better solution might be to remove the filter before external media links are added and resume it after the links are added.
But that means the links already added before the fix are applied need to be removed and added again.
There is a reason -- sort of -- that it's needed, or at least that gets it used. Post GUIDs are output in the RSS feeds without any escaping, but the & signs need to be escaped for proper XML. So, it is actually correct, in that sense, sort of, even though really the escaping should be done by the RSS template, not in the database. (But it's way too late to fix that in WP, of course.)
Rather than remove the filter and replace it, though, it may be simpler to just temporarily add a high-priority filter that returns the desired value, since then it doesn't matter if another plugin is messing with those filters, too.
(But that's just me thinking out loud because I need to change Imposer and Postmark to deal with this issue as well. Right now I haven't used any GUIDs with &
in them for those tools, but it will become a major issue for anyone who does.)
I see.
I did some experiment and inspected the core source code, and found that in fact WordPress core does convert &
to &
while exporting RSS2 feed, even if I changed the &
back to &
via MySQL client. The conversion is done by function wptexturize
in wp-includes/formatting.php
. The function is added as the default the_content
filter to be called on RSS2 export. So seems that removing the wp_kses_normalize_entities
filter will be fine.
But I haven't tried other feed formats yet. And I think your consideration makes sense. It is safer not to remove the filter added by core.
As far as I know, if the URL of an external media contains &
and the media is added by External Media without Import, the only places where the media cannot be correctly displayed are:
I haven't found other places where the media can't be correctly displayed.
Yes, the media can be correctly displayed when the post is published and viewed by non-admin users.
I shall try to find a solution which does not remove the default filters and makes the media correctly displayed in the Media Library page and the post editor page.
The place I discovered it didn't work was as a background image in Elementor, as entities are not valid in CSS, and the URL is used in Elementor-generated CSS.
For Elementor I've ended up using a different solution for inserting external images, and I now use this bit of code in Imposer and Postmark to detect altered GUIDs and force their db content.
So if you decide not to support such use cases, it won't actually affect me at this point, as I ended up using my "paste URL in the search box" hack as my way of using external images with Elementor, rather than this plugin. Thanks for investigating anyway, and for inspiring said URL-paste hack, which I wouldn't have known how to write without studying your code.
Because of the way post GUIDs are escaped by WordPress, an image URL like https://images.unsplash.com/photo-1491895200222-0fc4a4c35e18?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=2767&q=80 gets turned into https://images.unsplash.com/photo-1491895200222-0fc4a4c35e18?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=2767&q=80 in the GUID, which leads to the wrong image URL being used by the media display and other functions (e.g. Elementor).
For my own purposes, I'm working around this with:
But a more robust version might look like:
Along with changing the current
get_attached_file
filter to decode theguid
before returning it.