joomla / joomla-cms

Home of the Joomla! Content Management System
https://www.joomla.org
GNU General Public License v2.0

Facebook fails to retrieve Joomla 4 images, if the filename is not ASCII #40222

Open pkExec opened 1 year ago

pkExec commented 1 year ago
add an image to an article. share the article on facebook. facebook uses the image.

your problem is?

Originally posted by @brianteeman in https://github.com/joomla/joomla-cms/issues/35871#issuecomment-950396441

Actually that's the case only for "normal" filenames.

However, I just noticed that if the filename contains non-ASCII characters, for example the Greek "εικόνα1.jpg", then even though the image displays properly on the site, Facebook fails to fetch it and returns net::ERR_HTTP2_PROTOCOL_ERROR 200.

Tested via https://developers.facebook.com/tools/debug and confirmed the behaviour for Greek characters. An example of the encoded URL that Facebook chokes on (site name changed to "samplesite.gr" for privacy reasons):

[h]ttps://z-p3-external.fath7-1.fna.fbcdn.net/emg1/v/t13/14286829027119499094?url=https%3A%2F%2Fsamplesite.gr%2Fimages%2Farticles%2Flogos%2F%CE%BB%CE%BF%CE%B3%CF%8C%CF%84%CF%85%CF%80%CE%BF_8317c.png%23joomlaImage%3A%2F%2Flocal-images%2Farticles%2Flogos%2F%CE%BB%CE%BF%CE%B3%CF%8C%CF%84%CF%85%CF%80%CE%BF_8317c.png%3Fwidth%3D885%26height%3D233&fb_obo=1&utld=samplesite.gr

Original url:

[h]ttps://samplesite.gr/images/articles/logos/λογότυπο_8317c.png#joomlaImage://local-images/articles/logos/λογότυπο_8317c.png?width=885&height=233

I'm not sure if this is a Joomla issue (meaning Joomla should encode non-ASCII filenames) or a Facebook crawler issue.

PS: the [h] in the url is on purpose, to avoid linking.
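To make the two URLs above easier to compare, here is a minimal Python sketch that decodes the `url` parameter of the fbcdn link and splits the result. It only uses the standard library; the helper name `inspect_proxied_image` is made up for illustration, and the `[h]` has been restored to `h` so the URL parses. It suggests that the `#joomlaImage://...` part travels to Facebook as a URL fragment, with `?width=...&height=...` inside that fragment rather than as a query string.

```python
from urllib.parse import parse_qs, urlsplit

# URL reported by the Facebook debugger, copied from the issue
# (the leading "[h]" is written as "h" here so it can be parsed).
FB_PROXY_URL = (
    "https://z-p3-external.fath7-1.fna.fbcdn.net/emg1/v/t13/14286829027119499094"
    "?url=https%3A%2F%2Fsamplesite.gr%2Fimages%2Farticles%2Flogos%2F"
    "%CE%BB%CE%BF%CE%B3%CF%8C%CF%84%CF%85%CF%80%CE%BF_8317c.png"
    "%23joomlaImage%3A%2F%2Flocal-images%2Farticles%2Flogos%2F"
    "%CE%BB%CE%BF%CE%B3%CF%8C%CF%84%CF%85%CF%80%CE%BF_8317c.png"
    "%3Fwidth%3D885%26height%3D233&fb_obo=1&utld=samplesite.gr"
)


def inspect_proxied_image(proxy_url):
    """Return (path, fragment) of the image URL carried in the `url` parameter.

    Hypothetical helper for illustration only.
    """
    query = urlsplit(proxy_url).query
    image_url = parse_qs(query)["url"][0]  # parse_qs percent-decodes the value
    parts = urlsplit(image_url)
    return parts.path, parts.fragment


path, fragment = inspect_proxied_image(FB_PROXY_URL)
print("path:    ", path)      # the non-ASCII file path under /images/
print("fragment:", fragment)  # the joomlaImage:// metadata, including width/height
```

Both the path and the fragment contain the Greek filename, so this alone does not tell which of the two the crawler rejects.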

brianteeman commented 1 year ago

Facebook does not accept filenames that are not ASCII:

https://developers.facebook.com/support/bugs/1398955717176696/?join_id=f13cfa4404439d4

Please note that the reported ASCII character is not in the IETF's list of supported characters for URLs. Hence, the behaviour is by design. I am sorry for any disappointment this may cause. I do understand how important this is for you, and I made sure to share your feedback with the team as well.

We will keep tracking the issue internally for a possible fix in the future. I hope this unblocks you. We are always looking for ways to do better and I will really appreciate if you could share your feedback via our survey!

pkExec commented 1 year ago

@brianteeman, however, Facebook actually does support non-ASCII filenames: if you run the same test with a Joomla 3 article, which doesn't have metadata encoded in the image path, it works fine.
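If the appended metadata is indeed what trips the crawler, as the Joomla 3 comparison suggests, one possible workaround is to clean the image URL before it is exposed to Facebook. The Python sketch below illustrates the idea only: `facebook_safe_image_url` is a hypothetical helper, not a Joomla API, and it has not been tested against Facebook's crawler. It drops the `#joomlaImage://...` fragment and percent-encodes the path, assuming the host name itself is ASCII.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def facebook_safe_image_url(url):
    """Drop the '#joomlaImage://...' fragment and percent-encode the path.

    Hypothetical helper, not part of Joomla; assumes an ASCII host name.
    """
    parts = urlsplit(url)
    # Keep "/" and existing "%" escapes so already-encoded paths are not double-encoded.
    encoded_path = quote(parts.path, safe="/%")
    # An empty string as the last element omits the fragment entirely.
    return urlunsplit((parts.scheme, parts.netloc, encoded_path, parts.query, ""))


original = (
    "https://samplesite.gr/images/articles/logos/λογότυπο_8317c.png"
    "#joomlaImage://local-images/articles/logos/λογότυπο_8317c.png?width=885&height=233"
)
cleaned = facebook_safe_image_url(original)
print(cleaned)  # ASCII-only URL with no fragment
```

Feeding the cleaned URL to the Facebook debugger would show whether the fragment, the raw non-ASCII path, or both are the problem.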