midospan / oohembed

Automatically exported from code.google.com/p/oohembed

Wordpress responses are incorrectly escaped #4

GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
It appears that oohembed's WordPress.com responses are double-escaped (or
something similar). Go to http://identi.ca/notice/7446621 and notice that the
title is wrong: the HTML entities are displayed literally instead of as the
characters in "401 – What’s On Earth Tonight? « Strange Maps". The JSON
response at
http://www.oohembed.com/oohembed/?url=http://strangemaps.wordpress.com/2009/07/23/401-whats-on-earth-tonight/
has the same incorrect title.

Original issue reported on code.google.com by candrews...@gmail.com on 3 Aug 2009 at 9:38

GoogleCodeExporter commented 9 years ago
BTW - here's the laconica ticket that caused me to discover this problem:
http://laconi.ca/trac/ticket/1761

Original comment by candrews...@gmail.com on 3 Aug 2009 at 9:51

GoogleCodeExporter commented 9 years ago
I don't think this is an oohembed bug. The original page uses HTML entities in
the title element. The oembed response from oohembed carries those HTML
entities as-is.

I think the problem on the identi.ca page is that you aren't rendering those
entities correctly. See this for what I mean:
http://blog.rebeccamurphey.com/2007/12/19/html-entities-from-ajax-into-input-fields-using-jquery/

Original comment by deepak.s...@gmail.com on 20 Aug 2009 at 2:24

GoogleCodeExporter commented 9 years ago
I disagree with your assessment.

According to the spec at http://www.oembed.com the title is text, not HTML, so
it should not have HTML entities in it.

I believe the proper, specification-compliant solution is for the provider
(oohembed) to send only text, so in this case you should be converting those
HTML entities to regular characters in oohembed.

Original comment by candrews...@gmail.com on 20 Aug 2009 at 3:40

GoogleCodeExporter commented 9 years ago
>> the title is text - not HTML, so it should not have HTML entities in it.

So outside the context of HTML, who's to say that the escaped form of "–" is
an HTML entity and not just funny-looking text? Why are you assuming that
oembed responses will be used exclusively in the context of rendering HTML
documents?

In any case, the HTML spec itself says that TITLE should be text but entities
are allowed.

Further, how do you propose handling cases like this? Should I just strip the
entities out? That may actually leave the title meaningless. Or should I
maintain some kind of conversion table and try converting every entity
encountered into suitable Unicode?

I still think you should just take the suggestion in the link I provided in the
previous comment. If you want to discuss this further, please bring it up in the
oembed list. 

Original comment by deepak.s...@gmail.com on 5 Sep 2009 at 4:02
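
For reference, the "conversion table" question raised in the last comment can be sketched with Python's standard library, which already ships such a table. This is only an illustration, not oohembed's code: `plain_text_title` is a hypothetical helper, the payload fields are a minimal oEmbed-style example, and the entity forms in `escaped` are assumed rather than copied from the actual response. `html.unescape` requires Python 3.4 or later.

```python
import html
import json


def plain_text_title(raw_title: str) -> str:
    """Decode HTML character references into plain Unicode text.

    html.unescape handles both numeric references (&#8211;) and named
    ones (&laquo;) using the standard library's built-in entity table,
    and it decodes only one level of escaping per call.
    """
    return html.unescape(raw_title)


# Assumed example of a title as it might appear in a page's <title> element.
escaped = "401 &#8211; What&#8217;s On Earth Tonight? &laquo; Strange Maps"

# Hypothetical oEmbed-style payload with the title sent as plain text.
response = {
    "type": "link",
    "version": "1.0",
    "title": plain_text_title(escaped),
}

# ensure_ascii=False keeps the decoded characters readable in the JSON output.
print(json.dumps(response, ensure_ascii=False))
```

The same call works on the consumer side, so whichever party treats the title as plain text can apply it before display. A consumer rendering into HTML would still need to escape the decoded text again for its own output.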