esrahofstede / google-gdata

Automatically exported from code.google.com/p/google-gdata

Blogger content appears to be HtmlDecoded twice? #399

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I'm pulling down the content from my blog using the .NET API. I have content 
in the blog that looks like this:
<span class="kwrd">&lt;</span>

When I look at AtomContent.Content, I see:
<span class="kwrd"><</span>

I put a breakpoint where the content is set in AtomFeedParser's 
ParseContent:
content.Content = Utilities.DecodedValue(reader.ReadString());

The result from ReadString already matches the content I see in the editor 
(it still contains &lt;). So the Utilities.DecodedValue call isn't needed?

Thanks,
Russ

Original issue reported on code.google.com by russell....@gmail.com on 30 Jun 2010 at 12:21

GoogleCodeExporter commented 9 years ago
Some more information...

The change was introduced in r224, where, among other things, a lot of values 
in atomfeedparser.cs got wrapped with Utilities.DecodedValue. The text is 
already decoded once via ReadString(), so it should not need to be decoded again.

In Blogger's case, the offending code is at 
http://code.google.com/p/google-gdata/source/browse/trunk/clients/cs/src/core/atomfeedparser.cs#1135. 
It might also be worth looking into the other instances of 
Utilities.DecodedValue(reader.ReadString()), as I imagine the same problem of 
double decoding exists there.
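
For anyone who wants to reproduce the effect outside the library, here is a 
minimal sketch of the double decode. It is not the library's code: the Atom 
fragment is a made-up sample, the class and method names are hypothetical, and 
System.Net.WebUtility.HtmlDecode is used as a stand-in on the assumption that 
Utilities.DecodedValue performs an HTML decode.

```csharp
using System;
using System.IO;
using System.Net;
using System.Xml;

public static class DoubleDecodeDemo
{
    // Hypothetical sample: the blog content <span class="kwrd">&lt;</span>
    // as it would be XML-escaped inside an Atom <content> element.
    public const string Entry =
        "<content type=\"html\">&lt;span class=\"kwrd\"&gt;&amp;lt;&lt;/span&gt;</content>";

    // One decode: the XML reader resolves the entity references itself.
    public static string ReadOnce(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.ReadToFollowing("content");
            return reader.ReadString();
        }
    }

    // Two decodes: an HTML decode applied on top of the reader's output
    // (assumed to mirror Utilities.DecodedValue(reader.ReadString())).
    public static string ReadTwice(string xml)
    {
        return WebUtility.HtmlDecode(ReadOnce(xml));
    }

    public static void Main()
    {
        Console.WriteLine(ReadOnce(Entry));  // <span class="kwrd">&lt;</span>
        Console.WriteLine(ReadTwice(Entry)); // <span class="kwrd"><</span>
    }
}
```

The first line of output matches what the blog editor shows; the second matches 
what the original report sees in AtomContent.Content.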

Original comment by ron...@gmail.com on 23 Nov 2010 at 6:40

GoogleCodeExporter commented 9 years ago
Issue 471 has been merged into this issue.

Original comment by ccherub...@google.com on 30 Jan 2011 at 8:31

GoogleCodeExporter commented 9 years ago
Is there any way to extract unencoded content? 

I'm trying to display HTML from a blog but it is always encoded.

Original comment by amdonnelly@gmail.com on 28 Feb 2011 at 8:35