charu861991 / google-gdata

Automatically exported from code.google.com/p/google-gdata

blog content encoded improperly. AtomEntry.Content.Content #232

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Manually HTML-encode an XSLT file and post it to a blog (e.g.
http://prajnodotcom.blogspot.com/2009/04/creating-dynamic-controls-with-xmlxslt.html)
2. Retrieve the content through the API: in foreach (AtomEntry blogEntry in
blogAtom.Entries), blogEntry.Content.Content is encoded incorrectly,
so the XSLT information doesn't display properly.

What is the expected output? What do you see instead?
I see the post properly in blogspot.  However, retrieving the content
through the API doesn't encode the content properly.

What version of the product are you using? On what operating system?
Windows XP SP2.  The newest API -         Google Data API Setup(1.4.0.2).msi     

Please provide any additional information below.

Original issue reported on code.google.com by pra...@gmail.com on 6 Apr 2009 at 12:16

GoogleCodeExporter commented 8 years ago
Not sure if you are describing a .NET specific issue here. Is this already 
encoded incorrectly on the wire? 

Frank

Original comment by fman...@gmail.com on 14 Apr 2009 at 1:18

GoogleCodeExporter commented 8 years ago
The blog content is encoded properly and displays properly in blogspot. When I
pull the same blog through the API, it isn't encoded properly/fully, so I cannot
wrap the content with a Server.HtmlEncode statement.

The blog content on blogspot is here:
http://prajnodotcom.blogspot.com/2009/04/creating-dynamic-controls-with-xmlxslt.html

Since the blog is public, you can see how it gets pulled up through the API.
The content gets improperly pulled; this can be found on my site:
http://prajno.com/aspdotnet/Blog.aspx (scroll down to the same post title).
All other blog posts seem to pull up fine.

Original comment by pra...@gmail.com on 14 Apr 2009 at 4:18

GoogleCodeExporter commented 8 years ago
That's not what I mean. What I was wondering is whether Blogger chooses to decode
whatever you send it into whatever they want to store. And then, when you ask for
the content using the API, you get that. Meaning, this could be a service issue or
a client library issue.

But I looked at your site etc., and I think I know what the problem is. Whenever
you post something with TAGS (xsl, xml etc.) to Blogger, Blogger goes ahead and
entity-encodes them, so it creates &lt; etc.

When .NET code reads this during XML processing, .NET automatically converts
that back. I know, not what you want here, but quite useful in a lot of other
scenarios.

So you pretty much need to re-encode the content before you paste it into the
HTML page. I had a similar report in the past, and, to my knowledge, there is
nothing I can do here to fix that in a way that would not screw other people up.
Example: you post something that says: This & that. Blogger stores this as
This &amp; that. If you read it back later, what do you want?

Rephrased: in this scenario, you need to be safe and encode before embedding.

Frank

Original comment by fman...@gmail.com on 14 Apr 2009 at 4:42
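
The decode-then-re-encode behavior described in the comment above can be sketched in a few lines. This is a minimal illustration, not code from the client library: the sample payload and the EntityRoundTrip class are made up for the example, and WebUtility.HtmlEncode (System.Net, .NET 4.0 and later) stands in for Server.HtmlEncode.

```csharp
using System;
using System.IO;
using System.Net;
using System.Xml;

// Hypothetical helper class, for illustration only.
public static class EntityRoundTrip
{
    // Reads the text of the first <content> element. Any conforming XML
    // parser resolves entity references such as &amp; while doing so.
    public static string ReadContent(string xml)
    {
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.ReadToFollowing("content");
            return reader.ReadElementContentAsString();
        }
    }

    public static void Main()
    {
        // Assumed payload: the text "This & that" as it has to appear inside XML.
        string xml = "<entry><content type=\"text\">This &amp; that</content></entry>";

        string decoded = ReadContent(xml);
        Console.WriteLine(decoded);                        // prints: This & that

        // Re-encode before embedding the value in an HTML page.
        Console.WriteLine(WebUtility.HtmlEncode(decoded)); // prints: This &amp; that
    }
}
```

The parser has no way to know whether the caller wanted the escaped or the unescaped form, which is why the re-encoding step is left to the code that embeds the value.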

GoogleCodeExporter commented 8 years ago
This was the XSL content posted:
<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
xmlns:asp="remove">
  <xsl:output method="xml" indent="yes" encoding="utf-8" omit-xml-declaration="yes" />
  <xsl:template match="/">
    <xsl:apply-templates select="form"></xsl:apply-templates>
  </xsl:template>
  <xsl:template match="form">
    <H2>
      <xsl:value-of select="@form_description

But when I get it from the API:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:asp="remove"><br />  <xsl:output method="xml" indent="yes" encoding="utf-8" omit-xml-declaration="yes" /><br />  <xsl:template match="/"><br />    <xsl:apply-templates select="form"></xsl:apply-templates><br />  </xsl:template><br />  <xsl:template match="form"><br />    <H2><br />      <xsl:value-of select="@form_description" /> ...

When the blog renders on blogspot, the encoded HTML renders without being decoded,
but when I receive the content via the API, the HTML is decoded, which is why the
display gets messed up on my site.

Now, to be safe, I can encode the content before embedding. However, this will
encode all the blog content, including the <br/> tags and everything else that
Blogger sends back. The only workaround option I see is if I can receive the
pre-decoded content back. Any ideas?

My blogs aside, how would one pull up a blog that has encoded content in it
through the API? Or how can you post encoded content so that it is not decoded?
If this is not possible, I can work around it with images or something else, I guess.

Thank you for all your help so far.

Original comment by pra...@gmail.com on 14 Apr 2009 at 6:21

GoogleCodeExporter commented 8 years ago
This is still an issue.

If you post a blog (edit html mode) with the following:
&lt;xml&gt;&lt;/xml&gt;<br/>&lt;xml&gt;&lt;/xml&gt;
You expect the following output (which is correct in blogger):
<xml></xml>
<xml></xml>

However, if you get the blog through the API via content.content, the <xml> tags
are decoded:
<xml></xml><br/><xml></xml>

This leaves you with the problem of not being able to use Server.HtmlEncode, since
you will end up with all your HTML being encoded (not just the <xml> tags, which
need to be). I tried multiple ways with no luck. Any suggestions?

Original comment by pra...@gmail.com on 17 Jun 2009 at 11:17
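
One possible workaround for the "encode everything or nothing" problem described above is to encode only the tags that are not expected as real markup. The following is a rough sketch under stated assumptions, not a fix in the library: the SelectiveEncoder class and its allowlist are hypothetical, the regular expression does not handle attribute values that contain ">", and allowlisted tags pass through with their attributes untouched, so this is not an HTML sanitizer.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class SelectiveEncoder
{
    // Assumed allowlist: tags that should keep working as markup on the page.
    private static readonly HashSet<string> AllowedTags =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "br", "div", "img", "p", "a", "span" };

    // Matches an opening, closing, or self-closing tag and captures its name.
    private static readonly Regex TagPattern =
        new Regex(@"</?([A-Za-z][A-Za-z0-9:-]*)[^<>]*>", RegexOptions.Compiled);

    // Leaves allowlisted tags alone and entity-encodes every other tag.
    public static string EncodeUnknownTags(string html)
    {
        return TagPattern.Replace(html, m =>
            AllowedTags.Contains(m.Groups[1].Value)
                ? m.Value
                : WebUtility.HtmlEncode(m.Value));
    }

    public static void Main()
    {
        string fromApi = "<xml></xml><br/><xml></xml>";
        Console.WriteLine(EncodeUnknownTags(fromApi));
        // prints: &lt;xml&gt;&lt;/xml&gt;<br/>&lt;xml&gt;&lt;/xml&gt;
    }
}
```

The allowlist approach was chosen over a blocklist because the set of tags that Blogger adds around a post is small, while the set of tags a post might want to show as text is open-ended.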

GoogleCodeExporter commented 8 years ago
Give me more.

Are you saying that you can create the correct blogger content (via API or UI),
and that when you retrieve it, the content.content property is decoded, hence
the problem?

If so, what is actually sent over the wire by blogger? Can you put a wire trace
of that response in here?

Frank

Original comment by fman...@gmail.com on 18 Jun 2009 at 9:11

GoogleCodeExporter commented 8 years ago
That is exactly the issue I'm facing. To avoid confusion, I took screenshots as well.

The correct blogger content is entered via the Blogger UI (BlogSpot). It previews
correctly and displays on BlogSpot correctly. When I retrieve it via the API, it
decodes the <xml></xml> tags. I've attached a screenshot of the returned text
(debug mode). Is this enough information to start? I am just using the .dll files
of the API and cannot dig further into the source code of the API. Thank you.

Original comment by pra...@gmail.com on 19 Jun 2009 at 6:41

GoogleCodeExporter commented 8 years ago
I wasn't able to do a trace of the exact information returned, but the content
of the post returned from the atom feed is shown in the attached screenshot (<xml>
is returned instead of the encoded &lt;xml&gt; value).

Original comment by pra...@gmail.com on 19 Jun 2009 at 7:01

GoogleCodeExporter commented 8 years ago
The problem is that this does not help me yet. I have no problem believing that
this is what the property shows. But is blogger returning this when you ask the
API? Meaning, does it send this over the wire?

Get Fiddler (www.fiddlertool.org, I think). That makes it easy to observe what is
actually being sent.

Original comment by fman...@gmail.com on 19 Jun 2009 at 7:33

GoogleCodeExporter commented 8 years ago
I will try Fiddler and post the HTTP trace shortly.  Thank you.

Original comment by pra...@gmail.com on 19 Jun 2009 at 7:39

GoogleCodeExporter commented 8 years ago
Response from executing the statement atom = myService.Query(query);

The original blog content (post: Test Encoding) is received from Blogger:

<title type='text'>Test Encoding...</title><content
type='html'>&lt;xml&gt;&lt;/xml&gt;<br/>&lt;xml&gt;&lt;/xml&gt;<div
class="blogger-post-footer"><img width='1' height='1'
src='https://blogger.googleusercontent.com/tracker/539865380447153126-2401390322204149377?l=prajnodotcom.blogspot.com'/></div></content>

Full Response is attached.

Original comment by pra...@gmail.com on 19 Jun 2009 at 7:50
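
To make the layers of escaping easier to reason about, here is a small sketch of how an Atom entry with type='html' content behaves under one round of XML parsing. It rests on assumptions: the entry below is hand-written, not the attached response, and it assumes the HTML payload is escaped once more inside the XML, as Atom expects for type='html'. Whether the trace quoted above was already unescaped once by the capture tool or the tracker cannot be told from the thread. The AtomContentDemo class is hypothetical.

```csharp
using System;
using System.Net;
using System.Xml.Linq;

// Hypothetical helper class, for illustration only.
public static class AtomContentDemo
{
    private static readonly XNamespace Atom = "http://www.w3.org/2005/Atom";

    // Returns the text value of the entry's <content> element after
    // ordinary XML parsing (one level of entity decoding).
    public static string ReadHtmlContent(string entryXml)
    {
        XElement entry = XElement.Parse(entryXml);
        return (string)entry.Element(Atom + "content");
    }

    public static void Main()
    {
        // Assumed wire form: the HTML payload escaped once more for XML.
        string entryXml =
            "<entry xmlns='http://www.w3.org/2005/Atom'>" +
            "<title type='text'>Test Encoding</title>" +
            "<content type='html'>&amp;lt;xml&amp;gt;&amp;lt;/xml&amp;gt;&lt;br/&gt;" +
            "&amp;lt;xml&amp;gt;&amp;lt;/xml&amp;gt;</content>" +
            "</entry>";

        string html = ReadHtmlContent(entryXml);
        Console.WriteLine(html);
        // prints: &lt;xml&gt;&lt;/xml&gt;<br/>&lt;xml&gt;&lt;/xml&gt;

        // A second decode would give the string reported for Content.Content.
        Console.WriteLine(WebUtility.HtmlDecode(html));
        // prints: <xml></xml><br/><xml></xml>
    }
}
```

Under these assumptions, a single decode yields HTML that can be embedded as is, and only a second decode produces the string the reporter sees. The sketch does not establish that the library performs that second decode; comparing the raw bytes in the Fiddler trace with the property value would settle it.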
