thanhtin61294 / google-gdata

Automatically exported from code.google.com/p/google-gdata

Xml parsing error for collections with non-Roman characters #573

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I'm reporting a bug with the GData .NET library that occurs on some character 
sets, but only on certain queries.

Here's what happens:
1. I need to get a list equivalent to "Collections shared with me" in the web 
browser.
2. I query Google Docs with
    GET /feeds/default/private/full/folder%3Aroot/contents/-/folder?showroot=true
    (I've also tried other variations, such as adding q=* or using a Content-Type without "charset=UTF-8".)
3. The server replies with the list of collections I want.
4. For some users I get
Google.GData.Client.ClientFeedException: Parsing failed ---> System.Xml.XmlException: Invalid character in the given encoding. Line 1, position xxxx.

5. This only happens with non-English users who have accented letters in 
collection names (mostly French, some Hungarian).
6. I don't see an invalid character at position xxxx.
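
One way to look for the offending bytes in a captured response body is to find the first place where strict UTF-8 decoding fails and dump the surrounding bytes in hex. The sketch below is in Python rather than .NET, and the helper name `first_invalid_utf8` and the sample payload are made up for illustration. The position reported by System.Xml may be a character position rather than a byte offset, so the two numbers need not match exactly.

```python
def first_invalid_utf8(raw: bytes, context: int = 8):
    """Return (byte offset, hex dump of nearby bytes) for the first
    sequence that is not valid UTF-8, or None if the data decodes cleanly."""
    try:
        raw.decode("utf-8")  # strict decoding: raises on any invalid sequence
    except UnicodeDecodeError as err:
        lo = max(0, err.start - context)
        hi = min(len(raw), err.start + context)
        nearby = " ".join(f"{b:02x}" for b in raw[lo:hi])
        return err.start, nearby
    return None


# Hypothetical payload: an accented name encoded as ISO-8859-1, not UTF-8.
sample = "<title>Données</title>".encode("iso-8859-1")
result = first_invalid_utf8(sample)
if result is not None:
    offset, nearby = result
    print(f"first invalid byte at offset {offset}: {nearby}")
```

If the helper returns None for the real capture, the bytes are valid UTF-8 and the problem is more likely in how the stream is being decoded on the client side.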

This might be similar to this old issue:
http://code.google.com/p/google-gdata/issues/detail?id=248

Full stack trace
Inner: Google.GData.Client.ClientFeedException: Parsing failed ---> System.Xml.XmlException: Invalid character in the given encoding. Line 1, position 5424.
   at System.Xml.XmlTextReaderImpl.Throw(Exception e)
   at System.Xml.XmlTextReaderImpl.Throw(String res, String arg)
   at System.Xml.XmlTextReaderImpl.Throw(Int32 pos, String res)
   at System.Xml.XmlTextReaderImpl.InvalidCharRecovery(Int32& bytesCount, Int32& charsCount)
   at System.Xml.XmlTextReaderImpl.GetChars(Int32 maxCharsCount)
   at System.Xml.XmlTextReaderImpl.ReadData()
   at System.Xml.XmlTextReaderImpl.ParseAttributeValueSlow(Int32 curPos, Char quoteChar, NodeData attr)
   at System.Xml.XmlTextReaderImpl.ParseAttributes()
   at System.Xml.XmlTextReaderImpl.ParseElement()
   at System.Xml.XmlTextReaderImpl.ParseElementContent()
   at System.Xml.XmlTextReaderImpl.Read()
   at System.Xml.XmlTextReader.Read()
   at System.Xml.XmlLoader.ReadCurrentNode(XmlDocument doc, XmlReader reader)
   at System.Xml.XmlDocument.ReadNode(XmlReader reader)
   at Google.GData.Client.BaseFeedParser.OnNewExtensionElement(XmlReader reader, AtomBase baseObject)
   at Google.GData.Client.AtomFeedParser.ParseExtensionElements(XmlReader reader, AtomBase baseObject)
   at Google.GData.Client.AtomFeedParser.ParseEntry(XmlReader reader)
   at Google.GData.Client.AtomFeedParser.ParseSource(XmlReader reader, AtomSource source)
   at Google.GData.Client.AtomFeedParser.ParseFeed(XmlReader reader, AtomFeed feed)
   at Google.GData.Client.AtomFeedParser.Parse(Stream streamInput, AtomFeed feed)
   --- End of inner exception stack trace ---
   at Google.GData.Client.AtomFeedParser.Parse(Stream streamInput, AtomFeed feed)
   at Google.GData.Client.AtomFeed.Parse(Stream stream, AlternativeFormat format)
   at Google.GData.Extensions.FeedLink.ParseFeedLink(XmlNode node)
   at Google.GData.Extensions.FeedLink.CreateInstance(XmlNode node, AtomFeedParser parser)
   at Google.GData.Client.AtomEntry.Parse(ExtensionElementEventArgs e, AtomFeedParser parser)
   at Google.GData.Client.AtomFeed.OnNewExtensionsElement(Object sender, ExtensionElementEventArgs e)
   at Google.GData.Client.ExtensionElementEventHandler.Invoke(Object sender, ExtensionElementEventArgs e)
   at Google.GData.Client.AtomFeed.OnNewExtensionElement(Object sender, ExtensionElementEventArgs e)
   at Google.GData.Client.BaseFeedParser.OnNewExtensionElement(XmlNode node, AtomBase baseObject)
   at Google.GData.Client.BaseFeedParser.OnNewExtensionElement(XmlReader reader, AtomBase baseObject)
   at Google.GData.Client.AtomFeedParser.ParseExtensionElements(XmlReader reader, AtomBase baseObject)
   at Google.GData.Client.AtomFeedParser.ParseEntry(XmlReader reader)
   at Google.GData.Client.AtomFeedParser.ParseSource(XmlReader reader, AtomSource source)
   at Google.GData.Client.AtomFeedParser.ParseFeed(XmlReader reader, AtomFeed feed)
   at Google.GData.Client.AtomFeedParser.Parse(Stream streamInput, AtomFeed feed)
 Source: Google.GData.Client

Original issue reported on code.google.com by andrew.f...@gmail.com on 15 Feb 2012 at 11:32

GoogleCodeExporter commented 9 years ago
Hi Andrew,

I can't reproduce this issue. Do you know of a collection name that makes XML 
parsing fail 100% of the time?

Original comment by ccherub...@google.com on 21 Feb 2012 at 11:38

GoogleCodeExporter commented 9 years ago
Hi Claudio,

I could not reproduce it either. I tried on a separate account with the exact 
same shared collection names that our client was using.

These names are a long list of the form "AAA.partagé", "AAB.partagé" etc.

I could only reproduce the error when I had access to the client's account, and 
that is how I got that Fiddler packet capture of the error.

Perhaps it has to do with the language settings (French Canadian) that this 
client is using in Google Docs?

I had a look at fixing it myself, and it seems to be a reasonably common error 
with the MS XML library. Stack Overflow has this advice on how to fix it [1], 
whereas Microsoft advises setting the code page [2] of the encoding.
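
As a rough illustration of the general idea (decode the bytes with an explicitly chosen encoding before handing them to the XML parser), here is a minimal sketch in Python rather than .NET. It is not taken from either linked page and is not a patch for the GData library; the helper name `parse_with_fallback`, the `entry` element, and the `title` attribute are hypothetical, and ISO-8859-1 as the fallback is only an assumption about what the server might be sending.

```python
import xml.etree.ElementTree as ET


def parse_with_fallback(raw: bytes, fallback: str = "iso-8859-1"):
    """Parse XML bytes; if the parser rejects them, decode with an explicit
    fallback encoding (an assumption, not a detected value) and parse again."""
    try:
        # Without an XML declaration, the parser treats the bytes as UTF-8.
        return ET.fromstring(raw)
    except ET.ParseError:
        # ISO-8859-1 maps every byte to a character, so this decode cannot
        # fail, but it will silently mis-decode data in any other encoding.
        text = raw.decode(fallback)
        return ET.fromstring(text)


# Collection name from this thread, encoded as ISO-8859-1 instead of UTF-8.
raw = '<entry title="AAA.partagé"/>'.encode("iso-8859-1")
entry = parse_with_fallback(raw)
print(entry.get("title"))
```

A blind fallback like this can hide real data corruption, so it is probably better suited to diagnosing the problem than to shipping as a fix.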

Andrew

[1] http://stackoverflow.com/questions/8275825/how-to-prevent-system-xml-xmlexception-invalid-character-in-the-given-encoding
[2] http://msdn.microsoft.com/en-us/library/system.text.encodinginfo.getencoding.aspx

Original comment by andrew.f...@gmail.com on 22 Feb 2012 at 12:10

GoogleCodeExporter commented 9 years ago
Andrew, since you are the only one who can reproduce the issue and test a fix, 
would you like to send a patch for review?

Original comment by ccherub...@google.com on 22 Feb 2012 at 12:17

GoogleCodeExporter commented 9 years ago
Marking as obsolete; feel free to reopen if you can reproduce it.

Original comment by ccherub...@google.com on 3 Sep 2012 at 11:49

GoogleCodeExporter commented 9 years ago
Hi Claudio,

Yes, please close it. I could never reproduce it on English settings.

Original comment by andrew.f...@gmail.com on 4 Sep 2012 at 1:37