michael-adler / sync-google-contacts

Automatically exported from code.google.com/p/sync-google-contacts

cElementTree.ParseError: unbound prefix: line 667, column 2839 #5

GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Run the initial sync with the --debug flag.

What is the expected output? What do you see instead?
I expected the accounts to sync.  Instead, I get the following output:

$ ./contacts-sync --user=one --user=two --debug
Traceback (most recent call last):
  File "./contacts-sync", line 769, in <module>
    main()
  File "./contacts-sync", line 750, in main
    contacts.append(UserContacts(users[i]))
  File "./contacts-sync", line 104, in __init__
    feed = self.gd_client.GetContacts(q=query)
  File "/usr/local/lib/python2.7/dist-packages/gdata/contacts/client.py", line 201, in get_contacts
    desired_class=desired_class, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/gdata/client.py", line 640, in get_feed
    **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/gdata/client.py", line 278, in request
    version=get_xml_version(self.api_version))
  File "/usr/local/lib/python2.7/dist-packages/atom/core.py", line 520, in parse
    tree = ElementTree.fromstring(xml_string)
  File "<string>", line 106, in XML
cElementTree.ParseError: unbound prefix: line 667, column 2839

What version of the product are you using? On what operating system?
version 0.1 on Linux reason 3.0.0-17-generic #30-Ubuntu SMP Thu Mar 8 20:45:39 
UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Please provide any additional information below.
Great utility!  I'm excited to get it working.

Original issue reported on code.google.com by blakelar...@gmail.com on 3 Apr 2012 at 1:33

GoogleCodeExporter commented 9 years ago
The failure is internal to Google's Python gdata client, in its XML parsing of 
the contacts feed.  Line 104 of contacts-sync requests that up to 10,000 
contacts from an account be loaded into a data structure, and the failure is 
inside that call.  I do see that atom/core.py tries several locations when 
importing ElementTree.  I don't know the details of XML parsing, but the 
failure could be in one implementation of ElementTree and not another.  At 
least in my setup, the version I get is from the try statement on core.py 
line 26.

It is also possible that there is some piece of data in a contact in one of 
your accounts in a format not expected by the gdata code.  I've seen that 
before.
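
For reference, the error message itself can be reproduced in isolation.  The following is a minimal sketch, not taken from gdata: it feeds the standard-library ElementTree a made-up fragment in which an element uses a prefix (ns0:) that has no xmlns:ns0 declaration.  The fragments, the namespace URI, and the helper name try_parse are assumptions for illustration only.

```python
import xml.etree.ElementTree as ElementTree

# Hypothetical fragment: the ns0: prefix is used but never declared.
BAD_XML = "<entry><ns0:cc>0</ns0:cc></entry>"
# Same fragment with the prefix bound to a made-up namespace URI.
GOOD_XML = '<entry xmlns:ns0="urn:example"><ns0:cc>0</ns0:cc></entry>'


def try_parse(xml_string):
    """Return (True, None) on success, or (False, error) on a parse failure."""
    try:
        ElementTree.fromstring(xml_string)
        return True, None
    except ElementTree.ParseError as err:
        return False, err


ok, err = try_parse(BAD_XML)
print(ok, err)                 # the message names the problem and its position
print(try_parse(GOOD_XML)[0])  # the declared prefix parses without error
```

If this is what is happening in the real feed, the parser is rejecting the document as a whole, which would explain why one contact breaks the entire request.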

Sorry I can't be more helpful.  The problem appears to be specific either to 
your setup or to your data.

Original comment by mad...@tapil.com on 3 Apr 2012 at 8:20

GoogleCodeExporter commented 9 years ago
Well, I found the "bad" contact (one of B's 1250 contacts).  Then I did a dry 
run and everything looked good.  But the same bad contact in A (whose feed 
reads just fine) synced to B and caused the same error.  I had to delete the 
contact from B (it synced before crashing) and also from A.

THEN, it happened to a few more contacts the same way.  I had to wait until it 
crashed to find out which one needed to be deleted / recreated.  Luckily, with 
your program, I could see which contact was the offending one -- it crashes 
after the entry is read for user A and copied to user B.  Otherwise, you can't 
tell which one it is programmatically (at least as far as I can tell).
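
If the raw feed XML can be captured (for example, saved to a file before it is parsed), the line and column in the error message may help locate the offending entry.  This is only a sketch under that assumption: context_at is a hypothetical helper, not part of gdata or contacts-sync, and the sample feed is made up.  It assumes the column is a 0-based offset into the line, as the standard-library parser reports it; with non-ASCII data the parser may count bytes rather than characters, so the position can be slightly off.

```python
import xml.etree.ElementTree as ElementTree


def context_at(xml_text, line, column, width=40):
    """Return the text within `width` characters of a parser error position.

    `line` is 1-based and `column` is a 0-based offset into that line.
    """
    lines = xml_text.split("\n")
    if not 1 <= line <= len(lines):
        raise ValueError("line %d is outside the document" % line)
    target = lines[line - 1]
    start = max(0, column - width)
    return target[start:column + width]


# Made-up feed with an undeclared ns0: prefix on its second line.
sample = "<feed>\n<entry><title>Alice</title><ns0:cc>0</ns0:cc></entry>\n</feed>"
try:
    ElementTree.fromstring(sample)
except ElementTree.ParseError as err:
    line, column = err.position
    snippet = context_at(sample, line, column)
    print("parse failed near:", snippet)
```

For the traceback in this issue, the call would be context_at(raw_xml, 667, 2839), where raw_xml is the saved feed.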

Original comment by blakelar...@gmail.com on 9 Apr 2012 at 6:03

GoogleCodeExporter commented 9 years ago
It turns out most of the bad contacts came about from an import.  Let me 
explain for the future:

My wife had her contacts transferred from her SIM card onto her new iPhone.  
Wanting to use Gmail contacts via Exchange Active Sync, I exported the contacts 
on her phone to a CSV and imported them into Gmail.  It seems that over half of 
these contacts had the syncing problem I described:  when syncing with my 
Gmail account, they would be added to my account and then the sync would 
crash.  Any subsequent Google API access to my account resulted in the same 
crash.  So, if you're having this problem:

1. Always keep track of that first failed sync and which contact it failed on.  
You can't tell which contact that is later, and so you have to use trial and 
error.
2. Trial and error:  put contacts into a group and query that group.  If it 
doesn't crash, that group is OK.  If it does, then one of those contacts is 
bad.  Repeat with a subgroup.
3. If you find that a group of contacts may be the problem, then export those 
contacts in Gmail to a Google CSV and re-import.  That fixed it all for me.
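
The trial-and-error search in step 2 is a bisection, and can be sketched as below.  This is an illustration only: find_bad_items and loads_ok are hypothetical names, and loads_ok stands in for "put these contacts into a group and query that group".  It assumes a group fails exactly when it contains at least one bad contact.

```python
def find_bad_items(items, loads_ok):
    """Return the items that make a group fail, found by repeated halving.

    loads_ok(group) must return True when the whole group loads without error.
    """
    if not items or loads_ok(items):
        return []                 # nothing bad in this group
    if len(items) == 1:
        return list(items)        # a failing group of one is the culprit
    mid = len(items) // 2
    return (find_bad_items(items[:mid], loads_ok)
            + find_bad_items(items[mid:], loads_ok))


# Stand-in data: 16 contacts, two of which are "bad".
contacts = ["c%d" % i for i in range(16)]
bad = {"c3", "c11"}


def loads_ok(group):
    # In real use this would query the group through the API; here it is faked.
    return not bad.intersection(group)


print(find_bad_items(contacts, loads_ok))
```

With a single bad contact among 1250, this takes on the order of 20 group queries rather than one per contact.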

In summary, this is a problem with the Google Contacts API.  

Original comment by blakelar...@gmail.com on 11 Apr 2012 at 2:53

GoogleCodeExporter commented 9 years ago
You can work around this by patching the Python gdata code so that, when 
parsing fails, it strips the unbound ns0: prefix from the offending element 
and parses again.  In your gdata/client.py file (location will depend on your 
system, mine is /usr/share/pyshared/gdata/contacts/client.py) around line 275 
change
    return atom.core.parse(response.read(), desired_class, version=get_xml_version(self.api_version))
to
    xml_string = response.read()
    try:
        return atom.core.parse(xml_string, desired_class, version=get_xml_version(self.api_version))
    except Exception:
        import re
        try:
            # Drop the unbound ns0: prefix inside the GCon extendedProperty element.
            regex = r"""(<gd:extendedProperty [^>]*name='GCon'[^>]*><)(?:ns0:)(cc>0</)(?:ns0:)(cc></gd:extendedProperty>)"""
            # Raw string, so \1\2\3 are group references rather than control characters.
            xml_string = re.sub(regex, r'\1\2\3', xml_string)
            return atom.core.parse(xml_string, desired_class, version=get_xml_version(self.api_version))
        except Exception:
            # Save the unparseable feed for inspection, then re-raise.
            with open('/tmp/err', 'wb') as f:
                f.write(xml_string)
            raise
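
To see what the substitution is meant to do, here is a small standalone sketch.  The pattern is copied from the patch; the sample element is an assumption, reconstructed from what the pattern expects, and strip_unbound_prefix is a hypothetical helper name.  The replacement is written as a raw string here so that \1\2\3 are treated as group references.

```python
import re

# Pattern copied from the patch above.
REGEX = r"""(<gd:extendedProperty [^>]*name='GCon'[^>]*><)(?:ns0:)(cc>0</)(?:ns0:)(cc></gd:extendedProperty>)"""

# Hypothetical input, reconstructed from what the pattern matches.
sample = "<gd:extendedProperty name='GCon'><ns0:cc>0</ns0:cc></gd:extendedProperty>"


def strip_unbound_prefix(xml_string):
    """Remove the ns0: prefix from the cc element inside a GCon property."""
    # The two (?:ns0:) groups are matched but not captured, so they are dropped.
    return re.sub(REGEX, r"\1\2\3", xml_string)


print(strip_unbound_prefix(sample))
```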

Original comment by daibu...@gmail.com on 6 Dec 2012 at 9:56

GoogleCodeExporter commented 9 years ago
The bug is (or was) in the library code.  client.py has changed since 
comment #4, and I don't know whether the bug was fixed.

Original comment by mad...@tapil.com on 25 Dec 2014 at 2:52