Open robnardo opened 9 years ago
Hello, and thanks for reporting this issue. I haven't been able to reproduce it yet, but I've enabled the automated tests to run on Python 3.4 as well. It would help a lot if you could tell me more about how you hit this issue. Could you post the filename, or parts of it, so I can try to write a test that fails?
I am having this issue with the following code:
# etree is of type <class 'xml.etree.ElementTree.Element'>
class Page:
    def __init__(self, etree):
        self.etree = etree
        self.untangled = untangle.parse(ET.tostring(etree))
Traceback:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-26-05be7dcdfcdc> in <module>()
21
22 for child in root:
---> 23 print(parse_to_obj(child))
<ipython-input-26-05be7dcdfcdc> in parse_to_obj(etree)
9 return File(etree)
10 else:
---> 11 return Page(etree)
12
13 class Page:
<ipython-input-26-05be7dcdfcdc> in __init__(self, etree)
14 def __init__(self, etree):
15 self.etree = etree
---> 16 self.untangled = untangle.parse(ET.tostring(etree))
17
18 class File:
/Users/mplewis/.pyenv/versions/3.5.0/lib/python3.5/site-packages/untangle.py in parse(filename)
138 sax_handler = Handler()
139 parser.setContentHandler(sax_handler)
--> 140 if os.path.exists(filename) or is_url(filename):
141 parser.parse(filename)
142 else:
/Users/mplewis/.pyenv/versions/3.5.0/lib/python3.5/site-packages/untangle.py in is_url(string)
147
148 def is_url(string):
--> 149 return string.startswith('http://') or string.startswith('https://')
150
151 # vim: set expandtab ts=4 sw=4:
TypeError: startswith first arg must be bytes or a tuple of bytes, not str
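For reference, the failure in the last frame can be reproduced without untangle at all. This is only a minimal sketch: it assumes that ET.tostring() returns bytes on Python 3 when no encoding="unicode" is passed, the is_url body is copied from the traceback above, and the sample XML is made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_url(string):
    # Same body as untangle's is_url shown in the traceback above.
    return string.startswith('http://') or string.startswith('https://')


# Hypothetical sample element, standing in for the real document.
element = ET.fromstring('<page><title>Francés</title></page>')
data = ET.tostring(element)  # bytes on Python 3 with the default encoding

try:
    is_url(data)
    error = None
except TypeError as exc:
    # bytes.startswith() refuses a str prefix, which is the reported error.
    error = exc

print(type(data).__name__, error)
```

If that assumption holds, calling ET.tostring(etree, encoding="unicode") instead yields a str, which should at least get past the is_url check.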
I'll try to have a look at this. @mplewis could you also maybe post the xml you're parsing against?
I've needed to do the same as @robnardo. In my case I do something like:
a=requests.get("http://whatever_returns_an_xml/")
b=untangle.parse(a.text)
The returned XML sometimes contains non-ASCII characters such as "Francés",
and without any changes it fails with a "cannot encode unicode" error.
If I do untangle.parse(a.text.encode('UTF-8'))
it fails like this:
File "/usr/local/lib/python3.4/dist-packages/untangle.py", line 149, in is_url
return string.startswith('http://') or string.startswith('https://')
TypeError: <flask_script.commands.Command object at 0x7f70316c34e0>: startswith first arg must be bytes or a tuple of bytes, not str
So using robnardo's edit it works as expected.
PS: I use requests rather than untangle's own URL handling because I need to edit some headers before sending the request.
Can you test this again with the newly released version 1.1.1?
Just wanted to say I had the same issue under Python 3.5 (Python 2.7 worked OK). Making the same changes as @robnardo fixed the issue for me too.
I added one more test in #89 but wasn't able to reproduce this. Would appreciate a concrete failing test.
Hi, I am using your library and getting some errors when trying to run it under Python 3.4.0. I only recently started working with Python (so I'm not an expert), but I was able to fix it for my needs by editing untangle.py on lines 143 and 149.
So I changed line 143 to
parser.parse(StringIO(filename.decode('utf-8')))
and line 149 to
return string.startswith(b'http://') or string.startswith(b'https://')
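Note that these two edits only help when the input is bytes: on Python 3 a str has no decode() method, and str.startswith() rejects a bytes prefix with a similar TypeError. A variant that accepts both types might look like the sketch below. It is not untangle's actual code; to_text is a hypothetical helper name, and UTF-8 is an assumed default encoding.

```python
def is_url(value):
    """Return True if value looks like an http(s) URL (str or bytes)."""
    if isinstance(value, bytes):
        prefixes = (b'http://', b'https://')
    else:
        prefixes = ('http://', 'https://')
    # startswith accepts a tuple of prefixes, as long as the types match.
    return value.startswith(prefixes)


def to_text(value, encoding='utf-8'):
    """Hypothetical helper: decode bytes to str, return str unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


print(is_url('http://example.com/feed.xml'))
print(is_url(b'<page/>'))
print(to_text('<t>Francés</t>'.encode('utf-8')))
```

With something like this, the parse step could wrap to_text(filename) in StringIO regardless of which type the caller passed in.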