csingley / ofxtools

Python OFX Library
Other
301 stars 68 forks source link

str vs bytes issue in download #44

Closed euresti closed 6 years ago

euresti commented 6 years ago

a1837d1007179c14fd309606b94e7d00459de7fc changed: data += ET.tostring(tree, encoding='unicode') to data += ET.tostring(tree, encoding='utf-8') but I argue this is incorrect since data is a str and we need the tostring to return a str, but with the encoding in Python 3 it will return bytes

Anyway before making a diff to put it back I wanted to check with @fergbrain whether he had seen an issue.

fergbrain commented 6 years ago

Yes, I did have issue with Python2...which is why I proposed the change in the first place. My understanding from the docs is that “unicode” is not a valid type for encoding because it’s not on the list shown in footnote 1 (“The encoding string included in XML output should conform to the appropriate standards. For example, “UTF-8” is valid, but “UTF8” is not. See https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl and https://www.iana.org/assignments/character-sets/character-sets.xhtml.”)

csingley commented 6 years ago

My edit did this:

Use encoding="unicode" to generate a Unicode string (otherwise, a bytestring is generated).

But this is not the behavior in Python 2. @ferbrain's edit to fix Python 2 breaks Python 3.

I should have known better than to touch this after @rianhunter fixed it; this should all be reverted to how it was before.

euresti commented 6 years ago

Ah. Maybe throw a if PYTHON_VERSION == 3: and do unicode in one path and utf-8 in the other.

csingley commented 6 years ago

Should be fixed in 40b4a21. @fergbrain if you could verify correct behaviour under py2k, thanksverymuch.

fergbrain commented 6 years ago

Confirmed it works with Python 2.7.15

csingley commented 6 years ago

Word then I'mma mark this fixed