IATI / CSV2IATI

[DEPRECATED] - A tool to convert CSVs to IATI XML

UTF-16 bug #181

Closed Bjwebb closed 2 years ago

Bjwebb commented 10 years ago

Looks like the CSV converter is not handling UTF-16 properly:

[Tue Jan 21 21:41:55 2014] [error] Traceback (most recent call last):
[Tue Jan 21 21:41:55 2014] [error]   File "/home/pythonuser/CSV-IATI-Converter.modeleditor/pyenv/lib/python2.7/site-packages/flask/app.py", line 1817, in wsgi_app
[Tue Jan 21 21:41:55 2014] [error]     response = self.full_dispatch_request()
[Tue Jan 21 21:41:55 2014] [error]   File "/home/pythonuser/CSV-IATI-Converter.modeleditor/pyenv/lib/python2.7/site-packages/flask/app.py", line 1477, in full_dispatch_request
[Tue Jan 21 21:41:55 2014] [error]     rv = self.handle_user_exception(e)
[Tue Jan 21 21:41:55 2014] [error]   File "/home/pythonuser/CSV-IATI-Converter.modeleditor/pyenv/lib/python2.7/site-packages/flask/app.py", line 1381, in handle_user_exception
[Tue Jan 21 21:41:55 2014] [error]     reraise(exc_type, exc_value, tb)
[Tue Jan 21 21:41:55 2014] [error]   File "/home/pythonuser/CSV-IATI-Converter.modeleditor/pyenv/lib/python2.7/site-packages/flask/app.py", line 1475, in full_dispatch_request
[Tue Jan 21 21:41:55 2014] [error]     rv = self.dispatch_request()
[Tue Jan 21 21:41:55 2014] [error]   File "/home/pythonuser/CSV-IATI-Converter.modeleditor/pyenv/lib/python2.7/site-packages/flask/app.py", line 1461, in dispatch_request
[Tue Jan 21 21:41:55 2014] [error]     return self.view_functions[rule.endpoint](**req.view_args)
[Tue Jan 21 21:41:55 2014] [error]   File "/home/pythonuser/CSV-IATI-Converter.modeleditor/csviatimodeleditor/__init__.py", line 310, in model_change_csv
[Tue Jan 21 21:41:55 2014] [error]     columnnames = the_csv.fieldnames
[Tue Jan 21 21:41:55 2014] [error]   File "/usr/lib/python2.7/csv.py", line 90, in fieldnames
[Tue Jan 21 21:41:55 2014] [error]     self._fieldnames = self.reader.next()
[Tue Jan 21 21:41:55 2014] [error] Error: line contains NULL byte

Bjwebb commented 10 years ago

It seems the Python csv module has problems with NUL bytes, which make it impossible to use directly on UTF-16 encoded files:

http://docs.python.org/2/library/csv.html

"The csv module doesn’t directly support reading and writing Unicode, but it is 8-bit-clean save for some problems with ASCII NUL characters. So you can write functions or classes that handle the encoding and decoding for you as long as you avoid encodings like UTF-16 that use NULs. UTF-8 is recommended."
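
To illustrate why UTF-16 triggers the "line contains NULL byte" error, here is a small sketch that encodes the same header row as UTF-16 and as UTF-8 and checks for NUL bytes. The header text `id,title` is made up for the example and is not taken from the failing file.

```python
# Encode one ASCII-only header row two ways and look for NUL (0x00) bytes.
header = u"id,title\n"  # hypothetical header, for illustration only

utf16_bytes = header.encode("utf-16-le")  # two bytes per ASCII character
utf8_bytes = header.encode("utf-8")       # one byte per ASCII character

# Every ASCII character in UTF-16 is padded with a 0x00 byte,
# which is what a byte-oriented CSV reader rejects.
print(b"\x00" in utf16_bytes)  # True
print(b"\x00" in utf8_bytes)   # False
```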

Bjwebb commented 10 years ago

A fix might be to use https://pypi.python.org/pypi/unicodecsv/
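
Another possible direction, sketched here only as an assumption and not as what the project adopted, is to decode the uploaded bytes to text before the csv module ever sees them. The sketch below targets Python 3 and the standard library only, whereas the traceback above comes from Python 2.7. `read_fieldnames` is a hypothetical helper, and falling back to UTF-8 when no UTF-16 byte order mark is found is also an assumption.

```python
import codecs
import csv
import io


def read_fieldnames(raw_bytes):
    """Return the CSV header row from raw upload bytes (hypothetical helper)."""
    # Pick a codec from the byte order mark; assume UTF-8 when there is none.
    if raw_bytes.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        encoding = "utf-16"      # this codec consumes the BOM itself
    else:
        encoding = "utf-8-sig"   # strips a UTF-8 BOM if one is present
    text = raw_bytes.decode(encoding)
    # csv now receives decoded text, so the 0x00 padding bytes of UTF-16
    # never reach the parser.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return reader.fieldnames


# Usage: a UTF-16 upload (with BOM) and a plain UTF-8 upload give the same header.
sample = u"id,title\n1,Water project\n"
print(read_fieldnames(sample.encode("utf-16")))
print(read_fieldnames(sample.encode("utf-8")))
```

On the Python 2.7 deployment shown in the traceback, the equivalent idea would be to transcode the upload to UTF-8 before handing it to csv, in line with the documentation's recommendation of UTF-8.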