larsyencken / csvdiff

Generate a diff between two tabular datasets expressed in CSV files.
BSD 3-Clause "New" or "Revised" License

Comparing CSV files with a BOM while using ignore_columns on the first column raises an error at sequence[key].pop(i) #22

PpAaLlMmEeRr closed this issue 7 years ago

PpAaLlMmEeRr commented 7 years ago

Error message:

Traceback (most recent call last):
  File "C:\Python27\Scripts\csvdiff-script.py", line 9, in <module>
    load_entry_point('csvdiff==0.3.1', 'console_scripts', 'csvdiff')()
  File "C:\Python27\lib\site-packages\click\core.py", line 716, in __call__
    return self.main(*args, **kwargs)
  File "C:\Python27\lib\site-packages\click\core.py", line 696, in main
    rv = self.invoke(ctx)
  File "C:\Python27\lib\site-packages\click\core.py", line 889, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Python27\lib\site-packages\click\core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "C:\Python27\lib\site-packages\csvdiff-0.3.1-py2.7.egg\csvdiff\__init__.py", line 151, in csvdiff_cmd
    sep=sep, ignored_columns=ignore_columns)
  File "C:\Python27\lib\site-packages\csvdiff-0.3.1-py2.7.egg\csvdiff\__init__.py", line 181, in _diff_and_summarize
    diff = patch.create(from_records, to_records, index_columns, ignored_columns)
  File "C:\Python27\lib\site-packages\csvdiff-0.3.1-py2.7.egg\csvdiff\patch.py", line 208, in create
    from_indexed = records.filter_ignored(from_indexed, ignore_columns)
  File "C:\Python27\lib\site-packages\csvdiff-0.3.1-py2.7.egg\csvdiff\records.py", line 52, in filter_ignored
    sequence[key].pop(i)
KeyError: u'aa'

The error is raised by sequence[key].pop(i). Comparing CSV files that start with a BOM needs some extra handling there, for example something like: sequence[key].pop(i.encode('utf-8-sig'))
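
A likely explanation for the KeyError, sketched here under assumptions rather than taken from the csvdiff source: when a file that starts with a UTF-8 BOM is decoded as plain UTF-8, the BOM character stays attached to the first header name, so a lookup for 'aa' finds no such column. The sample bytes and the header_names helper below are hypothetical and only illustrate the effect with Python 3's standard csv module.

```python
import csv
import io

# Hypothetical file contents: a UTF-8 BOM followed by a two-column CSV.
raw = b"\xef\xbb\xbfaa,bb\r\n1,2\r\n"


def header_names(data, encoding):
    """Decode the bytes and return the header names csv.DictReader sees."""
    text = io.StringIO(data.decode(encoding), newline="")
    return csv.DictReader(text).fieldnames


# Plain UTF-8 keeps the BOM glued to the first column name.
plain = header_names(raw, "utf-8")

# The utf-8-sig codec drops a leading BOM while decoding.
bom_aware = header_names(raw, "utf-8-sig")

print(plain)      # first name carries a leading U+FEFF
print(bom_aware)  # first name is just 'aa'
```

If this is what happens inside csvdiff, it would explain why the error only shows up when the ignored column is the first one in the file.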

larsyencken commented 7 years ago

I'm using Python's native csv module and native UTF-8 handling. Whilst we could allow CSV files with this kind of encoding, can I recommend instead that you just convert them to normal UTF-8 without BOM before doing a diff?
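
For anyone landing here later, the conversion suggested above can be sketched in a few lines of Python. This is a minimal example and not part of csvdiff; the function name strip_utf8_bom and the file paths are made up for illustration. It works on raw bytes, so it does not depend on how the rest of the file is encoded.

```python
import codecs


def strip_utf8_bom(src_path, dst_path):
    """Copy src_path to dst_path, dropping a leading UTF-8 BOM if present."""
    with open(src_path, "rb") as src:
        data = src.read()
    # codecs.BOM_UTF8 is the three-byte sequence EF BB BF.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    with open(dst_path, "wb") as dst:
        dst.write(data)
```

Calling strip_utf8_bom("with_bom.csv", "no_bom.csv") on each input (hypothetical file names) and then diffing the converted copies should avoid the BOM-prefixed column name. Files without a BOM are copied unchanged.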

PpAaLlMmEeRr commented 7 years ago

Converting is a good idea, thanks for the reply!