okfn / messytables

Tools for parsing messy tabular data. This is now superseded by https://github.com/frictionlessdata/tabulator-py
http://messytables.readthedocs.io/

Add CDFV2-unknown to MIMELOOKUP #158

Closed StevenMaude closed 8 years ago

StevenMaude commented 8 years ago

For wpp.xls in xypath's test fixtures (https://github.com/sensiblecodeio/xypath), recent versions of file (tested with file 5.25) detect the MIME type as "application/CDFV2-unknown" when given only the first 4K of the xls, which is all that messytables inspects. That type isn't currently recognised, so messytables fails when autodetecting the file type in get_mime().

messytables.error.ReadError: Did not recognise detected MIME type: "application/CDFV2-unknown".

(NB: if the whole file is used for detection, the MIME type with file 5.25 is "application/vnd.ms-excel" which would be allowed by messytables.)
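To illustrate the failure and the proposed fix, here is a minimal, self-contained sketch of a MIME lookup. It does not import messytables: the `ReadError` class, the contents of `MIMELOOKUP`, the `lookup_format` helper, and the `"XLS"`/`"CSV"` values are all hypothetical stand-ins, and mapping the new type to the Excel format is an assumption based on the file being an xls.

```python
# Hypothetical, simplified stand-in for a MIME lookup table; the real
# messytables MIMELOOKUP and get_mime() may differ in detail.


class ReadError(Exception):
    """Raised when a detected MIME type is not recognised."""


# Assumed mapping from detected MIME type to a table format key.
MIMELOOKUP = {
    "application/vnd.ms-excel": "XLS",
    "text/csv": "CSV",
}


def lookup_format(mimetype, lookup=MIMELOOKUP):
    """Return the format key for a MIME type, or raise ReadError."""
    try:
        return lookup[mimetype]
    except KeyError:
        raise ReadError(
            'Did not recognise detected MIME type: "%s".' % mimetype
        )


# Without an entry for the new type, the lookup fails.
try:
    lookup_format("application/CDFV2-unknown")
    error_message = None
except ReadError as exc:
    error_message = str(exc)

# With the extra entry (assumed here to map to the Excel format),
# the same detected type resolves normally.
patched_lookup = dict(MIMELOOKUP)
patched_lookup["application/CDFV2-unknown"] = "XLS"
result = lookup_format("application/CDFV2-unknown", patched_lookup)
```

Adding one entry to the table keeps the 4K detection window unchanged, at the cost of treating any CDFV2 container that file cannot identify further as a spreadsheet.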

I think the relevant change in file is here: https://github.com/file/file/commit/4c195c2c22236b8cd169b80ef1809ab753d36b25#diff-164f7aa10d841313928ec217ed258cbfL519

StevenMaude commented 8 years ago

(All tests pass on Python 2.7 and 3.4 locally.)

frabcus commented 8 years ago

I'm hitting this while working on #157. It's simple, so I'm merging.