wireservice / csvkit

A suite of utilities for converting to and working with CSV, the king of tabular file formats.
https://csvkit.readthedocs.io
MIT License

in2csv --date-format not handling xls DD-MMM date formatted column #1190

Closed johnandrea closed 1 year ago

johnandrea commented 1 year ago

A column formatted with the code DD-MMM, holding values such as 30-Dec, is not converted by in2csv, even with --date-format. The output contains a bare number such as 38820.0 instead of a date.

Example: https://mi.lincolnshiremarriages.org.uk/bostonRD.xls

jpmckinney commented 1 year ago

For whatever reason, that XLS file encodes all cells as XL_CELL_TEXT, including the date cells, which should be XL_CELL_DATE. csvkit (more specifically, agate-excel) only parses dates from cells typed as XL_CELL_DATE.

So, unfortunately, there's not much csvkit can do here: trying to convert every cell to a date, just in case it was intended to be one, is very expensive and probably not what most people want.

You could try re-saving the XLS file after setting the cell type.
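
If re-saving isn't practical, another option is to post-process the CSV that in2csv produces. The following is only a minimal sketch, not something csvkit provides: it assumes the bare numbers are Excel serial dates in the 1900 date system, and the function names and the `married` column name are hypothetical, chosen for illustration.

```python
import csv
import io
from datetime import date, timedelta

# Assumption: the workbook uses Excel's 1900 date system. Excel wrongly
# treats 1900 as a leap year, so for serials of 61 and above the effective
# epoch is 1899-12-30.
EXCEL_EPOCH = date(1899, 12, 30)


def excel_serial_to_date(value):
    """Convert an Excel serial such as "38820.0" to a date (time fraction dropped)."""
    serial = int(float(value))
    if serial < 61:
        # Serials below 61 are affected by the 1900 leap-year bug; reject them
        # rather than return a date that may be off by one day.
        raise ValueError(f"serial {serial} is outside the supported range")
    return EXCEL_EPOCH + timedelta(days=serial)


def convert_date_column(csv_text, column):
    """Return csv_text with the named column rewritten as ISO 8601 dates."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        try:
            row[column] = excel_serial_to_date(row[column]).isoformat()
        except (ValueError, OverflowError):
            pass  # leave blanks and non-numeric values untouched
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical sample rows standing in for real in2csv output.
    sample = "name,married\nA,38820.0\nB,n/a\n"
    print(convert_date_column(sample, "married"), end="")
```

Under these assumptions, 38820.0 maps to 2006-04-13. ISO output is used here because a DD-MMM rendering would drop the year; swap in `strftime` if you want the original display format back.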