mikeAdamss closed 3 years ago
Doesn't messytables require xlrd==1.2 for its Excel support anyway? As far as I can see, replacing the conversion from ods to xlsx wouldn't remove the underlying dependency.
Just to document our chat on this: yes, but we'd build off this messytables branch: https://github.com/GSS-Cogs/messytables/commit/ed9f3ed1ab36c86f533fbb1616c974077777857e. That is now viable because the databaker table loaders (the things that return "tabs") are no longer dependent on the quirks of that old xlrd release.
merged.
We're currently handling ods in gssutils by converting ods files to xls files and then passing them in. This is what gives us the simple "dataframe" of cells without cell properties. See: https://github.com/GSS-Cogs/gss-utils/blob/6608fd45c03c5438d93b0311be3b9d5b20f3e99b/gssutils/transform/download.py#L66-L72
This also ties gssutils to dependencies we don't want it to have to worry about (i.e. the old version of xlrd as used by pyexcel).
This is an MVP to do that "convert to Excel" step for databaker, in databaker. Loading the cells without properties is fine (handling properties will be a separate, much bigger piece of work in the future), so in the event we're passed an ods file, just convert it to xls or xlsx and pass it into the relevant existing table loader.
Don't use any ods-compliant library that relies on old versions of xlrd; that would just recreate the original dependency problems. (I would personally explore the pandas, pyexcel and more up-to-date version of xlrd we're already using in databaker rather than importing more dependencies.)
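
For illustration, a minimal sketch of the "convert ods to xlsx, then hand it to the existing loader" idea, assuming pandas is the route taken. The function names (`needs_conversion`, `ods_to_xlsx`, `path_for_loader`) are hypothetical and not part of databaker. It also assumes pandas' `odf` reader engine (which needs the odfpy package installed) and openpyxl for writing, so whether that counts as an acceptable extra dependency would still need checking.

```python
import tempfile
from pathlib import Path


def needs_conversion(path) -> bool:
    """True when the file looks like an ods spreadsheet (by extension only)."""
    return Path(path).suffix.lower() == ".ods"


def ods_to_xlsx(ods_path, out_dir=None) -> Path:
    """Copy cell values from an ods file into a new xlsx file.

    Cell properties (formatting etc.) are not carried over, which
    matches the "cells without properties" scope of the MVP.
    """
    # Imported lazily so the helpers above work without pandas installed.
    import pandas as pd

    ods_path = Path(ods_path)
    out_dir = Path(out_dir) if out_dir else Path(tempfile.mkdtemp())
    xlsx_path = out_dir / (ods_path.stem + ".xlsx")

    # sheet_name=None returns a dict of {sheet name: DataFrame};
    # header=None keeps the first row as data rather than column names.
    sheets = pd.read_excel(ods_path, sheet_name=None, header=None, engine="odf")

    with pd.ExcelWriter(xlsx_path, engine="openpyxl") as writer:
        for name, frame in sheets.items():
            frame.to_excel(writer, sheet_name=name, index=False, header=False)

    return xlsx_path


def path_for_loader(path) -> Path:
    """Return a path the existing Excel table loader can open."""
    return ods_to_xlsx(path) if needs_conversion(path) else Path(path)
```

The result of `path_for_loader(...)` would then be passed to whichever existing table loader handles xlsx files; non-ods paths pass through unchanged.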