Closed metaodi closed 7 years ago
I just got bitten by the same bug here. To add details:
This happens during type guessing with DateUtilType
on an xls file, and is caused by this code: https://github.com/okfn/messytables/blob/master/messytables/types.py#L188
def test(self, value):
    if len(value) == 1:
        return False
    return CellType.test(self, value)
What seems to be happening is that the value at this point is already a datetime.datetime (I assume because the xls lib has already parsed the cell to a datetime?), so len(value) fails.
Having looked at this I'm wondering if the better solution is to have a similar test to the DateType above it in the file:
if isinstance(value, string_types) and not is_date(value):
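To illustrate the idea of guarding on the value's type, here is a minimal sketch. It is not the messytables code: `passes_length_check` is a hypothetical helper, and `string_types` is defined locally as a Python 3 stand-in for the `six` name. It only shows that applying the length check to strings alone avoids calling `len()` on datetimes and floats; what should then happen to non-string values is left to the real `CellType.test`.

```python
import datetime

# Assumption: Python 3 stand-in for six.string_types.
string_types = (str,)


def passes_length_check(value):
    """Hypothetical helper: reject only single-character strings.

    Non-string values (datetime, float, NaN) skip the len() call
    entirely, so no TypeError can be raised here.
    """
    if isinstance(value, string_types) and len(value) == 1:
        return False
    return True


results = [
    passes_length_check(datetime.datetime(2017, 1, 1)),  # already parsed by the xls reader
    passes_length_check(float("nan")),                   # float cell, no len()
    passes_length_check("1"),                            # single-character string
    passes_length_check("2017-01-01"),                   # ordinary date string
]
print(results)
```

In this sketch only the single-character string is rejected; everything else would be handed on to the normal type test.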
@rufuspollock looks good. Could you release this as 0.15.2?
@metaodi i don't have push rights to pypi atm since i lost access to my account email (long story). Needs to be some other folks @okfn ...
I created a new topic in discourse, let's see: https://discuss.okfn.org/t/create-new-release-of-messytables-on-pypi/4608
@rufuspollock: I don't generally jump in here too much, e.g. on deciding whether PRs should be merged, because it's not really our project to decide, but I can release a new version via scraperwiki PyPI account, if everyone's happy with that.
I've created a PR for this. If you're fine with it, merge, tag it as 0.15.2 and I'll later release a version at that commit to PyPI.
@StevenMaude thanks for the offer though I am certain other okfn folks have ability to publish here (I've almost never been the one who published this one).
@pwalsh @roll can you push this to pypi - with a bump in version?
@rufuspollock the scraper wiki team are the maintainers on this repo. No one currently at Open Knowledge has access to this on pypi. The most recent discussion on this with myself, @pudo and @StevenMaude confirmed this, as scraperwiki seem to be the main consumer of the package (as well as the CKAN codebase).
@StevenMaude as we discussed by mail a while back, you and your team have all needed rights on this repo. And, as the scraperwiki account has rights on pypi too, please feel free to go ahead and release a version.
For anyone else following, tabulator and even possibly goodtables, are in many ways the successors to messytables, with a sharper focus, and have been built for Python 3.x (with full Python 2.7.x support of course).
@pwalsh Done, 0.15.2 on PyPI now.
Since our current work on databaker, which uses messytables, is soon coming to a close, we won't be actively working on messytables, but don't mind keeping half an eye on things to make sure that e.g. new releases get pushed out.
@pwalsh - great and thanks for the clarification 😄 - was not aware of that change.
@StevenMaude great to have this live 👍
The problematic row in question has a NaN. Remove the NaN and it will work.
I could trace this back to #141 where len() is being used in the test() method of DateUtilType.
I think there should be a try/except block around that, that catches this TypeError. But I'm not too familiar with the code, so I'm basically asking if you agree, or if I'm missing something.
I'm happy to provide the PR.
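As a rough sketch of the try/except idea, assuming nothing about the surrounding messytables code: `safe_len_is_one` is a hypothetical helper name, and the sample values are made up for illustration. It shows that `len()` raises `TypeError` for floats (including NaN) and datetime objects, and that catching it lets the check fall through instead of crashing.

```python
import datetime


def safe_len_is_one(value):
    """Hypothetical helper: True only if value supports len() and has length 1."""
    try:
        return len(value) == 1
    except TypeError:
        # Floats (including NaN) and datetime objects have no len().
        return False


checks = {
    "nan": safe_len_is_one(float("nan")),
    "datetime": safe_len_is_one(datetime.datetime(2017, 1, 1)),
    "single char": safe_len_is_one("1"),
    "date string": safe_len_is_one("2017-01-01"),
}
print(checks)
```

Compared with an explicit isinstance check, this variant does not need to know which types the spreadsheet reader may return, at the cost of also swallowing a TypeError from any other source inside the try block.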
BTW: I'm getting this error via datapusher on some Excel sheet that is being parsed with the default parameters. The excel sheet has indeed a lot of float values in it.