blaze / datashape

Language defining a data description protocol
BSD 2-Clause "Simplified" License
183 stars 65 forks source link

BUG: bool data type not converting, when it should #217

Closed thequackdaddy closed 7 years ago

thequackdaddy commented 8 years ago

Hello,

This resolves (for me at least) #208.

I have a MSSQL table with a bit data column. In odo, we are converting that to bool, which is fine. However, bool types throw an error here. I'm not an expert on what is supposed to be going on here, but this gets it to work for me.

Thanks.

llllllllll commented 7 years ago

The issue here is that numpy does not have support for Option(bool) so we cannot convert that type. If your dataset has NULL, numpy cannot store that in a bool field.

thequackdaddy commented 7 years ago

@llllllllll Ahhh...

Should I close this then? Or is there a better way?

Really the usecase is that I have a MSSQL database that I read from with BIT fields. Its probably safe assuming that that when NULL implies False and I'd like odo/blaze to assume as much, even with a warning.

Would the blaze ecosystem be OK with a patch that automatically assumed this, but maybe throw an error? Maybe this is an odo behavior?

thequackdaddy commented 7 years ago

Actually I may have missed something, but this seems to work now when I have odo=0.5.1. It coerces bits to False when they are NULL which is fine by me.

Closing, and thanks to whoever/whatever fixed this.

llllllllll commented 7 years ago

Weird, I know there was some work in in that edge but I didn't think the mssql stuff got updated. I am not totally sure I support the current behavior, but only because it makes it easy to not realize that you actually had a bunch of NULLs in your data. I do agree that if you needed to pick, False is a better default, but it is not hard to make it explicit. With blaze you could do: bz.transform(tbl, col=tbl.col.coalesce(False)) which would always do the right thing and be explicit about any conversions.

I don't feel so strongly that I want to go change this right now but I don't know if this was an intended change. Right now the MSSql backend is not run with automated testing so it is hard to make sure the behavior stays the same.

thequackdaddy commented 7 years ago

@llllllllll I'm happy to attempt to write some testing code, if someone can give hints. Its not obvious to me how to test MSSQL...

I'd appreciate some clues.

llllllllll commented 7 years ago

We can probably use mostly the same set of tests for postgres or sqlite. What I don't know is how we would get an database available for travis to use.

thequackdaddy commented 7 years ago

@llllllllll I'd guess the answer w.r.t. travis is "no". From the brief amount of experience I have trying to get a *nix client to talk to MSSQL, its an absolute nightmare.

With that said, appveyor appears to support it, but the question would be would the odo team want to add appveyor testing? I'm aware that appveyor can be very messy...

llllllllll commented 7 years ago

I have worked with appveyor before and don't find it to be too much of a hassle. I am not sure who has access to enable the github hooks for appveyor for odo but if someone wanted to write an appveyor file I wouldn't mind adding it. I will open an issue on odo and xref this one.