Closed austinweisgrau closed 7 months ago
This will change default behavior for loading booleans from parsons Table to a database, so implementations based on the prior default behavior will likely break. e.g.
tbl = Table([{"bool_col": True}])
Redshift.copy(tbl, "tablename")
Redshift().query("select * from tablename where bool_col = 'True'")
will no longer work, instead the more normal code will be needed:
Redshift().query("select * from tablename where bool_col ")
I think this is a great change. 100% for making things more straightforward and intuitive. A few questions, Austin.
table = table.convert(['bool', 'columns', 'here', ...], str)
We could include that code snippet in the release notes to help anyone migrate who wants to retain the old behavior.
Yeah I agree, I guess I was also being cautious about maintaining "backwards compatibility" with something I don't really understand, but 1) I agree this whole feature set and configuration don't make any sense and aren't necessary 2) since this is going into a major release, it's ok to break some minor level of compatibility
So yeah I'll make some updates to just turn this on and remove the configurabilith
Ok so digging deeper, it looks like the current implementation will accept any of the following case-insensitive python values as True: [True, "true", "t", 1, "1", "yes"]
and any of the following as False: [False, "false", "f", 0, "0", "no"]
I think this is bad and the only values we should translate into boolean are python booleans, [True, False]
If someone wants a set of strings or integers to be coerced into booleans when they're loading a parsons Table to a database, they should do that type conversion explicitly themself. It seems sketchy to have strings like "t" or "yEs" getting coerced into booleans by default.
These changes are made with 1cc6bd1e44d855cc9853f954666ffad189db9ff4
What do you think?
Ok so digging deeper, it looks like the current implementation will accept any of the following case-insensitive python values as True:
[True, "true", "t", 1, "1", "yes"]
and any of the following as False:[False, "false", "f", 0, "0", "no"]
I think this is bad and the only values we should translate into boolean are python booleans,
[True, False]
If someone wants a set of strings or integers to be coerced into booleans when they're loading a parsons Table to a database, they should do that type conversion explicitly themself. It seems sketchy to have strings like "t" or "yEs" getting coerced into booleans by default.
These changes are made with 1cc6bd1
What do you think?
I agree philosophically, we shouldn't be coercing data types implicitly so this is a valuable change. Going to take one more look at the diff but overall looks good
Commit 766cfaedc5652af8c39fde890067a99b6fa58518 created a feature for parsing boolean types but turned it off by default. This commit turns that feature on by default and adds a comment about how to turn it off and what that does.
When this feature is turned on, parsons will detect boolean column types when copying a parsons Table to a database and create a boolean column in the database. The default behavior without this change is to coerce booleans to be strings when loading to database.
Partially addresses #942