pudo / dataset

Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.
https://dataset.readthedocs.org/
MIT License
4.76k stars 297 forks source link

find 0 OR None #340

Closed vladiscripts closed 3 years ago

vladiscripts commented 4 years ago

Would like to be able to query OR for a NULL value. In SQLite, boolean False is stored as "0". The ensure parameter creates columns with a default value of NULL. I would like an analogue of the query find(col = [0, None]) or find (col = {'in': (0, None)}). But this query only returns col = 0. Or have some analog of the OR operator, like find(col = 0 | col = None). Or a sequential mechanism like: find(col=0).find(col=None) separated by dot.

Therefore, I have to write a lot of unnecessary code, first query for rows with the value "0", then with the NULL value, then combine their results. A problem arises when I need to check several such columns, like "column1 = 0 OR NULL, column2 = 0 OR NULL" etc.

medecau commented 3 years ago

NULL does not mean boolean False - NULL means that no value was set

if the default for this column is False then it is best to set it as False

for row in t.find(col=None):
    row['col'] = False
    t.upsert(row, ['id'])

that is what you should do

if you don't want to change the values in the database you can chain iterators together

from itertools import chain
falsy = chain(t.find(col=False), t.find(col=None))

please note

  1. find works by passing in argument values to be matched in the query - Python arguments are passed by assignment - it doesn't make sense to do logic operations on assignment operations

  2. chaining calls to the find method is confusing

    • do chained calls to find compound using and or or operators?
    • do they operate at the same level of the first call or on subqueries?
  3. compounding values in a list would require medeling with how dataset interacts with SQLAlchemy and making that interaction more complex

don't do this - opt for the first suggestion

dataset exposes SQLAlchemy objects - you could take advantage of this to create the query you want

t.find(t.table.c.col.is_(False) | t.table.c.col.is_(None))
pudo commented 3 years ago

You can probably also do something a la:

t.find(('col', 'in', (None, False, 0)))

No warranty, might backfire :/

vladiscripts commented 3 years ago

It looks like it works like this, where "another_field" can be one of the additional kwargs filters:

col = t.table.c.col_name
result = list(t.find(col.is_(None) | col.in_([0, False]), another_field_clause=5))