mindsdb / dfsql

SQL interface to Pandas
GNU General Public License v3.0
51 stars 9 forks source link

syntax documentation #18

Closed bobcolner closed 3 years ago

bobcolner commented 3 years ago

Looks cool, but I have no idea what SQL functions the tool supports..

torrmal commented 3 years ago

thank you @bobcolner,

Would you like to help us with the documentation? @btseytlin can show you the ropes of whats supported from SQL.

btseytlin commented 3 years ago

@bobcolner Thanks! Here's the currently supported list of functions:

OPERATIONS = (
    And, Or, Not,

    Equals, NotEquals, Greater, GreaterEqual, Less, LessEqual, Is, IsNot,

    Plus, Minus, Multiply, Divide, Modulo, Power,

    StringConcat, StringLower, StringUpper, Like,

    In,
)
AGGREGATE_FUNCTIONS = (
    Sum, Mean, Count, CountDistinct,
)

You can also make your own functions, similar to how you can do pd.DataFrame().apply(my_function) or pd.DataFrame().agg({column: my_function}). E.g.:

def custom(x):
    return x + '_custom_addition'

data_source.register_function('custom', custom)
sql = "SELECT custom(\"a\")"
query_result = data_source.query(sql)
assert query_result == 'a_custom_addition'

Or alternatively, you can pass custom functions to sql_query:

from dfsql import sql_query

sql_query('SELECT magic_function(column1) FROM df', df=df, custom_functions={'magic_function': lambda x: x*x})

See these tests for examples: https://github.com/mindsdb/dfsql/blob/stable/tests/test_data_sources/test_file_data_source.py#L342 https://github.com/mindsdb/dfsql/blob/stable/tests/test_data_sources/test_file_data_source.py#L712

bobcolner commented 3 years ago

thanks!