Kanaries / pygwalker

PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis
https://kanaries.net/pygwalker
Apache License 2.0
13.45k stars 703 forks source link

PyGWalker Dataset Connector - Microsoft SQL #629

Closed AnbazhaganEXL closed 2 months ago

AnbazhaganEXL commented 2 months ago

Describe the bug While trying to use the Microsoft SQL server as backend dataset to connect since we have large set of data needs to be used. We are getting the error "ValueError: Unknown dialect 'mssql'. Did you mean mysql?"

 conn = Connector(
    f"mssql+pyodbc://{server}/{database}?driver=ODBC+Driver+17+for+SQL+Server",
    "SELECT * FROM SAM1"
    ) 

To Reproduce Steps to reproduce the behavior:

Add the mssql+pyodbc connection to reproduce the issue

ValueError: Unknown dialect 'mssql'. Did you mean mysql?
Traceback:
File "C:\Users\parasa1\AppData\Roaming\Python\Python310\site-packages\streamlit\runtime\scriptrunner\exec_code.py", line 88, in exec_func_with_error_handling
    result = func()
File "C:\Users\parasa1\AppData\Roaming\Python\Python310\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 590, in code_to_exec
    exec(code, module.__dict__)
File "\\ncfile05\Vantage_DEV_CN\Anbazhagan\POC\PygWalkar\my-app\Stream\anbu.py", line 59, in <module>
    renderer = get_pyg_sql_renderer()
File "\\ncfile05\Vantage_DEV_CN\Anbazhagan\POC\PygWalkar\my-app\Stream\anbu.py", line 56, in get_pyg_sql_renderer
    return StreamlitRenderer(conn,  spec_io_mode="rw", kernel_computation=True)
File "C:\Users\parasa1\AppData\Roaming\Python\Python310\site-packages\pygwalker\api\streamlit.py", line 97, in __init__
    self.walker = PygWalker(
File "C:\Users\parasa1\AppData\Roaming\Python\Python310\site-packages\pygwalker\api\pygwalker.py", line 74, in __init__
    self.data_parser = self._get_data_parser(
File "C:\Users\parasa1\AppData\Roaming\Python\Python310\site-packages\pygwalker\api\pygwalker.py", line 128, in _get_data_parser
    data_parser = get_parser(
File "C:\Users\parasa1\AppData\Roaming\Python\Python310\site-packages\pygwalker\services\data_parsers.py", line 77, in get_parser
    parser = parser_func(
File "C:\Users\parasa1\AppData\Roaming\Python\Python310\site-packages\pygwalker\data_parsers\database_parser.py", line 121, in __init__
    self.example_pandas_df = self._get_example_pandas_df()
File "C:\Users\parasa1\AppData\Roaming\Python\Python310\site-packages\pygwalker\data_parsers\database_parser.py", line 128, in _get_example_pandas_df
    sql = self._format_sql(f"SELECT * FROM {self.placeholder_table_name} LIMIT 1000")
File "C:\Users\parasa1\AppData\Roaming\Python\Python310\site-packages\pygwalker\data_parsers\database_parser.py", line 143, in _format_sql
    this=sqlglot.parse(self.conn.view_sql, read=sqlglot_dialect_name)[0],
File "C:\Users\parasa1\AppData\Roaming\Python\Python310\site-packages\sqlglot\__init__.py", line 99, in parse
    return Dialect.get_or_raise(read or dialect).parse(sql, **opts)
File "C:\Users\parasa1\AppData\Roaming\Python\Python310\site-packages\sqlglot\dialects\dialect.py", line 753, in get_or_raise
    raise ValueError(f"Unknown dialect '{dialect_name}'.{similar}")

Expected behavior since the mssql+pyodbc works with SQLAlchemy , not sure why it's not working.

Screenshots If applicable, add screenshots to help explain your problem.

Versions

Additional context I tried to fix the issue by modifying the code in DatabaseDataParser (data_parsers\database_parser.py) class by adding the mssql mapping.

class DatabaseDataParser(BaseDataParser):
    """data parser for database"""
    sqlglot_dialect_map = {
        "postgresql": "postgres"#,
        "mssql": "tsql"
    }

after that change the above code Unknown dialect 'mssql' issue went away and page also loading with list of columns, but when we drag and drop the any coulum for aggregation we are getting the error "str' object has no attribute 'args'"

{"code": -1, "data": {"queryList": [{"workflow": [{"type": "transform", "transform": [{"key": "gw_count_fid", "expression": {"op": "one", "params": [], "as": "gw_count_fid"}}]}, {"type": "view", "query": [{"op": "aggregate", "groupBy": [], "measures": [{"field": "gw_count_fid", "agg": "sum", "asFieldKey": "gw_count_fid_sum"}]}]}], "limit": 50000}]}, "message": "'str' object has no attribute 'args'"}

longxiaofei commented 2 months ago

Hi, @AnbazhaganEXL.

Thanks for your feedback.

This may be due to dialect conversion failure of tsql, I will try to reproduce it and fix it.

AnbazhaganEXL commented 2 months ago

Hello @longxiaofei ,

Yes your correct, I've handled the same using the sqlglot transpile option for converting the sql. could you please review the below solutions. it works fine after making the change (data_parsers\database_parser.py) .

Function Name : def _format_sql(self, sql: str) -> str:
          try:             
            sql = ast.sql(sqlglot_dialect_name)
        except Exception as e:
            sql = sqlglot.transpile(str(ast), read='duckdb', write=sqlglot_dialect_name)[0]
longxiaofei commented 2 months ago

@AnbazhaganEXL

Maybe we can solve this bug from the root cause, https://github.com/Kanaries/pygwalker/pull/632.

I've fixed it.

Can you submit a PR about tsql in sqlglot_dialect_map?

Thanks in advance for your contribution.

AnbazhaganEXL commented 2 months ago

I've submitted the PR - https://github.com/AnbazhaganEXL/pygwalker/pull/1

longxiaofei commented 2 months ago

Hi @AnbazhaganEXL , You can continue to submit a PR to kanaries/pygwalker, I will review and merge it.

AnbazhaganEXL commented 2 months ago

Sure, will do that.