sodadata / soda-sql

Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html
https://docs.soda.io/
Apache License 2.0
59 stars 16 forks source link

soda analyze fails on mssql-databases with case-sensitive collation #163

Closed geertvanzoest closed 2 years ago

geertvanzoest commented 2 years ago

Problem When running soda analyze against a MSSQL-database with a case-sensitive collation, Soda returns [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid object name 'information_schema.tables'..

To Reproduce Steps to reproduce the behavior:

  1. Have a Microsoft SQL Server instance with target database's collation set to a case-sensitive one (for example SQL_Latin1_General_CP1_CS_AS)
  2. Configure warehouse.yml accordingly
  3. Run soda analyze

Cause This happens as Soda queries SELECT TABLE_NAME FROM information_schema.tables WHERE lower(table_schema)='dbo' during analysis while information_schema-objects in SQL Server are always uppercased (i.e. INFORMATION_SCHEMA.TABLES). In source file .\soda-sql\packages\sqlserver\sodasql\dialects\sqlserver_dialect.py you will find the conflicting SQL-statements at lines 91, 126 and 127.

Suggested solution Uppercase all information_schema-objects at mentioned code lines in sqlserver_dialect.py.

OS: All Python Version: All Soda SQL Version: Soda CLI version 2.1.0b22 Warehouse Type: Microsoft SQL Server (all versions), database collation SQL_Latin1_General_CP1_CS_AS

geertvanzoest commented 2 years ago

Example of how objects in INFORMATION_SCHEMA are cased in MS SQL Server.

image

vijaykiran commented 2 years ago

Changed the query to use capitals when querying information schema.