ethyca / fides

The Privacy Engineering & Compliance Framework
https://ethyca.com/docs
Apache License 2.0

Microsoft SQL Server database connection support for `generate` and `annotate dataset` #252

Closed iamkelllly closed 2 years ago

iamkelllly commented 3 years ago

To ease onboarding to Fides, Fidesctl should support widely adopted databases for generating dataset annotations. This will let adopters create dataset annotations quickly and directly from the database.

This issue will include (non exclusive!):

ThomasLaPiana commented 2 years ago

side note: Fidesops is also having a discussion on how to build scalable test infra for various external connectors, so we'll be collaborating with them on the design of the test infra

PSalant726 commented 2 years ago

Currently, the only MS SQL driver that is both fully supported by sqlalchemy and actively maintained is pyodbc (see the descriptions for all three options listed here for details). Unfortunately, installing pyodbc fails on M1 Macs unless you first install unixodbc via Homebrew and set some environment variables to point the installer at the new M1 Homebrew default installation directory (see https://github.com/mkleehammer/pyodbc/issues/846).

Given this instability and the cryptic workaround, I think we should hold off on MS SQL support until pyodbc is updated. There is one promising PR opened atm: https://github.com/mkleehammer/pyodbc/pull/870
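For reference, a minimal sketch of what a connection string for the `mssql+pyodbc` dialect might look like once the driver installs cleanly. The helper name `mssql_pyodbc_url`, the credentials, the database name, and the ODBC driver name are all hypothetical placeholders, not anything taken from Fidesctl. The sketch only builds the URL string with the standard library, so it runs without pyodbc or unixodbc installed.

```python
from urllib.parse import quote, quote_plus


def mssql_pyodbc_url(user, password, host, port, database,
                     odbc_driver="ODBC Driver 17 for SQL Server"):
    """Build a SQLAlchemy-style URL for the mssql+pyodbc dialect.

    Hypothetical helper for illustration; the default driver name is an
    assumption and must match an ODBC driver installed on the machine.
    """
    # Percent-encode the credentials so characters such as "@" or "/"
    # cannot be mistaken for URL delimiters.
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    # The ODBC driver name is passed as a query parameter; spaces become "+".
    safe_driver = quote_plus(odbc_driver)
    return (
        f"mssql+pyodbc://{safe_user}:{safe_password}"
        f"@{host}:{port}/{database}?driver={safe_driver}"
    )


# Placeholder values only; nothing here connects to a real server.
url = mssql_pyodbc_url("sa", "p@ss/word", "localhost", 1433, "fides_test")
print(url)
```

The resulting string could then be handed to `sqlalchemy.create_engine`, which is the step that actually needs pyodbc and unixodbc to be present.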

ThomasLaPiana commented 2 years ago

@seanpreston y'all have sqlserver running over on fidesops? Did you hit this issue?

iamkelllly commented 2 years ago

@pattisdr did the research on this for ethyca/fidesops#78 and ended up with the same conclusion basically:

Ran into issues with MSSQL not yet being supported on M1 architecture, but there are suggestions to use an Azure SQL Edge image instead, although the differences between these may matter.

There are also some questions about which SQLAlchemy dialect/DBAPI to use. Ran into early M1 issues with both PyODBC and pymssql (issues building wheels), but pymssql may not be worth pursuing because the sqlalchemy docs say it is unmaintained.