amphi-ai / amphi-etl

Visual Data Transformation with Python Code Generation. Low-Code Python-based ETL.
https://amphi.ai
Other
924 stars 44 forks source link

More generic connections available for database (jdbc, odbc, oledb, sql alchemy, adbc...) #14

Open simonaubertbd opened 6 months ago

simonaubertbd commented 6 months ago

Hello,

As of today, connections to data are limited for databases. I think you would impact waaaaay more users if you leverage standards connectors (jdbc, odbc, oledb, sql alchemy, adbc...) and only after make some dedicated connectors to major databases.

Best regards,

Simon

tgourdel commented 6 months ago

Thank you Simon, that's very helpful feedback! I'll take a look

tgourdel commented 6 months ago

I'll add odbc soon, adbc is still experimental or beta but is very interesting for performance purposes, thanks for the pointers.

simonaubertbd commented 6 months ago

Hello @tgourdel Thanks. For ADBC, yes, it's very young and promising. However, from a strategic perspective, it could be a competitive advantage that can help amphi recognition as probably the only etl tool with it.

Best regards,

Simon

simonaubertbd commented 2 months ago

Hello @tgourdel I think you can leverage pyodbc for generic odbc connections. (https://pypi.org/project/pyodbc/ ) Some code I have used to connect python to monetdb `#import library import pyodbc

test there is DSN (not mandatory)

pyodbc.dataSources()

connect to a DSN even it the db is officially not supported. In Amphi, we should have the choice between DSN and connection string

cnxn = pyodbc.connect('DSN=MONETDB_SAU;SCHEMA=demo', UID='monetdb', PWD='**')

write the query

num_order = cnxn.execute("select toto,count(*) from formation.test_alteryx group by toto") records = num_order.fetchall() for r in records: print (r)

and for generic adbc, maybe https://pypi.org/project/adbc-driver-manager/#description but this definetely requires more work to understand the concepts.

tgourdel commented 2 months ago

Thanks for testing Simon, I've released a first iteration of an ODBC Input. So from what I understood, everything can go into the connection string. https://github.com/mkleehammer/pyodbc/wiki/Connecting-to-databases So if I'm not mistaken, when using DSN, you can add it in the connection string. The components probably needs some improvement later on.

simonaubertbd commented 2 months ago

@tgourdel Thanks. Basically, it works ! image

The things you can improve here is to make it easier, like examples of how to use it with a DSN or a real connection string. And maybe a way to construct the connection string with the DSN, by retrieving it from the DSN list.

But yes, the feature is here and for me, it's one of the biggest change ever for Amphi !

Best regards,

Simon