duckdb / duckdb-odbc

ODBC Driver for DuckDB
https://duckdb.org/docs/api/odbc/overview
10 stars 4 forks source link

Windows ODBC data source always uses :memory: no matter what you set it to in Control Panel - ODBC Data Sources #29

Open shwivel opened 3 months ago

shwivel commented 3 months ago

On a Windows system after installing the ODBC driver a data source name of "DuckDB" is created with the default database set to :memory: but of course we may wish to specify a specific database file.

When you change :memory: to some file (ie. C:\some_file.db ) and you try to use that data source, you'll get "table not found" errors (if running a query) or no tables will list (if listing tables) because it's actually using the transient :memory: database even though you've specified a persistent one, as depicted below:

image

It shows up in regedit fine:

image

But it doesn't actually use the given database. For example if you open Excel or Power BI and click Data > Get Data > From other sources > From ODBC > DuckDB > no tables will be listed because none are there (it's using :memory: and not the one specified in Control Panel > ODBC data sources). You can manually set the database file location and get the tables to list, within every Excel or whatever file using the DSN, however this is not only annoying to have to do everywhere and every time, but certain features become limited when you set a manual connection string. In any case, I figure this cannot be intended, because why would you allow the specification of a database (with a default of :memory:) if no matter what you change it to, it is still going to use :memoy:?

Let me know if you have any questions. Thanks

renbud commented 3 months ago

I have this issue as well on Windows 11 with the 64 bit drivers and power BI/Excel.

Also noting that a couple of the tests are failing when I run test_odbc.exe that comes with the drivers.

D:\a\duckdb\duckdb\tools\odbc\test\tests\connect.cpp(132): FAILED: {Unknown expression after the reported line} due to unexpected exception with message: Could not find storage_version.db file.

Test SQLColAttribute for a query that returns an int

=============================================================================== test cases: 28 | 27 passed | 1 failed assertions: 45118 | 45117 passed | 1 failed

malcook commented 2 months ago

I also find this issue prevents from using window's version DuckDB ODBC Driver with a connection to database other than :memory:.

Symptoms include empty list of tables displayed using Excel > Data > Get Data > From Other Sources > From ODBC > choose my configured DuckDB connection > Navigator

image

FWIW I note this issue is surfacing elsewhere

Might this help ??? ... I too found some of the tests failing:

 .\test_odbc.exe
[6/31] (19%): Test SQLConnect and SQLDriverConnect
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test_odbc.exe is a Catch v2.13.7 host application.
Run with -? for options

-------------------------------------------------------------------------------
Test SQLConnect and SQLDriverConnect
-------------------------------------------------------------------------------
D:\a\duckdb-odbc\duckdb-odbc\test\tests\connect.cpp(132)
...............................................................................

D:\a\duckdb-odbc\duckdb-odbc\test\tests\connect.cpp(132): FAILED:
  {Unknown expression after the reported line}
due to unexpected exception with message:
  Could not find storage_version.db file.

[29/31] (93%): Test SQLColAttribute for a query that returns an interval        SQLColAttribute: Success with info
[30/31] (96%): Test SQLColAttribute for a query that returns a uuid             SQLColAttribute: Success with info
[31/31] (100%): Test SQLColAttribute for a query that returns a uuid
===============================================================================
test cases:    31 |    30 passed | 1 failed
assertions: 45345 | 45344 passed | 1 failed

and

\test_connection_odbc.exe
[0/1] (0%): Test SQLConnect with Ini File                                       SQLExecDirect (FROM string_values): Error: Error

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test_connection_odbc.exe is a Catch v2.13.7 host application.
Run with -? for options

-------------------------------------------------------------------------------
Test SQLConnect with Ini File
-------------------------------------------------------------------------------
D:\a\duckdb-odbc\duckdb-odbc\test\tests\connect_with_ini.cpp(9)
...............................................................................

D:\a\duckdb-odbc\duckdb-odbc\test\common.cpp(27): FAILED:
  REQUIRE( (((ret)&(~1))==0) )
with expansion:
  false

[1/1] (100%): Test SQLConnect with Ini File
===============================================================================
test cases: 1 | 1 failed
assertions: 6 | 5 passed | 1 failed

success here:

.\SystemDataODBC_tests.exe
[1/1] (100%): System.Data.ODBC
===============================================================================
All tests passed (26 assertions in 1 test case)

I also installed bleeding edge from https://duckdb.org/docs/installation/?version=main&environment=odbc&platform=win&download_method=direct&architecture=x86_64 and found this issue persists.

shwivel commented 2 months ago

It is clear they just messed up the setting of the database file location in ODBC data sources (regardless of what you set it to, it just uses :memory: no matter what) If they didn't want to enable users to set a database file location, there would be no reason for this field. But the field is there and it just doesn't work.

Until the team fixes this, the solution is to specify the database file in the connection string of power query, like: image

The downside is if you ever change where that file is, you'd have to update this string in every file, rather than the configuration of the DSN in ODBC data sources, which would be just one single change to one thing in one place.

Undoubtedly this is not intended. Hopefully they will fix it soon.

malcook commented 2 months ago

Follow up with a hint and possible step toward a workaround...

Excel > Data > Get Data > From Other Sources > From ODBC > choose my configured DuckDB connection allows providing "Advanced Options" which can include a an option "SQL Statement".

I put the following in as the SQL statement:

attach 'C:\path\to\my\results.duckdb';
select * from results.mytable

and I got a step further as mytable is displayed in a grid...

...however when I click on "Load" I am advised:

image

shwivel commented 2 months ago

If you edit your query and click "Advanced Editor" I believe the following would give you what you expect:

let
   Source = Odbc.Query("driver={DuckDB Driver};database=C:\path\to\my\results.duckdb;access_mode=read_only;dsn=DuckDB", "select * from mytable;")
in
    Source

It should not be necessary to specify the location of the file, of course. But right now that's what I do, under the circumstances.

malcook commented 2 months ago

@shwivel thanks for that tip - i saw it before too in https://github.com/duckdb/duckdb/issues/11380#issuecomment-2256478963 - I just don't know how to apply it to my use of Excel, as opposed to your use of Power Query. Guidance much appreciated!

shwivel commented 2 months ago

You are in the correct place. On the very first screen (depicted below) you would separate the connection string and query as depicted below:

image

Later, it is helpful for formatting purposes to use the advanced editor. (ribbon menu > Data > Queryies & Connections [this opens a side panel on the right with all your queries] then right click your query (from that side panel), click edit, and in the window that pops up, use the "Advanced Editor" button on the top left of ribbon menu. Then you'll see what I had pasted above and can more easily tweak the connection string and query. (still easier to to copy and paste from a syntax editor, but at least you get the line breaks and such) This looks like hell compared to the equivalent functionality on Google Sheets, but it is what it is and is mostly not terrible. Not as good as it could be, for sure.

malcook commented 2 months ago

Thanks @shwivel - I'm so close!

Using your suggestion, I'm now getting this error message:

image

... whether or not I qualify the table name with main..

I'm pretty confident I am connecting to my database since if I mistype the database path the error message instead refers to :memory:.

I am also confident my select statement is correct - it works when run connecting using CLI (linux side), with or without the main. prefix.

Do you possibly see any other variation I should be trying?

shwivel commented 2 months ago

If you go to the advanced editor (where the first word should be "let") you see a connection string and your query in the format depicted here, correct? https://github.com/duckdb/duckdb-odbc/issues/29#issuecomment-2353768577

Are you attaching databases (within SQL) or solely using the database you've specified in the connection string?

What happens if you just run a query like:

select 1 as x; Do you get an error (or the single row/column result)?

To confirm for certain you are connecting to the correct .db file you could also run this query:

select * from duckdb_settings() where name = 'temp_directory';

Should return a value of something like "c:\your_path\your_name.your_extension.tmp" (the db file you're connecting to, with .tmp added at the end). If you don't see that, and you see merely ".tmp" that means you've connected to :memory:

malcook commented 2 months ago

Hi!

Yes, it begins with "let".

re:

To confirm for certain you are connecting to the correct .db file you could also run this query:

I'm not sure I appreciate how this test provides any diagnositic as to whether I've connected to my database. Here I demo I'm connected to my db, but the temp_directory is still just '.tmp'.

image

The above was using CLI unix side.

I am now unsure whether I was ever attaching to my database using the query connection string in Excel.

I am quite sure I can issue an explicit attach statement and query my database with success: image

but if I comment out the attach the query fails, despite the connection string setting the database to the same path: image

So, it is pretty clear to me given the above that the connect string is not being respected. I did try quoting the path to database to no effect.

Thanks for all your suggestions.

shwivel commented 2 months ago

By default the temp file will be in the same folder as the database file (that you're connected to), and the temp file will be named the same, but with .tmp at the end. If you've connected to :memory: then the result will just be .tmp. You mention having running that test query via CLI on a linux machine, but that's not where you need to test because it is not where you're having trouble connecting. I would test out running that query in Excel (rather than the query you're actually running) to make sure that you're actually connecting to the database you want to connect to, within Excel.

Attached is an Excel file named test_excel.xlsx with a query that accesses a database named testdb.db at path C:\data\testdb.db which has one table called poo. An image below depicts the setup.

image

Excel file: test_excel.xlsx

Database file: https://drive.google.com/file/d/1MVAAxcs5-9Yz12j99QfxJfUN1kRx30DR/view?usp=sharing

If you put the database file onto your Windows machine at path C:\data\ and this Excel file will not allow you to refresh the query result, then something is going on environment-wise, because I can do so, with these two files.

malcook commented 2 months ago

Yes, thanks you so much for helping my sleuth this out. Something is indeed going on environment-wise. I'll try on another PC later that may help me resolve my issue.

I am getting the same issue trying to connect within Excel, despite everything being "in the right place", viz: image

FWIW: I connect to your testdb.db just fine using DBeaver.

malcook commented 2 months ago

FWIW: my workaround for now is to mirror the duckdb in sqlite for purposes of slapping an excel data dashboard in front of it. 😄

malcook commented 2 months ago

Alas: the issue persists for me on two other windows computers.

re: environment-wise possibilities - I retested after installing latest supported Microsoft Visual C++ Redistributable (and restarting). No fix 😢

Still hoping this issue just gets fixed upstream...