ibis-project / ibis

the portable Python dataframe library
https://ibis-project.org
Apache License 2.0
5.29k stars 595 forks source link

docs: connection in "DuckDB with Ibis" shows no tables #6652

Closed mvashishtha closed 1 year ago

mvashishtha commented 1 year ago

I can't finish the tutorial here because duckdb doesn't list any tables.

Steps to reproduce the error

  1. install ibis and start ipython

    conda create --name=ibis python=3.8
    conda activate ibis
    pip install 'ibis-framework[duckdb]'
    pip install ipython
    # without this I got "module 'cloudpickle' has no attribute 'dumps'"
    pip install cloudpickle
    ipython
  2. run commands from tutorial

    import ibis
    ibis.options.interactive = True # Use eager evaluation. Use only for demo purposes!
    
    connection = ibis.duckdb.connect(':memory:') # Use an In Memory DuckDB
    # connection = ibis.duckdb.connect('/path/to/my_db.db') # Use or create a physical DuckDB at this path
    
    print(connection.list_tables())
  3. get an empty list of tables

System information

pip freeze ``` appnope==0.1.3 asttokens==2.2.1 atpublic==3.1.2 backcall==0.2.0 bidict==0.22.1 certifi==2023.5.7 charset-normalizer==3.2.0 cloudpickle==2.2.1 decorator==5.1.1 duckdb==0.8.1 duckdb-engine==0.9.1 executing==1.2.0 greenlet==2.0.2 ibis-framework==5.1.0 idna==3.4 importlib-resources==5.13.0 ipython==8.12.2 jedi==0.18.2 markdown-it-py==3.0.0 matplotlib-inline==0.1.6 mdurl==0.1.2 multipledispatch==0.6.0 numpy==1.24.4 packaging==23.1 pandas==2.0.3 parso==0.8.3 parsy==2.1 pexpect==4.8.0 pickleshare==0.7.5 platformdirs==3.9.1 pooch==1.7.0 prompt-toolkit==3.0.39 ptyprocess==0.7.0 pure-eval==0.2.2 pyarrow==11.0.0 Pygments==2.15.1 python-dateutil==2.8.2 pytz==2023.3 requests==2.31.0 rich==13.4.2 six==1.16.0 SQLAlchemy==2.0.19 sqlalchemy-views==0.3.2 sqlglot==11.7.1 stack-data==0.6.2 toolz==0.12.0 tqdm==4.65.0 traitlets==5.9.0 typing_extensions==4.7.1 tzdata==2023.3 urllib3==2.0.3 wcwidth==0.2.6 xxhash==3.2.0 zipp==3.16.2 ```
cpcloud commented 1 year ago

Hi @mvashishtha, thanks for the issue!

pip install ibis installs the other ibis library. Is that intentional, or did you mean to write pip install ibis-framework[duckdb] on line 4?

cpcloud commented 1 year ago

I would also recommend using at least Python 3.9 so you can take advantage of the latest version of ibis.

mvashishtha commented 1 year ago

Is that intentional, or did you mean to write pip install ibis-framework[duckdb] on line 4?

No, I installed ibis-framework and ipython. Fixed

mvashishtha commented 1 year ago

I get the same error with Python 3.9.17 and ibis-framework 6.0.0

lostmygithubaccount commented 1 year ago

hi @mvashishtha, I believe this may be a bug in the DuckDB documentation. this is not an error in that you are connecting to an empty in-memory database and listing the tables, which is an empty list. this is the same behavior seen in the duckdb CLI:

(venv) cody@dkdcvd ibis % duckdb
v0.8.1 6536a77232
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D show tables;
┌─────────┐
│  name   │
│ varchar │
├─────────┤
│ 0 rows  │
└─────────┘
D exit

and the DuckDB Python connection:

(venv) cody@dkdcvd ibis % ipy
Python 3.11.4 (main, Jun 30 2023, 16:37:57) [Clang 14.0.3 (clang-1403.0.22.14.1)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.14.0 -- An enhanced Interactive Python. Type '?' for help.

[ins] In [1]: import duckdb

[ins] In [2]: duckdb.sql("show tables;")

[ins] In [3]: type(duckdb.sql("show tables;"))
Out[3]: NoneType

[ins] In [4]: duckdb.sql("select 1 as id")
Out[4]:
┌───────┐
│  id   │
│ int32 │
├───────┤
│     1 │
└───────┘

[ins] In [6]: type(duckdb.sql("select 1 as id"))
Out[6]: duckdb.DuckDBPyRelation

so the behavior you're seeing in Ibis, returning an empty list, is expected. if you want to create a table in the in-memory connection, you can do something like:

(venv) cody@dkdcvd ibis % ipy
Python 3.11.4 (main, Jun 30 2023, 16:37:57) [Clang 14.0.3 (clang-1403.0.22.14.1)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.14.0 -- An enhanced Interactive Python. Type '?' for help.

[ins] In [1]: import ibis

[ins] In [2]: t = ibis.examples.penguins.fetch()

[ins] In [3]: con = ibis.connect("duckdb://")

[ins] In [4]: penguins = con.create_table("penguins", t.to_pyarrow(), overwrite=True)

[ins] In [5]: penguins
Out[5]:
DatabaseTable: penguins
  species           string
  island            string
  bill_length_mm    float64
  bill_depth_mm     float64
  flipper_length_mm int64
  body_mass_g       int64
  sex               string
  year              int64

[ins] In [6]: ibis.options.interactive = True

[ins] In [7]: penguins
Out[7]:
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ int64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │  2007 │
│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │  2007 │
│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │  2007 │
│ Adelie  │ Torgersen │            nan │           nan │              NULL │        NULL │ NULL   │  2007 │
│ Adelie  │ Torgersen │           36.7 │          19.3 │               193 │        3450 │ female │  2007 │
│ Adelie  │ Torgersen │           39.3 │          20.6 │               190 │        3650 │ male   │  2007 │
│ Adelie  │ Torgersen │           38.9 │          17.8 │               181 │        3625 │ female │  2007 │
│ Adelie  │ Torgersen │           39.2 │          19.6 │               195 │        4675 │ male   │  2007 │
│ Adelie  │ Torgersen │           34.1 │          18.1 │               193 │        3475 │ NULL   │  2007 │
│ Adelie  │ Torgersen │           42.0 │          20.2 │               190 │        4250 │ NULL   │  2007 │
│ …       │ …         │              … │             … │                 … │           … │ …      │     … │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

[ins] In [8]: con.list_tables()
Out[8]: ['penguins', '_ibis_pyarrow_memtable_tigepffhbbcbbo45ke2xkaerxm']

[ins] In [9]: penguins = con.table("penguins")

[ins] In [10]: penguins
Out[10]:
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓
┃ species ┃ island    ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex    ┃ year  ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩
│ string  │ string    │ float64        │ float64       │ int64             │ int64       │ string │ int64 │
├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤
│ Adelie  │ Torgersen │           39.1 │          18.7 │               181 │        3750 │ male   │  2007 │
│ Adelie  │ Torgersen │           39.5 │          17.4 │               186 │        3800 │ female │  2007 │
│ Adelie  │ Torgersen │           40.3 │          18.0 │               195 │        3250 │ female │  2007 │
│ Adelie  │ Torgersen │            nan │           nan │              NULL │        NULL │ NULL   │  2007 │
│ Adelie  │ Torgersen │           36.7 │          19.3 │               193 │        3450 │ female │  2007 │
│ Adelie  │ Torgersen │           39.3 │          20.6 │               190 │        3650 │ male   │  2007 │
│ Adelie  │ Torgersen │           38.9 │          17.8 │               181 │        3625 │ female │  2007 │
│ Adelie  │ Torgersen │           39.2 │          19.6 │               195 │        4675 │ male   │  2007 │
│ Adelie  │ Torgersen │           34.1 │          18.1 │               193 │        3475 │ NULL   │  2007 │
│ Adelie  │ Torgersen │           42.0 │          20.2 │               190 │        4250 │ NULL   │  2007 │
│ …       │ …         │              … │             … │                 … │           … │ …      │     … │
└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘

note that if you change your connection to a persistent database file, you can then reconnect to that database later and only the penguins table will exist -- the temporary pyarrow table will be dropped. in normal usage of Ibis, you're probably connecting to a database or reading from files so you'd get a table like that

I'm not sure if @cpcloud or anyone else has context on this DuckDB example but it does seem wrong on recent versions of DuckDB and we should help give a better example in their documentation

mvashishtha commented 1 year ago

@lostmygithubaccount thank you for triaging; I thought this page was in the ibis documentation all along!

gforsyth commented 1 year ago

@lostmygithubaccount there used to be system tables that we would show, but we opted to not present those to users by default. The tutorial page definitely needs updating to include pulling down some example data.

lostmygithubaccount commented 1 year ago

apologies for this taking so long, PR open on the DuckDB docs page: https://github.com/duckdb/duckdb-web/pull/1000

@mvashishtha could you give that a review if you're interested? it's basically a port of our getting started tutorial on the Ibis docs. I'd also be curious to hear more about what you're using Ibis for and if there's anything else we can do to improve! We're working on overhauling our documentation to a better format and improved content.

lostmygithubaccount commented 1 year ago

@mvashishtha closing this out as that PR is merged -- looks like their docs are versioned so I'd expect this to change over in the next DuckDB release

let us know if you have any other questions!