apache / iceberg-python

Apache PyIceberg
https://py.iceberg.apache.org/
Apache License 2.0

CLI list not working #1122

Open TiansuYu opened 2 months ago

TiansuYu commented 2 months ago

Apache Iceberg version

0.7.1 (latest release)

Please describe the bug 🐞

I am currently trying out the pyiceberg CLI and found that this command does not seem to work as expected:

pyiceberg list --uri localhost:8181

returns

URI missing, please provide using --uri, the config or environment variable PYICEBERG_CATALOG__DEFAULT__URI

The catalog instance is created from the official quickstart (I can confirm it is working: I can use it from spark-sql, and by connecting to the minio instance I can confirm that data and metadata files are created correctly). If I understand correctly, it exposes the catalog on port 8181.

I have tried a dozen other ports and URLs (such as s3://warehouse), and none of them work either. The problem seems to be that the argument is simply not propagated into the config in the end.

kevinjqliu commented 2 months ago

Looks like the catalog inference code requires the URI to include http in front

https://github.com/apache/iceberg-python/blob/dc6d2429aafbffc626cba53aaac3f6198fc37eb3/pyiceberg/catalog/__init__.py#L198
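For readers who do not follow the link, scheme-based inference of the kind described above can be sketched roughly as follows. This is a minimal illustration, not the pyiceberg implementation: the function name infer_catalog_type_sketch and the prefix table are assumptions made for this example.

```python
def infer_catalog_type_sketch(uri: str) -> str:
    """Guess a catalog type from the start of the URI (illustrative only)."""
    # Hypothetical prefix table for this sketch; the actual rules live in
    # pyiceberg/catalog/__init__.py and may differ.
    prefixes = {"http": "rest", "thrift": "hive"}
    for prefix, catalog_type in prefixes.items():
        if uri.startswith(prefix):
            return catalog_type
    # A bare host:port such as "localhost:8181" matches no prefix.
    raise ValueError(f"Could not infer the catalog type from the uri: {uri}")
```

Under these assumptions, "http://localhost:8181" is recognized while "localhost:8181" is rejected, which is why the scheme matters.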

TiansuYu commented 2 months ago

Still the same error:

pyiceberg list --uri http://localhost:8181
URI missing, please provide using --uri, the config or environment variable PYICEBERG_CATALOG__DEFAULT__URI

Even if that's the case, the error message is a bit cryptic.

kevinjqliu commented 2 months ago

can you try

pyiceberg --uri http://localhost:8181 list

from https://py.iceberg.apache.org/cli/

➜ pyiceberg --help
Usage: pyiceberg [OPTIONS] COMMAND [ARGS]...
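The usage line puts [OPTIONS] before COMMAND, so top-level options such as --uri belong to the main command, not to the subcommand. The sketch below reproduces that scoping with argparse from the standard library, chosen only to keep the example dependency-free. It is not the pyiceberg CLI, and the program name demo and the helper build_parser are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # --uri is declared on the top-level parser, not on the "list" subcommand.
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--uri")
    subcommands = parser.add_subparsers(dest="command")
    subcommands.add_parser("list")
    return parser


parser = build_parser()

# Option before the subcommand: the top-level parser sees and stores --uri.
ok, ok_extra = parser.parse_known_args(["--uri", "http://localhost:8181", "list"])

# Option after the subcommand: it is handed to "list", which does not know it,
# so the top-level uri stays unset and the tokens come back as leftovers.
bad, bad_extra = parser.parse_known_args(["list", "--uri", "http://localhost:8181"])

print(ok.uri, ok_extra)
print(bad.uri, bad_extra)
```

In the second call the top-level URI is never set, which is consistent with a "URI missing" style of error when the option is placed after the subcommand.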

kevinjqliu commented 2 months ago

Even if that's the case, the error message is a bit cryptic.

agreed, the CLI needs some improvements

TiansuYu commented 2 months ago

Thanks, putting --uri in front of the command worked! Another thing:

-> % pyiceberg list --help
URI missing, please provide using --uri, the config or environment variable PYICEBERG_CATALOG__DEFAULT__URI

I would expect this not to attempt any connection calls, but simply to print the help message.

kevinjqliu commented 2 months ago

I would expect this not to attempt any connection calls, but simply to print the help message.

yea same, perhaps it's some setting with the click lib

https://github.com/apache/iceberg-python/blob/9857107561d2267813b7ce150b01b4e6ac4b3e34/pyiceberg/cli/console.py#L107-L111

TiansuYu commented 2 months ago

I would suggest that the main command load the catalog lazily, deferring the load_catalog call until it is actually needed.
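A minimal sketch of that idea, assuming a small context object handed to each subcommand. The names CliContext, fake_load_catalog, show_help and list_namespaces are hypothetical stand-ins for illustration and are not part of pyiceberg. The point is only that the URI check and the load run on first access to the catalog rather than when the main command starts.

```python
from functools import cached_property
from typing import Callable, List, Optional


class CliContext:
    """Holds CLI options and loads the catalog only on first access."""

    def __init__(self, uri: Optional[str], loader: Callable[[str], object]) -> None:
        self.uri = uri
        self._loader = loader

    @cached_property
    def catalog(self) -> object:
        # The URI is validated here, not in the main command, so commands
        # that never touch the catalog (such as --help) cannot fail on it.
        if not self.uri:
            raise ValueError("URI missing, please provide using --uri")
        return self._loader(self.uri)


calls: List[str] = []


def fake_load_catalog(uri: str) -> dict:
    # Stand-in for a real catalog loader; records each invocation.
    calls.append(uri)
    return {"uri": uri}


def show_help(ctx: CliContext) -> str:
    # Never reads ctx.catalog, so no load is attempted.
    return "Usage: demo list [OPTIONS]"


def list_namespaces(ctx: CliContext) -> object:
    # First access triggers the load; later accesses reuse the cached value.
    return ctx.catalog
```

With this shape, a help request works even without a URI, and a command that needs the catalog pays for the load exactly once.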