Open devcovato opened 2 months ago
I can confirm this behavior as the unit tests for pg_replication do not work. Here is an example:
(dlt-verified-sources-py3.10) drnick@drnick-XPS-15-7590:~/repos/github/neuromantik33/verified-sources/tests $ cat ../.env
SOURCES__PG_REPLICATION__CREDENTIALS__DATABASE=dlt_data
SOURCES__PG_REPLICATION__CREDENTIALS__PASSWORD=loader
SOURCES__PG_REPLICATION__CREDENTIALS__HOST=localhost
SOURCES__PG_REPLICATION__CREDENTIALS__DRIVERNAME=postgresql
SOURCES__PG_REPLICATION__CREDENTIALS__USERNAME=loader
ALL_DESTINATIONS=["duckdb"]
(dlt-verified-sources-py3.10) drnick@drnick-XPS-15-7590:~/repos/github/neuromantik33/verified-sources/tests $ set -a
(dlt-verified-sources-py3.10) drnick@drnick-XPS-15-7590:~/repos/github/neuromantik33/verified-sources/tests $ source ../.env
(dlt-verified-sources-py3.10) drnick@drnick-XPS-15-7590:~/repos/github/neuromantik33/verified-sources/tests $ set +a
(dlt-verified-sources-py3.10) drnick@drnick-XPS-15-7590:~/repos/github/neuromantik33/verified-sources/tests $ env | grep SOURCES
SOURCES__PG_REPLICATION__CREDENTIALS__DATABASE=dlt_data
SOURCES__PG_REPLICATION__CREDENTIALS__PASSWORD=loader
SOURCES__PG_REPLICATION__CREDENTIALS__HOST=localhost
SOURCES__PG_REPLICATION__CREDENTIALS__DRIVERNAME=postgresql
SOURCES__PG_REPLICATION__CREDENTIALS__USERNAME=loader
(dlt-verified-sources-py3.10) drnick@drnick-XPS-15-7590:~/repos/github/neuromantik33/verified-sources/tests $ env | grep ALL
DEB_PYTHON_INSTALL_LAYOUT=deb
ALL_DESTINATIONS=[duckdb]
(dlt-verified-sources-py3.10) drnick@drnick-XPS-15-7590:~/repos/github/neuromantik33/verified-sources/tests $ env | grep DESTINATION__POSTGRES__CREDENTIALS
DESTINATION__POSTGRES__CREDENTIALS=postgresql://loader:loader@localhost:5432/dlt_data
(dlt-verified-sources-py3.10) drnick@drnick-XPS-15-7590:~/repos/github/neuromantik33/verified-sources/tests $ python
Python 3.10.12 (main, Jul 29 2024, 16:56:48) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import dlt
>>> dlt.secrets
<dlt.common.configuration.accessors._SecretsAccessor object at 0x78812b2092a0>
>>> dlt.secrets.get("sources.pg_replication.credentials")
>>> from dlt.common.configuration.specs import ConnectionStringCredentials
>>> dlt.secrets.get("sources.pg_replication.credentials", ConnectionStringCredentials)
>>> from dlt.destinations.impl.postgres.configuration import PostgresCredentials
>>> dlt.secrets.get("sources.pg_replication.credentials", PostgresCredentials)
>>> import os
>>> {k:v for k,v in os.environ.items() if k.startswith("SOURCES")}
{'SOURCES__PG_REPLICATION__CREDENTIALS__DATABASE': 'dlt_data', 'SOURCES__PG_REPLICATION__CREDENTIALS__PASSWORD': 'loader', 'SOURCES__PG_REPLICATION__CREDENTIALS__HOST': 'localhost', 'SOURCES__PG_REPLICATION__CREDENTIALS__DRIVERNAME': 'postgresql', 'SOURCES__PG_REPLICATION__CREDENTIALS__USERNAME': 'loader'}
>>>
@devcovato Are you reading the secrets with dlt.secrets.get
only for debugging purposes? Or are you using this in your production code too? Can you post your secrets toml (without the variables) and the env file so I can have a look?
@neuromantik33 dlt.secrets.get will access a single value in your secrets and not resolve whole groups. If you want your example to work you'd need to set the full connection string as the credentials argument:
import dlt
import os
from dlt.common.configuration.specs import ConnectionStringCredentials
os.environ["SOURCES__PG_REPLICATION__CREDENTIALS"] = "postgresql://loader:loader@localhost:5432/dlt_data"
print(dlt.secrets.get("sources.pg_replication.credentials", ConnectionStringCredentials))
Hi @sh-rp,
@devcovato Are you reading the secrets with
dlt.secrets.get
only for debugging purposes? Or are you using this in your production code too?
I don't read for debugging purposes. You can see at the top that I change the credentials at runtime. The source is a multi-tenant using an isolated database per tenant.
Below is the file secrets.toml
(redacted):
[sources.sql_database.credentials]
drivername = "mysql+pymysql"
database = "mycompany"
username = "user"
password = "password"
host = "host"
port = "3306"
And here is the file .env
:
DAGSTER_ENVIRONMENT=dev
DESTINATION__POSTGRES__CREDENTIALS__CONNECT_TIMEOUT=15
DESTINATION__POSTGRES__CREDENTIALS__DATABASE=analytics_dev
DESTINATION__POSTGRES__CREDENTIALS__HOST=localhost
DESTINATION__POSTGRES__CREDENTIALS__PASSWORD=postgres
DESTINATION__POSTGRES__CREDENTIALS__PORT=5432
DESTINATION__POSTGRES__CREDENTIALS__USERNAME=postgres
LOAD__TRUNCATE_STAGING_DATASET=true
RUNTIME__DLTHUB_TELEMETRY=false
RUNTIME__LOG_LEVEL=WARNING
SOURCES__SQL_DATABASE__CREDENTIALS__DATABASE=mycompany
SOURCES__SQL_DATABASE__CREDENTIALS__DRIVERNAME='mysql+pymysql'
SOURCES__SQL_DATABASE__CREDENTIALS__HOST=host
SOURCES__SQL_DATABASE__CREDENTIALS__PASSWORD='password'
SOURCES__SQL_DATABASE__CREDENTIALS__POST=3306
SOURCES__SQL_DATABASE__CREDENTIALS__USERNAME=user
Can you post your secrets toml (without the variables) and the env file so I can have a look?
Did you mean env vars?
@devcovato did you manage to solve this?
Hi @sh-rp,
@devcovato did you manage to solve this?
The issue is still present. As a workaround I created the file <root_project>/.dlt/secrets.toml
.
@devcovato If you are setting env values you want to have change, you'll have to overwrite the env vars in code. Check out this snippet:
import dlt
import os
if __name__ == "__main__":
os.environ["SOURCES__SQL_DATABASE__CREDENTIALS__DATABASE"] = "test1"
os.environ["SOURCES__SQL_DATABASE__CREDENTIALS__USER"] = "test2"
os.environ["SOURCES__SQL_DATABASE__CREDENTIALS__PASSWORD"] = "test3"
os.environ["SOURCES__SQL_DATABASE__CREDENTIALS__HOST"] = "test4"
os.environ["SOURCES__SQL_DATABASE__CREDENTIALS__PORT"] = "test5"
# will not work
print(dlt.secrets.get("sources.sql_database.credentials"))
# will work
print(dlt.secrets.get("sources.sql_database.credentials.database"))
# this will not change the value, but it probably should
dlt.secrets["sources.sql_database.credentials.database"] = "new_value"
print(dlt.secrets.get("sources.sql_database.credentials.database"))
# overwrite the env var will work
os.environ["SOURCES__SQL_DATABASE__CREDENTIALS__DATABASE"] = "new_value"
print(dlt.secrets.get("sources.sql_database.credentials.database"))
Hi @sh-rp,
I work on a different task now. I will try in the next few days and let you know.
dlt version
0.5.4
Describe the problem
I'm using
dlt
withdagster
. I created asql_database
source targeting postgres, and I'm using the file.env
to load the configuration. For one of two assets (dlt pipeline), I have to change the source credentials database at runtime. I read the secrets ascredentials = dlt.secrets.get("sources.sql_database.credentials")
and thencredentials["database"] = f"db{context.partition_key}"
. Doing so, I got the exception messageTypeError: 'NoneType' object does not support item assignment
. I created the file.dlt/secrets.toml
in my root project and everything works. From the community, I got the suggestiondlt.secrets["sources.sql_database.credentials.database"] = f"db{context.partition_key}"
and didn't work either. It looks likedlt.secrets
is read-only.Expected behavior
I should use only
.env
to set up the secrets and be able to update the configuration at runtime.Steps to reproduce
Below are steps to reproduce the issue:
<root_project>/<dagster_project>/dlt/sources
<root_project>/<dagster_project>/dlt/sources
and executedlt init sql_database postgres
<root_project>/<dagster_project>/dlt/sources/.dlt
<root_project>/.env
and add the needed configuration for source and destinationOperating system
macOS
Runtime environment
Docker, Docker Compose
Python version
3.11
dlt data source
sql_database
dlt destination
Postgres
Other deployment details
No response
Additional information
I'm using Python 3.12