sodadata / soda-core

Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
https://go.soda.io/core-docs
Apache License 2.0

Invalid configuration header: expected "data_source {data source name}" #2072

andreyolv closed this issue 5 months ago

andreyolv commented 5 months ago

pip install soda-core
pip install soda-core-postgres

configuration.yml

data_source my_postgres:
  type: postgres
  connection:
    host: localhost
    username: postgres
    password: postgres123
  database: postgres
  schema: transactional

Postgres Table: image

checks.yml

checks for d_month:
- schema:
    fail:
      when required column missing:
        - month_id
        - action_month
- missing_count(action_month) = 0:
    name: All products have a key

soda test-connection -c configuration.yml -d my_postgres

[10:26:22] Soda Core 3.3.3
Successfully connected to 'my_postgres'.
Connection 'my_postgres' is valid.

soda scan -c configuration.yml -d my_postgres -c checks.yml -V

[10:26:53] Soda Core 3.3.3
[10:26:53] Reading configuration file "configuration.yml"
[10:26:53] Reading configuration file "checks.yml"
[10:26:53] Invalid configuration header: expected "data_source {data source name}".
  +-> line=None,col=None in checks.yml
[10:26:53] No checks file specified
[10:26:53] Scan execution starts
[10:26:53] Scan summary:
[10:26:53] No valid checks found, 0 checks evaluated.
[10:26:53] 1 errors.
[10:26:53] Oops! 1 error. 0 failures. 0 warnings. 0 pass.
ERRORS:
[10:26:53] Invalid configuration header: expected "data_source {data source name}".
  +-> line=None,col=None in checks.yml

What am I doing wrong? It would be great if the documentation had simple, clear end-to-end examples like this one.

tools-soda commented 5 months ago

SAS-3356

m1n0 commented 5 months ago

Hi, you are passing the checks file in as a configuration file by using the -c parameter. Checks files need to be passed as positional arguments, with no prefix, i.e.:

soda scan -c configuration.yml -d my_postgres checks.yml -V
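If the scan is launched from a script, one way to avoid this mix-up is to assemble the argument list in a single place. The sketch below is only an illustration: `build_scan_command` is a hypothetical helper, not part of Soda Core, and it uses only the flags that appear in this thread (`-c`, `-d`, `-V`). It attaches `-c` to the configuration file alone and appends the checks files as bare positional arguments.

```python
import shlex


def build_scan_command(data_source, config_file, checks_files, verbose=False):
    """Assemble a `soda scan` command line as a list of arguments.

    Hypothetical helper for illustration; it is not part of Soda Core.
    """
    if not checks_files:
        raise ValueError("at least one checks file is required")
    # -c is used once, for the configuration file only.
    command = ["soda", "scan", "-c", config_file, "-d", data_source]
    # Checks files are bare positional arguments, with no -c prefix.
    command.extend(checks_files)
    if verbose:
        command.append("-V")
    return command


if __name__ == "__main__":
    cmd = build_scan_command(
        "my_postgres", "configuration.yml", ["checks.yml"], verbose=True
    )
    # Print a shell-ready string; pass the list to subprocess.run to execute it.
    print(shlex.join(cmd))
```

For the files in this issue, the printed command should match the corrected one above. Keeping the arguments as a list rather than a single string also avoids quoting problems if a file path contains spaces.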

I am not sure exactly what you are missing from the docs. Our online docs have multiple examples, e.g. Take a sip of Soda, and even the open source docs have complete documentation and examples like this one.

Edit: I recommend joining our Slack community, where you can ask questions and get assistance from the whole community. GitHub issues usually take a lot longer to get a response and a resolution.