Describe the bug
When using a PandasFilesystemDatasource, the relative base_directory specified in the great_expectations.yml configuration file seems to be resolved relative to the current working directory instead of the context_root_dir passed to gx.get_context.
To Reproduce
/home/flepknor/.conda/envs/gx/bin/python /home/flepknor/repos/blabla/scripts/billo_example.py
Traceback (most recent call last):
  File "/home/flepknor/repos/blabla/scripts/billo_example.py", line 4, in <module>
    datasource.test_connection()
  File "/home/flepknor/.conda/envs/gx/lib/python3.10/site-packages/great_expectations/datasource/fluent/pandas_filesystem_datasource.py", line 55, in test_connection
    raise TestConnectionError(
great_expectations.datasource.fluent.interfaces.TestConnectionError: Path: /home/flepknor/repos/blabla/scripts/metadata does not exist.
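The path in the error message (the scripts directory plus metadata) is what one would get if the relative base_directory were resolved against the current working directory. The following is a minimal sketch of that behaviour using only pathlib; it does not call Great Expectations, and the helper name `resolve_like_cwd` and the temporary `scripts` directory are hypothetical stand-ins, assuming the script was launched from its own directory.

```python
import os
import tempfile
from pathlib import Path


def resolve_like_cwd(base_directory: str) -> Path:
    """Hypothetical helper: make a relative path absolute against the
    current working directory, as the traceback suggests is happening."""
    return Path(base_directory).absolute()


# Stand-in for the directory the script was started from (assumption).
with tempfile.TemporaryDirectory() as tmp:
    scripts_dir = Path(tmp).resolve() / "scripts"
    scripts_dir.mkdir()

    previous_cwd = Path.cwd()
    os.chdir(scripts_dir)
    try:
        resolved = resolve_like_cwd("metadata/")
    finally:
        # Always restore the working directory before the temp dir is removed.
        os.chdir(previous_cwd)

# The result follows the working directory, not any context root.
print(resolved)
```

Under this assumption, the resolved path changes whenever the script is started from a different directory, which matches the symptom described above.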
Yaml config file contents:
# Welcome to Great Expectations! Always know what to expect from your data.
#
# Here you can define datasources, batch kwargs generators, integrations and
# more. This file is intended to be committed to your repo. For help with
# configuration please:
# - Read our docs: https://docs.greatexpectations.io/docs/guides/connecting_to_your_data/connect_to_data_overview/#2-configure-your-datasource
# - Join our slack channel: http://greatexpectations.io/slack
# config_version refers to the syntactic version of this config file, and is used in maintaining backwards compatibility
# It is auto-generated and usually does not need to be changed.
config_version: 3.0
# Datasources tell Great Expectations where your data lives and how to get it.
# Read more at https://docs.greatexpectations.io/docs/guides/connecting_to_your_data/connect_to_data_overview
datasources: {}
# This config file supports variable substitution which enables: 1) keeping
# secrets out of source control & 2) environment-based configuration changes
# such as staging vs prod.
#
# When GX encounters substitution syntax (like `my_key: ${my_value}` or
# `my_key: $my_value`) in the great_expectations.yml file, it will attempt
# to replace the value of `my_key` with the value from an environment
# variable `my_value` or a corresponding key read from this config file,
# which is defined through the `config_variables_file_path`.
# Environment variables take precedence over variables defined here.
#
# Substitution values defined here can be a simple (non-nested) value,
# nested value such as a dictionary, or an environment variable (i.e. ${ENV_VAR})
#
#
# https://docs.greatexpectations.io/docs/guides/setup/configuring_data_contexts/how_to_configure_credentials
config_variables_file_path: uncommitted/config_variables.yml
# The plugins_directory will be added to your python path for custom modules
# used to override and extend Great Expectations.
plugins_directory: plugins/
stores:
# Stores are configurable places to store things like Expectations, Validations
# Data Docs, and more. These are for advanced users only - most users can simply
# leave this section alone.
#
# Three stores are required: expectations, validations, and
# evaluation_parameters, and must exist with a valid store entry. Additional
# stores can be configured for uses such as data_docs, etc.
  expectations_store:
    class_name: ExpectationsStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      base_directory: expectations/
  validations_store:
    class_name: ValidationsStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      base_directory: uncommitted/validations/
  evaluation_parameter_store:
    class_name: EvaluationParameterStore
  checkpoint_store:
    class_name: CheckpointStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      suppress_store_backend_id: true
      base_directory: checkpoints/
  profiler_store:
    class_name: ProfilerStore
    store_backend:
      class_name: TupleFilesystemStoreBackend
      suppress_store_backend_id: true
      base_directory: profilers/
expectations_store_name: expectations_store
validations_store_name: validations_store
evaluation_parameter_store_name: evaluation_parameter_store
checkpoint_store_name: checkpoint_store
data_docs_sites:
  # Data Docs make it simple to visualize data quality in your project. These
  # include Expectations, Validations & Profiles. The are built for all
  # Datasources from JSON artifacts in the local repo including validations &
  # profiles from the uncommitted directory. Read more at https://docs.greatexpectations.io/docs/terms/data_docs
  local_site:
    class_name: SiteBuilder
    show_how_to_buttons: true
    store_backend:
      class_name: TupleFilesystemStoreBackend
      base_directory: ../data_docs/
    site_index_builder:
      class_name: DefaultSiteIndexBuilder
fluent_datasources:
  parquet_metadata:
    type: pandas_filesystem
    assets:
      u_metadata:
        type: parquet
        batching_regex: (?P<year>\d{4})(?P<month>\d{2})(?P<day>\d{2})(?P<hour>\d{2})\.parquet
    base_directory: metadata/
notebooks:
include_rendered_content:
  globally: false
  expectation_suite: false
  expectation_validation_result: false
Expected behavior
I would expect the datasource to point to /tmp/pytest-of-flepknor/pytest-12/test_download_flow_good0/metadata/, i.e. base_directory resolved relative to the context_root_dir rather than the current working directory. This would be in line with
"Using relative paths as the base_directory of a Filesystem Data Source
If you are using a Filesystem Data Context you can provide a path for base_directory that is relative to the folder containing your Data Context."
as stated here
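The expected resolution can be sketched as follows. This is an illustration only, not the Great Expectations implementation: the function name `resolve_against_context_root` is hypothetical, and the context root used in the example is an assumption inferred from the expected path above.

```python
from pathlib import PurePosixPath


def resolve_against_context_root(base_directory: str, context_root_dir: str) -> PurePosixPath:
    """Hypothetical helper: anchor a relative base_directory at the
    Data Context folder; leave absolute paths untouched."""
    base = PurePosixPath(base_directory)
    if base.is_absolute():
        return base
    return PurePosixPath(context_root_dir) / base


# Assumed context root, inferred from the expected path in this report.
expected = resolve_against_context_root(
    "metadata/",
    "/tmp/pytest-of-flepknor/pytest-12/test_download_flow_good0",
)
print(expected)
```

With this rule the result depends only on the configuration and the context root, so it stays the same regardless of the directory the script is launched from.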
Environment (please complete the following information):
Additional context