Open psyking841 opened 10 months ago
I got Scenario 2 working now, i.e., with
repository = FileSystemMetricsRepository(spark, "s3://bucket/run=0/metrics.json")
metrics are appended to a single file named metrics.json. But I am still not sure why it was not working for me before.
Describe the bug
Scenario 1, where
repository = FileSystemMetricsRepository(spark, "s3://bucket/run=0/")
generates a file named "run=0". See the red box in Snapshot 1 below.
Scenario 2, where
repository = FileSystemMetricsRepository(spark, "s3://bucket/run=0/metrics.json")
now correctly generates a folder named run=0, shown in the green box in Snapshot 1. But as Snapshot 2 shows, it does not create the metrics.json file. Instead, it generated 3 files with UUIDs as file names, each corresponding to a tag.
Is this the expected behavior?
Expected behavior
In Scenario 1, I would expect PyDeequ to write 3 files under the run=0/ folder. In Scenario 2, I would expect PyDeequ to write one file under the run=0/ folder.
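The two scenarios differ only in whether the path ends in a file name. As far as I understand, the second argument of FileSystemMetricsRepository is treated as the location of a single JSON file, which would fit the "run=0" file seen in Scenario 1. Below is a minimal sketch of a workaround under that assumption. The helper name metrics_file_path and the default file name are hypothetical and not part of PyDeequ; the helper only normalizes the string before it is passed to the repository.

```python
def metrics_file_path(path: str, filename: str = "metrics.json") -> str:
    """Return a path that explicitly names a JSON file.

    Hypothetical helper, not part of PyDeequ. If `path` already ends in
    ".json" it is returned unchanged; otherwise it is treated as a folder
    prefix and `filename` is appended to it.
    """
    if path.endswith(".json"):
        return path
    # Drop any trailing slashes so the result has exactly one separator.
    return path.rstrip("/") + "/" + filename


# Scenario 1 input (folder-like prefix) becomes an explicit file path.
scenario_1 = metrics_file_path("s3://bucket/run=0/")
# Scenario 2 input (already a file path) is left as it is.
scenario_2 = metrics_file_path("s3://bucket/run=0/metrics.json")

print(scenario_1)
print(scenario_2)

# Assumed usage, which needs a running SparkSession named `spark`:
# repository = FileSystemMetricsRepository(spark, scenario_1)
```

With this, both scenarios would hand the same explicit file path to the repository, so the outcome no longer depends on whether the caller remembered the file name.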
Screenshots
Snapshot 1
Snapshot 2