great-expectations / great_expectations_action

A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows.
MIT License

python script to build ephemeral data docs site without side effects #4

Closed Aylr closed 4 years ago

Aylr commented 4 years ago

@hamelsmu This creates a new data docs site, builds only that site (to prevent cloud side effects) and prints the directory. This should be much cleaner than parsing stdout.

There are a few notes in here about details and future parameterization should a user want side effects (data docs to be built on their cloud stores).
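To make the idea concrete, here is a rough sketch of the "add a throwaway site and print its directory" step. It is not the script from this PR. It works on a plain dict that stands in for the parsed project config, and the helper name `add_ephemeral_site` and the default site name `action_site` are made up for illustration. The site layout (`SiteBuilder`, `TupleFilesystemStoreBackend`, `DefaultSiteIndexBuilder`) follows the usual `data_docs_sites` shape as I understand it.

```python
import copy
import os
import tempfile


def add_ephemeral_site(project_config, site_name="action_site", base_directory=None):
    """Return (new_config, directory) with one extra filesystem-backed docs site.

    `project_config` is a plain dict standing in for the parsed project config;
    the input dict is left untouched. `site_name` is a hypothetical name.
    """
    if base_directory is None:
        # A fresh temp directory keeps the build away from any configured store.
        base_directory = tempfile.mkdtemp(prefix="ge_docs_")
    else:
        os.makedirs(base_directory, exist_ok=True)

    new_config = copy.deepcopy(project_config)
    sites = new_config.get("data_docs_sites") or {}
    sites[site_name] = {
        "class_name": "SiteBuilder",
        "store_backend": {
            "class_name": "TupleFilesystemStoreBackend",
            "base_directory": base_directory,
        },
        "site_index_builder": {"class_name": "DefaultSiteIndexBuilder"},
    }
    new_config["data_docs_sites"] = sites
    return new_config, base_directory


if __name__ == "__main__":
    # Example config with a cloud-backed site that we do not want to touch.
    config = {
        "data_docs_sites": {
            "s3_site": {
                "class_name": "SiteBuilder",
                "store_backend": {
                    "class_name": "TupleS3StoreBackend",
                    "bucket": "example-bucket",  # placeholder value
                },
            }
        }
    }
    _, directory = add_ephemeral_site(config)
    # Printing the directory gives the caller something cleaner than parsed stdout.
    print(directory)
```

The real script would then ask the data context to build only the new site by name, which is what keeps the cloud-backed sites from being written to.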

Feel free to merge at will.

Aylr commented 4 years ago

Also, I noticed that this Python script seemed somewhat slower than running the CLI checkpoints, which automatically create these docs. Is that because this code is running something in addition to what the CLI does? (Just want to learn.)

It should not be slower since it's really doing less than a checkpoint. I might investigate...

Also, a noob question on my part: why build a fresh docs site instead of relying on the location of the local docs site specified in great_expectations.yml? Is it because a local docs site may not be specified there? Or is it because you may want to save each version of the docs under its SHA, so you could push to GitHub Pages and serve multiple versions of the site? Anyway, I just wanted to learn what you had in mind, so I can try to do something cool with it :)

Good question! A few reasons: 1. A user may not have a local site configured. 2. Since GE will upload docs to cloud stores if they are configured, my thinking was that we may want to prevent those side effects until we understand how this Action gets used. Hence the temporary store.
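As a rough illustration of the second reason, the sketch below splits the configured docs sites into ones that write to the local filesystem and ones that write elsewhere, so the latter can be left out of a build. It again works on a plain dict rather than a real data context. The helper name `split_sites_by_backend` is hypothetical, and treating `TupleFilesystemStoreBackend` as the only local backend is an assumption made for the example.

```python
# Assumption for this sketch: only this backend class writes to local disk.
LOCAL_BACKENDS = {"TupleFilesystemStoreBackend"}


def split_sites_by_backend(project_config):
    """Return (local_sites, remote_sites) as sorted lists of site names."""
    local, remote = [], []
    sites = project_config.get("data_docs_sites") or {}
    for name, site in sites.items():
        backend = (site.get("store_backend") or {}).get("class_name")
        if backend in LOCAL_BACKENDS:
            local.append(name)
        else:
            # Unknown or missing backends are treated as remote, to be safe.
            remote.append(name)
    return sorted(local), sorted(remote)


if __name__ == "__main__":
    config = {
        "data_docs_sites": {
            "local_site": {
                "store_backend": {"class_name": "TupleFilesystemStoreBackend"}
            },
            "s3_site": {"store_backend": {"class_name": "TupleS3StoreBackend"}},
        }
    }
    local_sites, remote_sites = split_sites_by_backend(config)
    print("safe to build:", local_sites)
    print("would have side effects:", remote_sites)
```

Defaulting unknown backends to "remote" errs on the side of skipping a site rather than uploading by accident.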