vmware / versatile-data-kit

One framework to develop, deploy and operate data workflows with Python and SQL.
Apache License 2.0

Documentation improvement: Data job step execution and error handling/fail scenario #1570

Open zverulacis opened 1 year ago

zverulacis commented 1 year ago

What is the feature request? What problem does it solve?

From the problem definition interviews, it is not clear from the README:

- nothing mentioning error handling in the readme (SUPER important)
- how jobs fail scenario - dedicated page where it's readable
- I wasn't able to find anything at all on error handling
- The part about how jobs fail if a step fails and the functionality of troubleshooting is not easy to find
- Why is the job prefixed with 10? Info is hard to find
- job naming convention isn't clear from the readme

Suggested solution

Explain in the README how the steps work, why the job is prefixed with 10, how the fail scenario works, and how errors are handled.

Additional context

This can be documented in the Wiki and linked from the README, or covered by a short sentence in the README, or handled in any other way.

antoniivanov commented 1 year ago

I think one idea is to provide a lifecycle diagram similar to https://github.com/vmware/versatile-data-kit/wiki/User-Guide#data-jobs-development-workflow and explain what happens at each stage, what the possible error scenarios are, and how they are handled. It should also explain how errors are routed (platform vs. end user).
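
To make the requested explanation concrete, here is a minimal sketch of the behaviour the thread asks to have documented: steps run in order, the job fails as soon as one step fails, and the failure is attributed to either the platform or the end user. This is plain Python and not VDK's implementation. The `run_job` function, the `PlatformError` class, the `JobResult` fields, the step file names, and the rule that a numeric prefix such as `10_` only controls sort order are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


class PlatformError(Exception):
    """Hypothetical marker for failures caused by the infrastructure."""


@dataclass
class JobResult:
    executed: List[str] = field(default_factory=list)  # steps that completed
    failed_step: Optional[str] = None                   # first step that raised
    blamed: Optional[str] = None                        # "platform" or "user"


def run_job(steps: Dict[str, Callable[[], None]]) -> JobResult:
    """Run steps in lexicographic name order and stop at the first failure."""
    result = JobResult()
    for name in sorted(steps):  # assumption: prefixes like 10_, 20_ set the order
        try:
            steps[name]()
        except PlatformError:
            result.failed_step, result.blamed = name, "platform"
            break
        except Exception:
            # Assumption: anything not flagged as a platform problem
            # is attributed to the job's own code or data.
            result.failed_step, result.blamed = name, "user"
            break
        result.executed.append(name)
    return result


def ok() -> None:
    pass


def bad() -> None:
    raise ValueError("bad input row")


# Insertion order does not matter; the name prefix decides the order.
result = run_job({"20_transform.py": bad, "10_extract.sql": ok, "30_load.py": ok})
print(result.executed, result.failed_step, result.blamed)
```

In this sketch `30_load.py` never runs because `20_transform.py` raised first. Because the ordering is a plain string sort, a name like `100_x` would sort before `20_x`, which is one reason a naming convention would need to be spelled out in the docs.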

sabadzhiev commented 1 year ago

Triaged. We will keep it, as it is a feature that we would like to address.