sodadata / prefect-soda-core

Apache License 2.0
16 stars 4 forks source link

soda_scan_execute improvements: Accepting Environment Variables, Dict-based scan results #17

Closed fakhavan closed 1 year ago

fakhavan commented 1 year ago

Updating the soda_scan_execute task with options to pass environmental variables to the shell task running the Soda command and to save and return the dictionary-based scan results.

Summary

As discussed at https://github.com/sodadata/prefect-soda-core/issues/16, this PR :

Relevant Issue(s)

https://github.com/sodadata/prefect-soda-core/issues/16

Checklist

AlessandroLollo commented 1 year ago

Hey @fakhavan 👋 Would it make sense to return the content of the scan result file as a JSON/Dict for the soda_scan_execute task?

@vijaykiran would love to know your opinion too 🙂

AlessandroLollo commented 1 year ago

@fakhavan with regard to the Static analysis failure, you should be able to fix it by changing isort version to 5.12.0 in .pre-commit-config.yaml

fakhavan commented 1 year ago

Hey @AlessandroLollo I pushed some more changes including the option to retrieve the dictionary-based scan results. I have updated the PR description if you want some visibility on what has changed. Most of these changes are the result of making the task work in my flow.

Drilling down on the exception issue, when the task is used in mapped mode, one task (scan's) failure would cause the whole flow (including other scans) to fail. I tried return_state=True as it's the main way of handling failures but seems like this option does not work in the mapped mode. I also could not find a way to handle this with the underlying shell task. So I resorted to using manual exception catching. However, I might have missed something so I'm looking forward to your suggestions.

Cheers

AlessandroLollo commented 1 year ago

Hey @AlessandroLollo I pushed some more changes including the option to retrieve the dictionary-based scan results. I have updated the PR description if you want some visibility on what has changed. Most of these changes are the result of making the task work in my flow.

Drilling down on the exception issue, when the task is used in mapped mode, one task (scan's) failure would cause the whole flow (including other scans) to fail. I tried return_state=True as it's the main way of handling failures but seems like this option does not work in the mapped mode. I also could not find a way to handle this with the underlying shell task. So I resorted to using manual exception catching. However, I might have missed something so I'm looking forward to your suggestions.

Cheers

Hey @fakhavan 👋 With regards to the issues with mapped mode, I suggest you to ask in the Prefect community!

AlessandroLollo commented 1 year ago

Hey @fakhavan 👋 In order to fix the static analyses failure, you have to run isort and flake8. They should be triggered automatically using git hooks and pre-commit though...