Acceptance Criteria
[ ] With valid inputs, the steps in this task run and the next function is called.
[ ] With invalid inputs suitable notifications are sent.
What is this
We need to create the umbrella function that runs pipelines when given the URL of a tar file in an S3 bucket.
This is not developing new functionality; it is bringing together already written and unit-tested components to provide the behaviour we require.
This is the function.
What to do
Start with the assumption that you have an S3 URL, i.e
This task is to:
Decompress the tar file from S3. Use a try/except and notify data engineering if you can't. There should be a function to do this.
Using the LocalDirectoryStore class (again, try/except and notify data engineering if you can't), confirm we have a config. The LocalDirectoryStore should have a method to support this.
Get the config as a dict using LocalDirectoryStore, and notify DE if you can't. The LocalDirectoryStore should have a method to support this.
Get the schema for the config, and notify DE if you can't. There should be a function to support this.
Validate the config, and notify data engineering if it's not valid. Use the JSON validator from dpytools.
Trigger the function here with the path to the directory of files you have unpacked.
You can use our initial sketch as a reference point. Just be aware we're only implementing a very small part of it here.
Every single action should be wrapped in a try/except that notifies data engineering in the event of an issue.
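As a rough illustration, the steps above could be wired together something like this. It is only a sketch, not the actual implementation: every name in it (`run_pipeline_from_s3`, `attempt`, `StepFailed`, `ConfigStore`, `has_config`, `get_config_as_dict`) is hypothetical, and the real decompress function, LocalDirectoryStore methods, schema lookup, dpytools validator and notifier are passed in as arguments rather than guessed at. It also assumes a failed step should stop the run once data engineering has been notified, which this task doesn't spell out.

```python
from pathlib import Path
from typing import Any, Callable, Protocol


class ConfigStore(Protocol):
    """Hypothetical stand-in for the parts of LocalDirectoryStore used here."""

    def has_config(self) -> bool: ...
    def get_config_as_dict(self) -> dict: ...


class StepFailed(Exception):
    """Raised after data engineering has been notified about a failed step."""


def attempt(description: str, notify: Callable[[str], None], action: Callable[[], Any]) -> Any:
    """Run one action; on any error, notify data engineering and stop."""
    try:
        return action()
    except Exception as err:
        notify(f"Could not {description}: {err}")
        raise StepFailed(description) from err


def run_pipeline_from_s3(
    s3_url: str,
    workdir: Path,
    *,
    decompress: Callable[[str, Path], None],      # assumed: unpacks the tar into workdir
    make_store: Callable[[Path], ConfigStore],    # assumed: wraps LocalDirectoryStore
    get_schema: Callable[[dict], Any],            # assumed: returns the schema for a config
    validate: Callable[[dict, Any], None],        # assumed: raises if the config is invalid
    next_step: Callable[[Path], None],            # the function triggered at the end
    notify: Callable[[str], None],                # assumed: messages data engineering
) -> bool:
    """Return True if every step ran and next_step was called, else False."""
    try:
        attempt("decompress the tar file from S3", notify, lambda: decompress(s3_url, workdir))
        store = attempt("create the directory store", notify, lambda: make_store(workdir))
        if not attempt("check for a config", notify, store.has_config):
            # A missing config is not an exception, so notify explicitly.
            notify(f"No config found in files unpacked from {s3_url}")
            return False
        config = attempt("read the config as a dict", notify, store.get_config_as_dict)
        schema = attempt("get the schema for the config", notify, lambda: get_schema(config))
        attempt("validate the config", notify, lambda: validate(config, schema))
        attempt("trigger the next function", notify, lambda: next_step(workdir))
    except StepFailed:
        return False  # data engineering has already been notified
    return True
```

Passing the collaborators in as arguments keeps the umbrella function easy to check against both acceptance criteria, since a test can swap in stubs that succeed or raise and then inspect what was sent to the notifier.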