What
89:
Replaced existing pipeline config sections of s3_tar_received.start() with CONFIGURATION dictionary approach (a rough sketch follows this list).
get_dataset_id() placeholder function added, to be updated once we know where the dataset_id can be extracted from.
get_pipeline_config() function added to get the correct configuration details from CONFIGURATION (with unit tests).
start() now calls the function specified in the CONFIGURATION["secondary_function"] field, rather than calling dataset_ingress_v1() directly.
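For orientation, here is a minimal sketch of how the pieces described above could fit together. It is an assumption-laden illustration rather than the repository code: the dataset id "my-dataset", the shape of the CONFIGURATION entries beyond the secondary_function field, the files_dir placeholder and the stubbed function bodies are all invented for the example.

```python
# Illustrative sketch only: the real CONFIGURATION contents, lookup logic and
# function bodies live in the repository.

def dataset_ingress_v1(files_dir: str, pipeline_config: dict) -> None:
    """Stub standing in for the real dataset_ingress_v1()."""
    print(f"Ingesting {files_dir} using {pipeline_config}")


# Hypothetical shape: one entry per dataset, each naming the function to hand off to.
CONFIGURATION = {
    "my-dataset": {
        "secondary_function": dataset_ingress_v1,
        # ...plus required file patterns, supplementary distributions, etc.
    }
}


def get_dataset_id(s3_object_name: str) -> str:
    """Placeholder: to be updated once we know where the dataset_id comes from."""
    return "my-dataset"


def get_pipeline_config(dataset_id: str) -> dict:
    """Get the configuration details for dataset_id from CONFIGURATION."""
    try:
        return CONFIGURATION[dataset_id]
    except KeyError:
        raise ValueError(f"No pipeline config found for dataset id: {dataset_id}")


def start(s3_object_name: str) -> None:
    """Call whichever function the matched config specifies, not dataset_ingress_v1() directly."""
    dataset_id = get_dataset_id(s3_object_name)
    pipeline_config = get_pipeline_config(dataset_id)
    files_dir = "."  # stand-in for wherever the tar contents get extracted to
    pipeline_config["secondary_function"](files_dir, pipeline_config)


start("my-dataset.tar")
```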
90:
pipeline_config added to dataset_ingress_v1() arguments (see the sketch after this list).
Replaced existing pipeline config sections of dataset_ingress_v1() with CONFIGURATION dictionary approach.
get_required_files_patterns() and get_supplementary_distributions_patterns() updated.
dpypelines/pipeline/shared/config.py and dpypelines/pipeline/shared/details.py deleted as no longer needed, along with associated tests, fixtures and test cases.
Acceptance tests amended - new steps for valid/invalid pipeline config, config.json removed from data-fixtures.zip and feature file tables.
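And a minimal sketch of the dataset_ingress_v1() side, assuming each config entry stores its patterns under "required_files" and "supplementary_distributions" keys; the real field names and pattern format are whatever CONFIGURATION actually uses.

```python
# Minimal sketch, assuming the config entry keeps its patterns under the
# "required_files" and "supplementary_distributions" keys - the actual key
# names and pattern format are defined by CONFIGURATION in the repository.

def get_required_files_patterns(pipeline_config: dict) -> list:
    """Read the required-file patterns from the supplied pipeline_config."""
    return pipeline_config.get("required_files", [])


def get_supplementary_distributions_patterns(pipeline_config: dict) -> list:
    """Read the supplementary-distribution patterns from the supplied pipeline_config."""
    return pipeline_config.get("supplementary_distributions", [])


def dataset_ingress_v1(files_dir: str, pipeline_config: dict) -> None:
    """pipeline_config now arrives as an argument rather than being read from config.json."""
    required = get_required_files_patterns(pipeline_config)
    supplementary = get_supplementary_distributions_patterns(pipeline_config)
    print(f"Required: {required}, supplementary: {supplementary}")


dataset_ingress_v1(".", {"required_files": ["^data.xml$"], "supplementary_distributions": []})
```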
How to review
Sanity check: make sure the changes make sense. I was getting a credentials error when trying to run start() against an actual S3 bucket, so if you can get it running and confirm it works, that would be great.

One question: previously we used the local_store to access the has_lone_file_matching() and get_lone_matching_json_as_dict() methods. start() now only uses its get_current_source_pathlike() method, yet the local store is created again inside dataset_ingress_v1() (see the sketch below). It might be possible to tidy this up a bit - suggestions welcome.
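To make the question concrete, a purely hypothetical sketch of the duplication: LocalStore, its constructor and the files_dir argument to start() are stand-ins (start() is simplified here to take the files directory directly); only the method names come from the actual code.

```python
# Hypothetical stand-in for the local store: the class name and constructor are
# invented, only the method names referenced in the question come from the code.

class LocalStore:
    def __init__(self, files_dir: str):
        self.files_dir = files_dir

    def get_current_source_pathlike(self) -> str:
        return self.files_dir

    # has_lone_file_matching() and get_lone_matching_json_as_dict() would also live here.


def dataset_ingress_v1(files_dir: str, pipeline_config: dict) -> None:
    local_store = LocalStore(files_dir)  # ...and the store is created a second time here
    print(f"Re-created store for {local_store.get_current_source_pathlike()}")


def start(files_dir: str, pipeline_config: dict) -> None:
    local_store = LocalStore(files_dir)  # created once here, used only for the source path...
    dataset_ingress_v1(local_store.get_current_source_pathlike(), pipeline_config)


start("extracted-files", {})
```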
Who can review
Anyone.