What
dpypelines/pipeline/dataset_ingress_v1.py:
Env vars UPLOAD_SERVICE_URL and UPLOAD_SERVICE_S3_BUCKET are read to set the upload_url and s3_bucket variables.
Florence access token set via the get_florence_access_token() function in dpypelines/pipeline/shared/utils.py.
UploadClient created from upload_url using create_upload_client() in dpypelines/pipeline/shared/utils.py.
CSV file uploaded to the Upload Service.
Supplementary distributions uploaded to the Upload Service (only if the file extension is ".xml"), found using the get_supplementary_distribution_file() function from dpypelines/pipeline/shared/utils.py.
NotImplementedError raised if a supplementary distribution's file extension is not ".xml" (see the sketch after this list).
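For reference, the flow looks roughly like this (a minimal sketch, not the exact code: the upload_client method name, the get_supplementary_distribution_file() call shape, and the placeholder paths are assumptions):

```python
import os
from pathlib import Path

from dpypelines.pipeline.shared.utils import (
    create_upload_client,
    get_florence_access_token,
    get_supplementary_distribution_file,
)

csv_path = "outputs/data.csv"  # produced earlier in the pipeline (placeholder)
input_directory = "inputs/"    # directory the pipeline was given (placeholder)

# Env vars configure the Upload Service endpoint and target bucket.
upload_url = os.environ["UPLOAD_SERVICE_URL"]
s3_bucket = os.environ["UPLOAD_SERVICE_S3_BUCKET"]

florence_access_token = get_florence_access_token()
upload_client = create_upload_client(upload_url)

# Upload the CSV output (method name is illustrative).
upload_client.upload(csv_path, s3_bucket, florence_access_token)

# Upload the supplementary distribution; only .xml is supported so far.
supp_dist_path = get_supplementary_distribution_file(input_directory)  # call shape assumed
extension = Path(supp_dist_path).suffix
if extension == ".xml":
    upload_client.upload(supp_dist_path, s3_bucket, florence_access_token)
else:
    raise NotImplementedError(f"Uploading {extension} files is not supported.")
```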
features/dataset_ingress_v1.feature:
Steps added so that the backend Flask app captures the outgoing requests from the UploadClient.
features/steps/dataset_ingress.py:
Added a new valid_no_supp_dist entry to the CONFIGURATION dictionary. The test fails if more than one HTTP request is made, so we aren't currently testing that the supplementary distributions are also uploaded (I checked on sandbox, and that is definitely working, though).
features/docker/fake_backend/app.py:
Explicit GET and POST methods added to @app.route("/<path:path>") (as uploading is a POST request).
JSON content of the request captured in a this-requests-json logging statement (silent=True added to stop it failing when there's no JSON content to get); see the sketch below.
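The route now looks something like this (a minimal sketch, not the exact code in the fake backend):

```python
from flask import Flask, request

app = Flask(__name__)

# Catch-all route; POST is needed because uploads are POST requests.
@app.route("/<path:path>", methods=["GET", "POST"])
def catch_all(path):
    # silent=True makes get_json() return None instead of raising
    # when the body isn't JSON (e.g. a raw CSV upload).
    app.logger.info("this-requests-json: %s", request.get_json(silent=True))
    return "ok"
```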
features/environment.py:
Env vars for UPLOAD_SERVICE_URL, UPLOAD_SERVICE_S3_BUCKET and FLORENCE_TOKEN are set to acceptable values for acceptance tests in before_all(), and reverted to original values in after_all().
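Roughly as follows (the values shown are placeholders, not the ones the tests actually use):

```python
import os

# Placeholder values; the real ones must be acceptable to the acceptance tests.
TEST_ENV_VARS = {
    "UPLOAD_SERVICE_URL": "http://localhost:5000",
    "UPLOAD_SERVICE_S3_BUCKET": "test-bucket",
    "FLORENCE_TOKEN": "not-a-real-token",
}

def before_all(context):
    # Remember the original values so they can be restored afterwards.
    context.original_env_vars = {k: os.environ.get(k) for k in TEST_ENV_VARS}
    os.environ.update(TEST_ENV_VARS)

def after_all(context):
    for key, value in context.original_env_vars.items():
        if value is None:
            os.environ.pop(key, None)
        else:
            os.environ[key] = value
```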
features/steps/requests.py:
_parse_request_body_from_log() function added to get content that can't be parsed as a dictionary (as is the case for the CSV file).
_parse_dict_from_log(): refactored out of _parse_request_headers_as_dict_from_log() so it's reusable for getting both headers and JSON content from logs.
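The shape of the refactor is roughly as below (the log marker strings and exact parsing are assumptions; see the actual file for details):

```python
import json

def _parse_dict_from_log(log: str, marker: str) -> dict:
    # Shared helper: grab the text that follows `marker` in the captured
    # log output and parse it as JSON into a dictionary.
    payload = log.split(marker, 1)[1].strip()
    return json.loads(payload)

def _parse_request_headers_as_dict_from_log(log: str) -> dict:
    # Marker string is illustrative.
    return _parse_dict_from_log(log, "request-headers:")

def _parse_request_body_from_log(log: str) -> str:
    # Body content (e.g. a raw CSV file) can't be parsed as a dictionary,
    # so return it untouched.
    return log.split("request-body:", 1)[1].strip()
```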
How to review
Set env vars for UPLOAD_SERVICE_URL, UPLOAD_SERVICE_S3_BUCKET and FLORENCE_TOKEN, then run dataset_ingress_v1() with an appropriate input directory of files and pipeline config.
Check that an appropriate try...except structure is used, and that the logging statements capture everything needed.
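Something along these lines (illustrative values; check the actual dataset_ingress_v1() signature and pipeline config shape first):

```python
import os
from dpypelines.pipeline.dataset_ingress_v1 import dataset_ingress_v1

os.environ["UPLOAD_SERVICE_URL"] = "https://upload.example.com"
os.environ["UPLOAD_SERVICE_S3_BUCKET"] = "my-test-bucket"
os.environ["FLORENCE_TOKEN"] = "<your token>"

pipeline_config = {}  # supply an appropriate pipeline config here
dataset_ingress_v1("path/to/input/files", pipeline_config)
```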
Who can review
Anyone.