As a less temporary solution to the genomic file access problem:
On POST/PATCH:
The submitter additionally specifies which access credential domain should control access to the file. If it is ours (gen3.kidsfirst.co.uk), the submitted URL is loaded into our gen3 and access_urls is set to [our gen3 domain + the latest_did]. Otherwise, access_urls is set to the [submitted URL(s)].
Dataservice stores size/hashes/etc directly instead of later relying on responses from another server.
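A rough sketch of the POST/PATCH branching described above. The function name, its parameters, and the exact way the domain and latest_did are joined into a URL are assumptions for illustration only; registering the submitted URL in gen3 is left out.

```python
from typing import List

# Our gen3 domain as written in this issue.
OUR_GEN3_DOMAIN = "gen3.kidsfirst.co.uk"


def resolve_access_urls(
    credential_domain: str, submitted_urls: List[str], latest_did: str
) -> List[str]:
    """Return the access_urls value to store for a genomic file (hypothetical helper)."""
    if credential_domain == OUR_GEN3_DOMAIN:
        # The submitted URL(s) would be loaded into our gen3 here (not shown).
        # Assumption: "domain + latest_did" means https://<domain>/<latest_did>.
        return [f"https://{OUR_GEN3_DOMAIN}/{latest_did}"]
    # Someone else controls access: keep what the submitter gave us.
    return list(submitted_urls)
```

With this shape, a file controlled by another party keeps its submitted URLs untouched, and only files under our credential domain are rewritten to point at our gen3.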
On GET:
Show size/hashes/etc directly from dataservice instead of reflecting values from another server.
access_urls will be the new field that indicates where one should go to access the file.
A raw_urls field (name to be decided) could optionally be added when the file is one of ours, reflecting what our gen3 knows about the file location(s).
Portal ETL will:
Directly copy the access_urls field for file access.
Either parse which credentials to use from each access URL, or read them from an optional field that we could add to indicate just the access credential domain.
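If the ETL goes the parsing route, something like the following might be enough. This is only a sketch: the function name is made up, and it assumes the hostname of the access URL is what identifies the credential domain.

```python
from urllib.parse import urlparse


def credential_domain_from_url(access_url: str) -> str:
    """Return the host part of an access URL (hypothetical ETL helper)."""
    host = urlparse(access_url).hostname
    if host is None:
        # No host present, e.g. a relative path or a malformed URL.
        raise ValueError(f"cannot determine credential domain from {access_url!r}")
    return host
```

An explicit credential domain field would avoid this parsing entirely, at the cost of one more field to keep consistent with access_urls.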
Forked from https://github.com/kids-first/kf-api-dataservice/issues/483
access_urls examples:
["https://data.gmkf.net/gen3/3b82fad9-55da-402f-a446-c86029720ff3"] or ["https://api.gdc.cancer.gov/data/3b82fad9-55da-402f-a446-c86029720ff3"] but not ["s3://kf-study-buckets-lol/3b82fad9-55da-402f-a446-c86029720ff3.bam"]
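One possible way to enforce the distinction in these examples at submission time. This assumes the rule is simply "http(s) URLs are acceptable, raw storage URLs such as s3:// are not"; the issue does not state the rule that precisely, and the names below are hypothetical.

```python
from typing import List
from urllib.parse import urlparse

# Assumption: only these schemes point at an access-controlled service.
ALLOWED_SCHEMES = {"http", "https"}


def invalid_access_urls(access_urls: List[str]) -> List[str]:
    """Return the entries that should not appear in access_urls."""
    return [u for u in access_urls if urlparse(u).scheme not in ALLOWED_SCHEMES]
```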