Closed rickymagner closed 12 months ago
Just made some changes to try to simplify the code and also require indices. I tried making index files optional, but this led to the discovery of a Cromwell bug for one particular implementation (issue opened in their repo), so I abandoned that for now.
There has been some ongoing discussion about a possible Terra bug where GATK cannot stream from a requester-pays bucket. This matters for this workflow because it is used to fingerprint samples against our NIST requester-pays mirror. I'll follow up once the Terra support ticket resolves, which may or may not require changes to this code. For now, this workflow should work fine streaming any files from normal (non-requester-pays) GCP buckets.
Good catch. Just updated the dockstore yml now
This PR adds a utility for handling various applications of fingerprinting. In particular, it should have functionality to perform the following:
It should be simple to drop into WDLs which require matched files for the same samples (like benchmarking query vs truth data), and use the resulting `matched_pairs` output of the task to ensure you only act on fingerprint-matched pairs of files. See the README edits for some more details.