Adds in an extra stage, `UpdateStructuralVariantIDs`.
This runs a concordance test between the output of the previous callset's `SpiceUpSVIDs` stage (where IDs were manually created based on descriptive variant attributes) and the current callset's `FilterGenotypes` output. This gives us a chance to annotate the current VCF contents with the previous callset's IDs.

The `SpiceUpSVIDs` stage has been edited: instead of always generating a new ID, the script now accepts an ID from the `TRUTH_VID` VCF annotation (Truth Variant ID, i.e. the ID of this same variant in the 'truth'/previous callset). If `TRUTH_VID` is empty for a variant, a new ID is created using the existing logic.

NB: At this point these Spicy IDs have been manually set in a previous run for Seqr cohorts. VCGS data has not been run through that stage yet, but a run is currently in progress. To cater for this use case, there's now a behaviour switch depending on whether a Spicy-ID VCF is available from a prior run.
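For reviewers, the `TRUTH_VID` fallback described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `choose_variant_id` and the `make_new_id` callback are hypothetical names, the INFO field is modelled as a plain dict, and treating `.` as "empty" is an assumption based on the usual VCF missing-value convention.

```python
from typing import Callable, Optional


def choose_variant_id(
    info: dict[str, Optional[str]],
    make_new_id: Callable[[], str],
) -> str:
    """Pick the ID for one variant (hypothetical helper, for illustration only).

    info:        the variant's INFO annotations, modelled here as a plain dict.
    make_new_id: stand-in for the existing ID-generation logic.
    """
    # A missing key, None, an empty string, or '.' are all treated as "no truth ID".
    truth_vid = (info.get('TRUTH_VID') or '').strip()
    if truth_vid and truth_vid != '.':
        # Reuse the ID this variant had in the previous ('truth') callset.
        return truth_vid
    # No match in the previous callset: fall back to generating a new ID.
    return make_new_id()


if __name__ == '__main__':
    # Placeholder generator; the real stage builds IDs from variant attributes.
    new_id = lambda: 'NEWLY_GENERATED_ID'
    print(choose_variant_id({'TRUTH_VID': 'PREVIOUS_CALLSET_ID'}, new_id))
    print(choose_variant_id({}, new_id))
```

When no Spicy-ID VCF exists from a prior run (the VCGS case), no variant would carry `TRUTH_VID`, so every variant takes the fallback branch and gets a newly generated ID.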
There are also a couple of line-count reduction/linting changes.