Uses src/sv-pipeline/scripts/downstream_analysis_and_filtering/determine_svcount_outliers.R for plotting and outlier determination, which only considers SV types with a median of at least 100 SVs per sample
Takes per-contig VCFs as input
Only performs outlier determination based on autosomes
Can rerun with new inputs and settings to separately perform SV counting, outlier determination at different thresholds, and filtering without redoing previous steps
Includes bcftools preprocessing step to restrict SVs considered during outlier determination
Filters sample list
Can provide a list of additional (e.g., withdrawn) samples to exclude at the same time as outlier removal
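For reviewers, the selection logic described above can be sketched roughly as follows. This is only an illustration, not the code in determine_svcount_outliers.R or the workflow itself: the function names, the `chr1`..`chr22` contig naming, the input record layout, and in particular the IQR-based cutoff are all assumptions made for the example. Only the autosome restriction, the median threshold of 100 SVs per sample, and the combined exclusion of outliers and additional samples come from the description.

```python
from statistics import median, quantiles

# Assumption: autosomes are named chr1..chr22 in the input VCFs.
AUTOSOMES = {f"chr{i}" for i in range(1, 23)}
# From the PR description: SV types need a median of at least 100 SVs per sample.
MIN_MEDIAN_PER_SAMPLE = 100


def autosomal_counts(records):
    """Sum per-sample SV counts by SV type, skipping non-autosomal contigs.

    records: iterable of (sample, contig, svtype, count) tuples (hypothetical layout).
    """
    counts = {}
    for sample, contig, svtype, n in records:
        if contig not in AUTOSOMES:
            continue
        by_sample = counts.setdefault(svtype, {})
        by_sample[sample] = by_sample.get(sample, 0) + n
    return counts


def eligible_svtypes(counts, samples):
    """Keep only SV types whose median count per sample meets the threshold."""
    eligible = {}
    for svtype, by_sample in counts.items():
        values = [by_sample.get(s, 0) for s in samples]
        if median(values) >= MIN_MEDIAN_PER_SAMPLE:
            eligible[svtype] = dict(zip(samples, values))
    return eligible


def find_outliers(eligible, k):
    """Flag samples outside [Q1 - k*IQR, Q3 + k*IQR] for any eligible SV type.

    Assumption: the IQR rule is a stand-in; the actual script may use a different cutoff.
    """
    outliers = set()
    for by_sample in eligible.values():
        q1, _, q3 = quantiles(list(by_sample.values()), n=4)
        low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
        outliers |= {s for s, n in by_sample.items() if n < low or n > high}
    return outliers


def filter_samples(samples, outliers, additional_exclusions=()):
    """Drop outliers and any additional (e.g., withdrawn) samples in one pass."""
    drop = set(outliers) | set(additional_exclusions)
    return [s for s in samples if s not in drop]
```

Keeping counting, outlier determination, and filtering as separate functions mirrors the point above that each stage can be rerun with new settings without redoing the earlier ones.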
Testing
Tested on the 1KGP reference panel with different settings and inputs.
Marking as draft while development for Phase 2 is ongoing. Designed for Phase 2 usage, so it may need changes to be more generally applicable.
Updates
New workflow to remove outlier samples.