broadinstitute / gatk-sv

A structural variation pipeline for short-read sequencing
BSD 3-Clause "New" or "Revised" License

Can't find the vcf_stats file (MergeVcfWideStats.merged_bed_file), how to generate it? #606

Open jingydz opened 9 months ago

jingydz commented 9 months ago

Bug Report

$ Rscript plot_sv_vcf_distribs.R \
    -N $( cat 4388.nyuwa.after.change.lst | sort | uniq | wc -l ) \
    -S SV_colors.txt \
    nvwaCHBCHS.4388.genotype.vcffilter.vcf.gz.gz.stats \
    plotQC_vcfwide_output/
Error in read.table(INFILE, comment.char = "", sep = "\t", header = T, :
  more columns than column names
Execution halted

Affected module(s) or script(s)

plot_sv_vcf_distribs.R

Description

I want to plot the length distribution of SVs, and I found the relevant code:

Plot VCF-wide distributions

/opt/sv-pipeline/scripts/vcf_qc/plot_sv_vcf_distribs.R \
  -N $( cat ~{samples_list} | sort | uniq | wc -l ) \
  -S /opt/sv-pipeline/scripts/vcf_qc/SV_colors.txt \
  ~{vcf_stats} \
  plotQC_vcfwide_output/

Prep outputs

tar -czvf ~{prefix}.plotQC_vcfwide_output.tar.gz \
  plotQC_vcfwide_output

Question

But I'm not sure what file 'vcf_stats' should be. The source code shows 'vcf_stats=MergeVcfWideStats.merged_bed_file'. How is this file generated?

epiercehoffman commented 7 months ago

The file is generated by a previous task, MergeVcfWideStats. We recommend running the entire workflow, MainVcfQc, so that all required files are generated automatically. If you are trying to run the scripts manually, you would need to run all the previous tasks from the workflow as well.
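
For anyone debugging the "more columns than column names" error from the report above: the message means that read.table found data lines with more tab-separated fields than the header line names. One possible cause is passing a file that is not the table produced by MergeVcfWideStats. The following is a minimal sketch, not part of gatk-sv, that compares the header width of a tab-delimited file against the width of its data lines. The function names `open_maybe_gzip` and `column_count_report` are hypothetical, and the script assumes only that the first line is the header.

```python
import gzip
import sys
from collections import Counter


def open_maybe_gzip(path):
    """Open a text file, transparently handling gzip compression."""
    with open(path, "rb") as fh:
        magic = fh.read(2)
    # gzip files start with the two magic bytes 0x1f 0x8b
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "rt")


def column_count_report(path, sep="\t"):
    """Return (header_fields, Counter mapping field count -> number of data lines)."""
    with open_maybe_gzip(path) as fh:
        header = fh.readline().rstrip("\n").split(sep)
        counts = Counter(
            len(line.rstrip("\n").split(sep))
            for line in fh
            if line.strip()  # skip blank lines
        )
    return header, counts


if __name__ == "__main__" and len(sys.argv) > 1:
    header, counts = column_count_report(sys.argv[1])
    print(f"header has {len(header)} fields")
    for n_fields, n_lines in sorted(counts.items()):
        print(f"{n_lines} data line(s) with {n_fields} fields")
```

Run it with the stats file as the only argument. If the data lines are wider than the header, the input is probably not in the layout the plotting script expects, and regenerating it through the MainVcfQc workflow, as recommended above, is the safer route.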