ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

VCF output for different "--filter N" option #1490

Open shin0727 opened 2 months ago

shin0727 commented 2 months ago

Hi, I'm a Ph.D. student in bioinformatics, and I have a question about "minigraph-cactus". I ran it to build a pangenome graph and produce VCF files from it. The graph contains 24 sequences, and I used the following command: "cactus-pangenome ./jobstorepath ./sequenceFile.tsv --outDir ${PREFIX1} --outName ${PREFIX1} --reference ${REF} --filter 2 --giraffe clip filter --vcf --viz --odgi --chrom-vg clip filter --chrom-og --gbz clip filter full --gfa clip full --vcf --giraffe --gfa --gbz --chrom-vg --logFile ${PREFIX1}.log"

I ran minigraph-cactus twice, once with "--filter 2" and once with "--filter 9", and the variants in the resulting "vcf.gz" files differ. I thought the "--filter" option only applied when generating the dist, min, and gbz outputs. Does it affect the VCF results as well?

glennhickey commented 2 months ago

yeah, --filter has no effect on the vcf output.

but there are some non-deterministic parts inside cactus, which means the output can be slightly different between different runs even if the parameters are the same.
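To see how large the run-to-run difference actually is, one option is to compare the two VCFs site by site. The following is only a rough sketch using standard Unix tools: the function names `vcf_sites` and `compare_vcfs` are made up for illustration, and matching records on CHROM, POS, REF and ALT is an assumption about what counts as "the same variant" (it ignores genotypes and differences in how alleles are represented).

```shell
#!/usr/bin/env bash
# Use byte-wise collation so that sort and comm agree on ordering.
export LC_ALL=C

# Print one "CHROM POS REF ALT" key per record of a (b)gzipped VCF.
vcf_sites() {
    gzip -dc "$1" | grep -v '^#' | cut -f1,2,4,5 | sort -u
}

# Count sites unique to each file and sites shared by both.
compare_vcfs() {
    local a b
    a=$(mktemp)
    b=$(mktemp)
    vcf_sites "$1" > "$a"
    vcf_sites "$2" > "$b"
    echo "only_in_first=$(comm -23 "$a" "$b" | wc -l | tr -d ' ')"
    echo "only_in_second=$(comm -13 "$a" "$b" | wc -l | tr -d ' ')"
    echo "shared=$(comm -12 "$a" "$b" | wc -l | tr -d ' ')"
    rm -f "$a" "$b"
}

# Example call (hypothetical file names):
#   compare_vcfs run_filter2.vcf.gz run_filter9.vcf.gz
```

If nearly all sites are shared and only a small fraction is unique to each run, that would be consistent with non-determinism rather than an effect of --filter.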

if you want to play with the --filter values without affecting the rest of the graph, you can run cactus-graphmap-join (using the vg files in chrom-alignments as input) instead.
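As a rough illustration of that suggestion, the sketch below only prints one cactus-graphmap-join command per --filter value so it can be reviewed before running. The job store and output names are hypothetical, `build_join_cmd` is a made-up helper, and passing the per-chromosome graphs with `--vg` is an assumption about the command-line interface; the remaining options are copied from the cactus-pangenome command above. Please check `cactus-graphmap-join --help` for your version before using it.

```shell
#!/usr/bin/env bash
# Hypothetical defaults: replace with the values from your own run.
PREFIX1="${PREFIX1:-mypangenome}"
REF="${REF:-REFSAMPLE}"

# Print (do not run) a cactus-graphmap-join command for one --filter value.
# Each value gets its own job store and output directory so runs do not collide.
build_join_cmd() {
    local filter="$1"
    echo "cactus-graphmap-join ./jobstore-f${filter}" \
         "--vg ${PREFIX1}/chrom-alignments/*.vg" \
         "--outDir ${PREFIX1}-f${filter} --outName ${PREFIX1}-f${filter}" \
         "--reference ${REF} --filter ${filter}" \
         "--giraffe clip filter --gbz clip filter full --vcf"
}

# Show the commands for the two filter values discussed in this thread.
for f in 2 9; do
    build_join_cmd "$f"
done
```

Because every join starts from the same chrom-alignments files, the underlying alignment is shared and only the --filter step differs between the outputs.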