ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

Multiple fasta input, single sample in vcf #1419

Open dmjask opened 1 week ago

dmjask commented 1 week ago

Hi, I'm running minigraph-cactus v2.8.1 for a small region of a genome with the following command:

singularity exec --contain --bind $(pwd):/data $SCRIPTS cactus-pangenome \
/data/jobstore $fasta --outDir /data --outName $outname --reference $ref --workDir /data --vcf

where $SCRIPTS is the path to the sif, $fasta is the seqfile.txt and $ref is the reference. Despite providing multiple fastas as input, only one (and always the same one) is passed on to cactus_analyseAssembly, so it ends up being the only sample in the output vcf.

I've attached the log file for the most recent attempt: slurm-14738835.log

Any ideas or suggestions for a fix? Thanks in advance!
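One quick way to confirm the symptom is to compare the genome names in the seqfile with the sample columns declared in the VCF header. The sketch below is only an illustration: the function names are made up here, and it assumes the seqfile holds one whitespace-separated "name path" pair per line and that the VCF follows the standard layout, where sample names start at column 10 of the #CHROM line.

```shell
# Hypothetical helpers, not part of cactus.

# Print the genome names (first column) listed in a seqfile,
# skipping blank lines and comment lines.
seqfile_names() {
    awk 'NF >= 2 && $1 !~ /^#/ { print $1 }' "$1"
}

# Print the sample names declared in a VCF header (plain or gzipped).
vcf_samples() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | awk -F'\t' '/^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }'
}
```

For example, `seqfile_names "$fasta"` followed by `vcf_samples` on the VCF written to the output directory shows side by side which inputs were declared and which made it into the VCF.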

glennhickey commented 1 week ago

I'm not sure what you mean about cactus_analyzeAssembly, but it doesn't look like minigraph is finding any SVs (which could be a sign that nothing is aligning):

[2024-06-21T10:20:42+0200] [MainThread] [I] [toil-rt] [minigraph]: [M::mg_ggsimple_cigar::0.607*1.03] inserted 0 events, including 0 inversions
[2024-06-21T10:20:43+0200] [MainThread] [I] [toil-rt] [minigraph]: [M::mg_ggsimple_cigar::1.076*1.03] inserted 0 events, including 0 inversions
[2024-06-21T10:20:44+0200] [MainThread] [I] [toil-rt] [minigraph]: [M::mg_ggsimple_cigar::1.694*1.03] inserted 0 events, including 0 inversions
[2024-06-21T10:20:44+0200] [MainThread] [I] [toil-rt] [minigraph]: [M::mg_ggsimple_cigar::2.418*1.03] inserted 0 events, including 0 inversions
[2024-06-21T10:20:46+0200] [MainThread] [I] [toil-rt] [minigraph]: [M::mg_ggsimple_cigar::4.500*1.02] inserted 0 events, including 0 inversions
[2024-06-21T10:20:47+0200] [MainThread] [I] [toil-rt] [minigraph]: [M::mg_ggsimple_cigar::5.115*1.02] inserted 0 events, including 0 inversions
[2024-06-21T10:20:48+0200] [MainThread] [I] [toil-rt] [minigraph]: [M::mg_ggsimple_cigar::5.742*1.02] inserted 0 events, including 0 inversions
[2024-06-21T10:20:48+0200] [MainThread] [I] [toil-rt] [minigraph]: [M::mg_ggsimple_cigar::6.440*1.02] inserted 0 events, including 0 inversions
[2024-06-21T10:20:49+0200] [MainThread] [I] [toil-rt] [minigraph]: [M::mg_ggsimple_cigar::6.965*1.02] inserted 0 events, including 0 inversions
[2024-06-21T10:20:49+0200] [MainThread] [I] [toil-rt] [minigraph]: [M::mg_ggsimple_cigar::7.487*1.06] inserted 0 events, including 0 inversions
[2024-06-21T10:20:50+0200] [MainThread] [I] [toil-rt] [minigraph]: [M::mg_ggsimple_cigar::8.154*1.06] inserted 0 events, including 0 inversions
[2024-06-21T10:20:51+0200] [MainThread] [I] [toil-rt] [minigraph]: [M::mg_ggsimple_cigar::8.848*1.06] inserted 0 events, including 0 inversions

You might want to look at your output graph using vg paths and vg stats, or halStats, to see if anything is aligning.
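To see at a glance whether any minigraph step inserted events, the log lines quoted above can be tallied by event count. This is a rough sketch using only standard tools; the function name is hypothetical and it assumes the log keeps the "inserted N events" wording shown in the excerpt.

```shell
# Hypothetical helper: count minigraph "inserted N events" lines per value of N.
# Output is one "N count" pair per line, sorted by N.
tally_events() {
    grep -o 'inserted [0-9]* events' "$1" \
        | awk '{ n[$2]++ } END { for (k in n) print k, n[k] }' \
        | sort -n
}
```

Running `tally_events slurm-14738835.log` would then show whether every step reported 0 events or only some of them did.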

dmjask commented 4 days ago

Hi, I was referring to the cactus_analyseAssembly call on line 557 of the log file, where only the AG5 sample is noted.

As for minigraph not finding SVs: I have seen the same "inserted 0 events" records for other regions where SVs were successfully detected. In this run the log also reports 0 events, yet the resulting vcf contains a single sample with all variants genotyped as ALT.
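The "all variants genotyped as ALT" observation can be checked by tallying the genotype values of the first sample column. The sketch below assumes an uncompressed VCF in which GT is the first FORMAT subfield (as the VCF specification requires when GT is present); the function name is hypothetical.

```shell
# Hypothetical helper: count how often each genotype string occurs
# in the first sample column (column 10) of an uncompressed VCF.
gt_tally() {
    awk -F'\t' '!/^#/ && NF >= 10 { split($10, f, ":"); n[f[1]]++ }
                END { for (g in n) print g, n[g] }' "$1" | sort
}
```

For a gzipped VCF, decompress it first (for example with `gzip -dc`) and pass the resulting file to `gt_tally`.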