bcgsc / tigmint

⛓ Correct misassemblies using linked AND long reads
https://bcgsc.github.io/tigmint/
GNU General Public License v3.0

Tigmint not making cuts #39

Closed elsemikk closed 3 years ago

elsemikk commented 3 years ago

Hi, thank you for making tigmint!

I am having some difficulty running tigmint. It seems to run, but my output genome is identical to the input genome, even when I give it unreasonably strict parameters that should cut something (-n500000 for tigmint-cut). I have 10x Chromium linked reads with about 50x coverage and a Supernova assembly with N50 = 9 Mb.

I made the sequence alignment with bwa mem:

bwa mem -t24 -pC assembly.fa reads.fq.gz | samtools view -@24 -h -F4 -o alignment.bam
samtools sort --verbosity 3 -@ 24 -m 3G -t BX -o sorted.bam alignment.bam

And then ran tigmint with:

tigmint-molecule -a0.65 -n5 -q0 -d50000 -s2000 sorted.bam | sort -k1,1 -k2,2n -k3,3n > genome.reads.as0.65.nm5.molecule.size2000.bed
samtools faidx assembly.fa
tigmint-cut -p24 -w1000 -n20 -t0 -o assembly.reads.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa assembly.fa assembly.reads.as0.65.nm5.molecule.size2000.bed

But it did not seem to make any cuts, as the output assembly had the same number of scaffolds as the input. I tried changing the tigmint-cut settings to -w100 -n500000 -t100 to try to force it to cut something, but it still did not cut anything. Does this indicate that there is a problem with my run, or is there really nothing for it to cut even with very extreme settings?

The output bed files look like they were produced normally:

head genome.reads.as0.65.nm5.molecule.size2000.bed

0::0:0-42446733 0   2030    CAACCTCTCGGATGCC-1  9
0::0:0-42446733 0   10082   ATACTTCGTGAGGGAG-1  9
0::0:0-42446733 0   18937   CTCTGTGCATTTGCGA-1  66
0::0:0-42446733 0   22013   CCACTACCAGAGATCG-1  65
0::0:0-42446733 0   22785   TTAGGTGCAACTCATG-1  29

head assembly.reads.as0.65.nm5.molecule.size2000.trim0.window1000.span20.breaktigs.fa.bed

0       0       42446733        0
58      0       15111319        58
90      0       30744791        90
114     0       5293578 114
138     0       7375403 138
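A quick way to confirm whether any cuts were made is to count how many lines each contig has in the breaktigs BED file. The sketch below assumes, based only on the output shown above, that the file holds one line per output segment, so a contig that was cut would appear on more than one line. The function name `cut_contigs` is hypothetical and not part of Tigmint.

```python
from collections import Counter


def cut_contigs(bed_lines):
    """Return {contig: number of cuts} for contigs listed more than once.

    Assumption: each BED line describes one output segment, so a contig
    split into n segments appears on n lines (n - 1 cuts).
    """
    counts = Counter()
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        counts[fields[0]] += 1
    return {contig: n - 1 for contig, n in counts.items() if n > 1}


if __name__ == "__main__":
    # The first lines of the breaktigs BED shown above: every contig
    # appears once and spans its full length, so no cuts are reported.
    sample = [
        "0\t0\t42446733\t0",
        "58\t0\t15111319\t58",
        "90\t0\t30744791\t90",
    ]
    print(cut_contigs(sample))
```

To check a whole file, pass an open file handle instead of a list, for example `cut_contigs(open(path))`.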

Thank you

lcoombe commented 3 years ago

Hello @elsemikk,

The first thing I would try is to use the tigmint-make Makefile instead of running each command separately (which I think you did based on the commands above?). That would just rule out any possible issues with the commands.

Using a very high span value would actually have the effect of causing no cuts, which I know is a bit counterintuitive. That's because a contig only has the potential to be cut if the molecule/physical coverage reaches the span value at some point on the contig. (This requirement exists because the physical coverage ramps up and down at the ends of a contig, and we don't want to shred all of the contig ends.) So, unless you have a physical coverage of 500000 on some of your contigs, you won't see any cutting.
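To see whether a given span value can ever be reached, one can estimate the peak molecule coverage per contig from the molecule BED file. The following is a minimal sketch, not Tigmint's own implementation: it assumes the first three BED columns are contig, start and end (as in the molecule BED shown earlier), and the function name `max_molecule_depth` is hypothetical.

```python
from collections import defaultdict


def max_molecule_depth(bed_lines):
    """Return {contig: maximum number of overlapping molecules}."""
    events = defaultdict(list)
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        contig, start, end = fields[0], int(fields[1]), int(fields[2])
        events[contig].append((start, 1))   # a molecule begins here
        events[contig].append((end, -1))    # a molecule ends here (BED end is exclusive)

    result = {}
    for contig, contig_events in events.items():
        depth = best = 0
        # Tuples sort by position, then by delta, so at the same position
        # an end (-1) is processed before a start (+1).
        for _, delta in sorted(contig_events):
            depth += delta
            best = max(best, depth)
        result[contig] = best
    return result


if __name__ == "__main__":
    # The five molecules from the head of the molecule BED shown earlier.
    sample = [
        "0::0:0-42446733\t0\t2030\tCAACCTCTCGGATGCC-1\t9",
        "0::0:0-42446733\t0\t10082\tATACTTCGTGAGGGAG-1\t9",
        "0::0:0-42446733\t0\t18937\tCTCTGTGCATTTGCGA-1\t66",
        "0::0:0-42446733\t0\t22013\tCCACTACCAGAGATCG-1\t65",
        "0::0:0-42446733\t0\t22785\tTTAGGTGCAACTCATG-1\t29",
    ]
    print(max_molecule_depth(sample))
```

If the peak depth for every contig is below the chosen span value, then by the reasoning above no contig is eligible for cutting, whatever the other settings are.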

Let me know if you are still seeing the same issue when using the makefile and a more reasonable range of span values, and we can troubleshoot from there!

Thanks for your interest in Tigmint!
Lauren

elsemikk commented 3 years ago

Thank you! I did run it with separate commands before, so I reran it with tigmint-make as you suggested and this time it made several cuts, so it must have been an issue with my commands.

lcoombe commented 3 years ago

Excellent! I'm glad that solved the issue :)