haowenz / chromap

Fast alignment and preprocessing of chromatin profiles
https://haowenz.github.io/chromap/
MIT License

compared with Bowtie2. #130

Open ming1211 opened 1 year ago

ming1211 commented 1 year ago

Thanks for your great software, which saves a lot of time! We used to use Bowtie2, and the results differ between the two tools. I would much appreciate your help in finding the cause. The Bowtie2 command and summary:

bowtie2 -x $BOWTIE2_INDEXES/$bowtie_index -p 8 --very-sensitive -U $t

29017006 reads; of these:
  29017006 (100.00%) were unpaired; of these:
    702096 (2.42%) aligned 0 times
    4217340 (14.53%) aligned exactly 1 time
    24097570 (83.05%) aligned >1 times
97.58% overall alignment rate

The Chromap command and summary:

chromap --preset atac -x $chromap_index -r $genome -1 $t --SAM -o t_chromap.sam

Mapped all reads in 762.83s.
Number of reads: 29017006.
Number of mapped reads: 27813346.
Number of uniquely mapped reads: 11230663.
Number of reads have multi-mappings: 16582683.
Number of candidates: 445294315.
Number of mappings: 27813346.
Number of uni-mappings: 11230663.
Number of multi-mappings: 16582683.
Sorted, deduped and outputed mappings in 31.06s.
uni-mappings: 6071354, multi-mappings: 2036773, total: 8108127.
Number of output mappings (passed filters): 4441998
Total time: 829.07s.

This shows that Chromap leaves more reads unaligned than Bowtie2, while Bowtie2 reports far more reads aligned >1 times. I guess the cause is the --very-sensitive preset I used for Bowtie2 (the same as -D 20 -R 3 -N 0 -L 20 -i S,1,0.50).
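To put the two summaries on the same scale, here is a minimal Python sketch that turns the counts quoted above into percentages of all input reads. The Chromap "unmapped" count is not printed by Chromap; it is derived here as total reads minus mapped reads, and the variable and function names are only illustrative.

```python
# Counts copied from the Bowtie2 and Chromap summaries quoted in this thread.
TOTAL_READS = 29_017_006

bowtie2 = {"unmapped": 702_096, "unique": 4_217_340, "multi": 24_097_570}

chromap_mapped = 27_813_346
chromap = {
    "unmapped": TOTAL_READS - chromap_mapped,  # derived, not printed by Chromap
    "unique": 11_230_663,
    "multi": 16_582_683,
}


def percentages(counts, total=TOTAL_READS):
    """Return each category as a percentage of all input reads."""
    return {name: 100.0 * n / total for name, n in counts.items()}


for tool, counts in (("bowtie2", bowtie2), ("chromap", chromap)):
    # Sanity check: the three categories should account for every read.
    assert sum(counts.values()) == TOTAL_READS
    row = ", ".join(f"{k}: {v:.2f}%" for k, v in percentages(counts).items())
    print(f"{tool:8s} {row}")
```

These are the pre-filter Chromap numbers; the "passed filters" count in the log comes after deduplication and filtering and is not comparable to these categories.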

So I would like to ask which Chromap parameters correspond to --very-sensitive. I also wonder whether these differences have any serious impact, or whether I can just ignore them and go on to generate bigWig files.

Looking forward to your reply! Happy Chinese New Year!

haowenz commented 1 year ago

It seems that you are mapping single-end bulk ATAC-seq data? Can you provide more details on your data? Why is it single-end ATAC-seq? And how long are the reads?

ming1211 commented 1 year ago

@haowenz Thanks for your quick reply, even though it is still the New Year holiday. You are right that it is single-end bulk ATAC-seq data. The raw reads are 76 bp long. I don't know why we sequenced single-end; I'm new to the group and new to chromatin work. If it matters, I can ask and get back to you.

haowenz commented 1 year ago

I think usually people get paired-end reads for ATAC-seq.

I recall that Bowtie2's very sensitive mode just samples more seeds and maybe tolerates more errors in the reads, which makes it very slow. For 76 bp reads, I doubt it would make the results much more accurate than the default mode, and there might be false positive mappings.
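As a rough illustration of the "more seeds" point, the sketch below evaluates Bowtie2's seed-interval function for 76 bp reads under the default and very sensitive presets. The preset values are taken from the Bowtie2 manual (the very sensitive one matches the expansion quoted earlier in this thread), and `S,a,b` is read as a + b * sqrt(read length). The helper name is hypothetical, and this only shows the seed spacing, not how Bowtie2 actually performs the search.

```python
import math

READ_LENGTH = 76  # read length reported earlier in this thread

# Bowtie2 end-to-end presets as listed in the Bowtie2 manual.
PRESETS = {
    "--sensitive (default)": {"D": 15, "R": 2, "N": 0, "L": 22, "i": ("S", 1.0, 1.15)},
    "--very-sensitive": {"D": 20, "R": 3, "N": 0, "L": 20, "i": ("S", 1.0, 0.50)},
}


def seed_interval(spec, read_length):
    """Evaluate a square-root interval function S,<a>,<b>: a + b * sqrt(length)."""
    kind, constant, coefficient = spec
    if kind != "S":
        raise ValueError("only the square-root form is handled in this sketch")
    # The manual states that results below 1 are rounded up to 1.
    return max(1.0, constant + coefficient * math.sqrt(read_length))


for name, preset in PRESETS.items():
    interval = seed_interval(preset["i"], READ_LENGTH)
    print(f"{name}: seed length {preset['L']}, interval ~{interval:.2f} bases")
```

For 76 bp reads the very sensitive preset spaces seeds roughly twice as densely as the default (about 5.4 versus about 11.0 bases) and uses slightly shorter seeds, which is consistent with it being slower.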

Finally, the number of mappings output by Chromap is 4441998 after filtering. This includes a filter that drops mappings with low MAPQ. This number is roughly on par with Bowtie2's 4217340 reads aligned exactly 1 time.
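A quick check of "roughly on par", using only the two numbers quoted above (the variable names are illustrative):

```python
chromap_passed_filters = 4_441_998  # "Number of output mappings (passed filters)"
bowtie2_aligned_once = 4_217_340    # "aligned exactly 1 time"

ratio = chromap_passed_filters / bowtie2_aligned_once
difference = chromap_passed_filters - bowtie2_aligned_once
print(f"ratio: {ratio:.3f}, difference: {difference} reads")
```

The two counts differ by about 5%. They are not strictly the same quantity, since the Chromap figure is produced after its sorting, deduplication and filtering steps, so this is only a ballpark comparison.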