BSSeeker / BSseeker2

A versatile aligning pipeline for bisulfite sequencing data
http://pellegrini.mcdb.ucla.edu/BS_Seeker2/
MIT License

[help] How to speed up alignment in BS-Seeker2? #43

Closed hmyh1202 closed 2 years ago

hmyh1202 commented 2 years ago

Hi, WeiLong:

I used --bt2-p 20 (bowtie2 aligner) to speed up alignment, but BS-Seeker2 only uses one CPU. How can I speed up the alignment?

Thank you!

guoweilong commented 2 years ago

Hi Grace,

Your command looks correct. However, BS-Seeker2 uses a single CPU for pre-processing for a few minutes before launching bowtie2 in multi-threaded mode. You can wait and then check whether bowtie2 has been allocated multiple threads.

Best, Weilong

hmyh1202 commented 2 years ago

Hi, Weilong:

[image]

My command is bs_seeker2-align.py -I 0 -X 700 -m 3 -1 R1.fq.gz -2 R2.fq.gz -t N -f bam --bt2-p 10 --bt2-I 0 --bt2-X 700 --aligner=bowtie2 -g hg38.fa -d ../genome/hg38/bsseeker2/wgbs_index/ --temp_dir ./tmp -o test.bam

I have noticed that after bs_seeker2 finishes splitting the FASTQ read files, it uses only one CPU and runs the alignments one by one, and bowtie2 occupies only one thread for each genome index.

Could you share your command and explain why I am seeing this issue?

Thank you!

hmyh1202 commented 2 years ago

And the log was: [2022-03-31 12:12:38] Finished: bowtie2 --local --quiet -D 50 --no-mixed --norc -I 0 --sam-nohead --no-discordant -k 2 -p 10 -X 700 --fr -x ../genome/hg38_ucsc/bsseeker2/wgbs/hg38.fa_bowtie2/WC2T -f -1 a.bam-bowtie2-local-TMP-Ns0Fd0/Trimed_FCT1.fa.tmp-3640609 -2 b.bam-bowtie2-local-TMP-Ns0Fd0/Trimed_RGA_2.fa.tmp-3640609 -S ./tmp/bs_seeker2test.bam-bowtie2-local-TMP-Ns0Fd0/W_C2T_fr_m3.0.mapping.tmp-3640609

So the thread count of 10 was passed correctly in that command, but the alignment is too slow: a small dataset still has not finished, whereas other aligners such as bsmap and bismark have already finished their jobs.
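To confirm from a log line like the one above that bowtie2 was launched with the requested thread count, the value of its -p option can be pulled out programmatically. This is only a minimal sketch: the function name `bowtie2_threads` is hypothetical, and the sample line is a shortened copy of the logged command.

```python
import shlex


def bowtie2_threads(log_line):
    """Return the integer passed to bowtie2's -p/--threads option, or None if absent."""
    tokens = shlex.split(log_line)
    for i, token in enumerate(tokens[:-1]):
        if token in ("-p", "--threads"):
            return int(tokens[i + 1])
    return None


# Shortened version of the logged command from this thread.
line = "[2022-03-31 12:12:38] Finished: bowtie2 --local --quiet -D 50 -k 2 -p 10 -X 700 --fr"
print(bowtie2_threads(line))  # prints 10
```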

guoweilong commented 2 years ago

With BS-Seeker2, alignment time rises steeply as the input FASTQ file gets larger. It is therefore recommended to split large input files into smaller pieces.

The following link might be helpful:

https://github.com/BSSeeker/BSseeker2#qa11

Best, Weilong
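The splitting suggested above could be done along these lines. This is a sketch under stated assumptions, not the project's own tooling: the function name `split_fastq`, the chunk naming scheme, and the chunk size are all hypothetical, and it assumes every FASTQ record occupies exactly four lines.

```python
import gzip
from pathlib import Path


def split_fastq(path, out_dir, prefix, reads_per_chunk):
    """Split a plain or gzipped FASTQ file into chunks of at most reads_per_chunk reads.

    Assumes each record is exactly four lines. Returns the list of chunk paths.
    """
    if reads_per_chunk <= 0:
        raise ValueError("reads_per_chunk must be positive")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    opener = gzip.open if str(path).endswith(".gz") else open
    lines_per_chunk = 4 * reads_per_chunk
    chunks = []
    out = None
    with opener(path, "rt") as handle:
        for line_no, line in enumerate(handle):
            # Start a new chunk file every lines_per_chunk lines.
            if line_no % lines_per_chunk == 0:
                if out is not None:
                    out.close()
                chunk = out_dir / f"{prefix}.part{len(chunks):04d}.fq"
                out = chunk.open("w")
                chunks.append(chunk)
            out.write(line)
    if out is not None:
        out.close()
    return chunks


# Hypothetical usage for paired-end data (file names taken from the thread):
# r1_chunks = split_fastq("R1.fq.gz", "chunks", "R1", 1_000_000)
# r2_chunks = split_fastq("R2.fq.gz", "chunks", "R2", 1_000_000)
```

For paired-end input, both files should be split with the same `reads_per_chunk`, so that chunks with the same index hold matching mates. This only works if R1 and R2 list their reads in the same order.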

hmyh1202 commented 2 years ago

Yes, I have split the FASTQ file into smaller parts and run BS-Seeker2 again. It works better now! Thank you.

Also, could you share some simulated paired-end BS-seq FASTQ data with me, so that I can compare the accuracy of several aligners?

Best!
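Running BS-Seeker2 once per pair of split files might be scripted roughly as follows. This sketch only builds the command lines. The options are copied from the command posted earlier in this thread, while the function name `alignment_commands`, the chunk file names, and the output naming are hypothetical.

```python
def alignment_commands(r1_chunks, r2_chunks, genome, index_dir, threads=10):
    """Build one bs_seeker2-align.py command (as an argument list) per chunk pair."""
    if len(r1_chunks) != len(r2_chunks):
        raise ValueError("R1 and R2 must have the same number of chunks")
    commands = []
    for r1, r2 in zip(r1_chunks, r2_chunks):
        commands.append([
            "bs_seeker2-align.py",
            "-1", str(r1), "-2", str(r2),
            # Options below mirror the command posted in this thread.
            "-I", "0", "-X", "700", "-m", "3",
            "--aligner=bowtie2", "--bt2-p", str(threads),
            "-g", str(genome), "-d", str(index_dir),
            "-f", "bam",
            "-o", f"{r1}.bam",  # hypothetical output name, one BAM per chunk
        ])
    return commands


# Hypothetical chunk names; each command could be run with subprocess.run(cmd, check=True).
for cmd in alignment_commands(["R1.part0000.fq"], ["R2.part0000.fq"], "hg38.fa", "wgbs_index/"):
    print(" ".join(cmd))
```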

BSSeeker commented 2 years ago

On the page http://pellegrini-legacy.mcdb.ucla.edu/bs_seeker2/, there is a section called "Datasets for paper". It contains previously simulated data.

I would also guess that tools for simulating BS-seq data have been published in recent years (I have not followed these studies). You may want to look for a well-designed tool for this work.

Best, Weilong
