FelixKrueger / Bismark

A tool to map bisulfite converted sequence reads and determine cytosine methylation states
http://felixkrueger.github.io/Bismark/
GNU General Public License v3.0

Reads not aligning to the genome #415

Closed ajwije closed 3 years ago

ajwije commented 3 years ago

I have 6 samples from whole-genome bisulfite Illumina PE sequencing. Each file has 60-80 million reads. Two samples aligned well (70% and 72%). The other four samples had very few alignments (fewer than 1,000). I checked both read files, and both had Phred quality scores above 30 across the reads; all insert lengths are below 500 bp based on Bioanalyzer traces from the libraries. I also ran a BLAST analysis on a few reads, and most of them aligned to the desired genome as expected. Here is the code chunk I used:

    bismark --non_directional $Genome_path -1 $FIRST_SAMPLE_LOC \
        -2 $SECCOND_SAMPLE_LOC -o $ALIGN

I can try to align the two reads separately, but wanted to see if you had any other suggestions.

FelixKrueger commented 3 years ago

Hi Asela,

Would you be able to send me a few sample reads, e.g. 100,000-200,000 reads, completely raw and untrimmed (that should fit into an email)? I could then run a few tests and get back to you.
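One way to prepare such a subset is sketched below. It is only an illustration, not something posted in the thread: the function name `head_fastq` and the file names in the usage note are hypothetical, and it assumes gzip-compressed FASTQ files with exactly four lines per record. It takes the first N reads rather than a random sample, which is usually enough for a quick alignment test.

```shell
# Copy the first N reads of a gzipped FASTQ file into a new gzipped file.
# Assumes 4 lines per FASTQ record (no wrapped sequence lines).
head_fastq() {
  in_file=$1    # gzipped FASTQ input
  n_reads=$2    # number of reads to keep
  out_file=$3   # gzipped FASTQ output
  gzip -dc "$in_file" | head -n "$((n_reads * 4))" | gzip > "$out_file"
}
```

For paired-end data, use the same read count on both files so the mates stay in sync, for example `head_fastq sample_R1.fastq.gz 200000 subset_R1.fastq.gz` followed by the same call on the R2 file.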

ajwije commented 3 years ago

Thanks Felix! I have sent it to your email address listed on Github. Hope it is fine. Asela

FelixKrueger commented 3 years ago

I should have remembered to ask: which genome are your samples supposed to align against? It seems pretty clear that it is not a mammalian genome (possibly some plant?).

Otherwise the qualities look fine, so I am not sure I will be able to point you to anything more specific, but I’m happy to take another look if you can let me know the intended species.

Cheers, Felix

ajwije commented 3 years ago

I will send you the genome I used. Asela

FelixKrueger commented 3 years ago

I have tried aligning the reads to the genome you sent, as well as to the soybean genome I downloaded from Ensembl. The reads align perfectly well in both single-end (~67%) and paired-end (72%) mode. Please see the attached MultiQC reports.

SE PE reports.zip

I am not sure why there are problems on your side. By the way, the reads are directional and do NOT require the option --non_directional. Cheers, Felix
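As a rough sketch of that advice, the original paired-end call can be repeated with `--non_directional` simply left out, since directional libraries are Bismark's default. The wrapper name `run_bismark_directional` and the `BISMARK` override variable are hypothetical conveniences for illustration, not part of Bismark; the arguments mirror the ones used in the original post.

```shell
# Paired-end Bismark alignment for a directional library.
# Same arguments as the original command, minus --non_directional.
# BISMARK is a hypothetical override for the executable; it defaults to "bismark".
run_bismark_directional() {
  genome_dir=$1   # folder containing the Bismark-prepared genome
  read1=$2        # FASTQ file with read 1
  read2=$3        # FASTQ file with read 2
  out_dir=$4      # output directory
  ${BISMARK:-bismark} "$genome_dir" -1 "$read1" -2 "$read2" -o "$out_dir"
}
```

With the variables from the original post this would be called as `run_bismark_directional "$Genome_path" "$FIRST_SAMPLE_LOC" "$SECCOND_SAMPLE_LOC" "$ALIGN"`.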

FelixKrueger commented 3 years ago

Haven't heard back for a while, I assume all is well.

ajwije commented 3 years ago

Sorry Felix - I realized that I haven’t replied. Yes, it worked. Thanks for your help. Asela

FelixKrueger commented 3 years ago

excellent, had a bit of a spring clean today. Glad it worked in the end!