Closed VikArz02 closed 3 years ago
Hi @VikArz02
Splitting files with 18% mapping efficiency into smaller chunks should still give you an overall result with 18% mapping efficiency. If you could drop me an email with a few sample reads (e.g. 100K, gzipped, untrimmed reads), I can take a quick look for you. Please also include the genome of interest, and the sample prep you used. Best, Felix
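To illustrate why splitting does not change the overall figure, here is a minimal sketch of the arithmetic. The pooled efficiency is simply total aligned pairs divided by total pairs analysed. The function name `pooled_efficiency` and all the counts are made up for illustration; they are not taken from the data in this issue.

```shell
#!/usr/bin/env bash
# Pooled mapping efficiency across chunks:
#   100 * (sum of uniquely aligned pairs) / (sum of pairs analysed)
# Input on stdin: one line per chunk, "<pairs_analysed> <unique_alignments>".
# Hypothetical helper, not part of Bismark.
pooled_efficiency() {
  awk '{ total += $1; aligned += $2 }
       END { if (total > 0) printf "%.1f\n", 100 * aligned / total }'
}

# Three made-up chunks at 16%, 18% and 20% mapping efficiency.
printf '%s\n' "1000000 160000" "1000000 180000" "500000 100000" | pooled_efficiency
```

The pooled value always lies between the lowest and highest per-chunk values, so chunks at 16-18% cannot combine into anything better than about 18%.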
Thank you for your help! Best, Viktoriia
So, are you going to send over some data? I am sure we could rescue some data!
I have already sent it
After taking a quick look, the data seems to be non-directional human data; my guess would be that it was prepared with the Zymo Pico-methyl kit? Because of extensive bias at the 5' end, I ran Trim Galore like this:
trim_galore --paired --clip_r1 15 --clip_r2 15 sample_R1.fastq.gz sample_R2.fastq.gz
followed by:
bismark --genome ../GRCh38/ --non_directional --score_min L,0,-0.4 -1 sample_R1_val_1.fq.gz -2 sample_R2_val_2.fq.gz
This brought the mapping efficiency up to > 51% unique alignments, so quite a nice increase I'd say. Attached is the MultiQC report.
I hope this gives you something to work with?
multiqc_report.zip https://github.com/FelixKrueger/Bismark/files/7203921/multiqc_report.zip
Yes, thank you very much for your help
But can I ask why we get such a low mapping efficiency if it's the human genome?
That is a tricky question, and not something I can give you a perfect answer to. Let's phrase it this way: The best mapping efficiencies against the human genome I have seen were in the region of 85-88% using end-to-end alignment, and very good quality standard, directional 2x100bp data.
Your data isn't that, it is non-directional, with weird biases at the start, and I have no idea how it was generated. PBAT-style data suffers from a range of issues such as chimaeric reads (https://sequencing.qcfail.com/articles/pbat-libraries-may-generate-chimaeric-read-pairs/), 5' biases (https://sequencing.qcfail.com/articles/mispriming-in-pbat-libraries-causes-methylation-bias-and-poor-mapping-efficiencies/), as well as standard paired-end issues (see e.g. here: https://github.com/FelixKrueger/Bismark/blob/master/Docs/FAQ.md#low-mapping-effiency-of-paired-end-bisulfite-seq-sample).
I have just tried a test with different stringencies on a single end read, and this seems to have quite some impact on both the mapping efficiency as well as the average methylation levels:
--score_min L,0,-0.2: 49%
--score_min L,0,-0.4: 57%
--score_min L,0,-0.6: 66%
So something in your library preparation might also be introducing errors... If possible I would recommend using a more straightforward (directional) kit, but if you have very low starting material you might be limited in your choices...
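For readers wondering what the three `--score_min` settings mean in practice: Bowtie 2 reads `L,<a>,<b>` as a linear function f(x) = a + b * x of the read length x, and an alignment is only accepted if its score is at least f(x). The sketch below works that out for a 100 bp read. The read length is an assumption for illustration (the actual length after clipping is not stated in this thread), and `min_score` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Minimum alignment score for a linear --score_min function L,<intercept>,<slope>.
# usage: min_score <intercept> <slope> <read_length>
# Hypothetical helper for illustration; not part of Bismark or Bowtie 2.
min_score() {
  awk -v a="$1" -v b="$2" -v len="$3" 'BEGIN { printf "%.0f\n", a + b * len }'
}

# Assumed read length of 100 bp; adjust to your own trimmed read length.
for slope in -0.2 -0.4 -0.6; do
  echo "L,0,$slope -> minimum score $(min_score 0 "$slope" 100) for a 100 bp read"
done
```

A more negative threshold tolerates more mismatches and gaps per read, which is consistent with the mapping efficiency rising as the setting is relaxed.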
OK, that's a very useful answer. I checked which kit we used: it was the QIAseq Methyl Library kit (Qiagen). We are analysing HepG2 and aligned it to the human genome. Maybe that is the problem.
Could be that HepG2 is just a little different to the standard human genome... What does Qiagen say about bioinformatics processing downstream? But yea, all in all it's not bad!
Felix, hi! I did it according to your pipeline, but I got only 22.7% unique alignments. You had > 51%. Why can't I repeat your result? Thanks for your help! With best regards, Viktoriia
Hmm, what did you do exactly, and were there any error messages? Do you have enough system resources available?
What I did:
- trim_galore --paired --clip_r1 15 --clip_r2 15 sample_R1.fastq.gz sample_R2.fastq.gz
- bismark --genome /GRCh38/ --score_min L,0,-0.4 -1 sample_R1_val_1.fq.gz -2 sample_R2_val_2.fq.gz
I have these system resources available: MEM 157G, threads 32, Swp 8G. And I tried only the files that I sent to you.
The only message that could be called an error is this one: "Library is assumed to be strand-specific (directional), alignments to strands complementary to the original top or bottom strands will be ignored (i.e. not performed!) Setting parallelization to single-threaded (default)"
And this is how I prepared the genome: ~/.../Bismark-0.22.3/bismark_genome_preparation --bowtie2 Homo_sapiens.GRCh38.dna.primary_assembly.fa
In a note further above I mentioned that the data looks non-directional, but I seem to have omitted that in the command itself (now fixed).
Just repeat the alignments with --non_directional, then the results should be the same.
Apologies, Felix.
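A quick way to catch this kind of omission is to search the run log for the notice quoted earlier in this thread. This is only a sketch: it assumes the Bismark screen output was saved to a file, and `library_mode` is a hypothetical helper name. It checks for the directional notice only, since the wording printed for a non-directional run is not shown in this thread.

```shell
#!/usr/bin/env bash
# Report whether a saved Bismark log contains the "directional" notice
# quoted in this thread. usage: library_mode <log_file>
# Hypothetical helper; assumes the run's screen output was redirected to a file.
library_mode() {
  if grep -qF 'Library is assumed to be strand-specific (directional)' "$1"; then
    echo "directional"
  else
    echo "not reported as directional"
  fi
}

# Demo on a temporary file holding the start of the quoted message.
demo_log=$(mktemp)
printf '%s\n' 'Library is assumed to be strand-specific (directional), alignments to strands complementary to the original top or bottom strands will be ignored' > "$demo_log"
library_mode "$demo_log"
rm -f "$demo_log"
```

If this prints "directional" for a library that is known to be non-directional, the alignment was most likely run without --non_directional.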
Thanks so much, I got the same result!
Hey! When I analyze a whole human methylome with Bismark, I get a low mapping efficiency (18%). I checked the data with FastQ Screen; everything is fine with the reads and the quality is also good. The trimming was carried out according to the recommendations. I thought that there might not be enough capacity and that this could cause an error, so I decided to split the R1 and R2 files into 3 GB chunks. I aligned them, ran the analysis for each, and then combined all the output data. Each chunk came out at about 16-18%. Is this a valid approach?