Closed shaghayeghsoudi closed 1 year ago
If I remember correctly, your data is RRBS, is that correct? RRBS typically works by digesting the genome with MspI, which results in a finite set of fragments (maybe something like 600-800K), so you are expected to sequence the very same fragments over and over, and how often is a function of sequencing depth. In other words: it is not recommended to apply deduplication to RRBS data, as you will lose the majority of your data. The exception to this rule would be if you used UMIs to allow the distinction between genuine fragments (coming from different cells) and PCR duplicates (which would all carry the same UMI).
Here is an RRBS Guide that explains a few more things (hopefully).
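To put a rough number on the point about deduplication, here is a back-of-the-envelope sketch. It assumes each MspI fragment contributes a single unique alignment position (counting both strands would roughly double the bound), and both the fragment count and the read count are illustrative values, not measurements from this dataset. The helper name `max_kept_pct` is made up for this example.

```shell
# Upper bound on the share of reads that can survive positional deduplication.
# Both numbers below are illustrative assumptions, not values from this dataset.
fragments=700000      # distinct MspI fragments (middle of the 600-800K range)
reads=30000000        # hypothetical number of aligned read pairs

max_kept_pct() {
    # args: <fragments> <reads>; prints a percentage with two decimals.
    # At most one read per fragment position is kept, so the bound is f / r.
    awk -v f="$1" -v r="$2" 'BEGIN { p = (r > f) ? 100 * f / r : 100; printf "%.2f\n", p }'
}

max_kept_pct "$fragments" "$reads"
```

Under these assumptions only a few percent of the reads could be retained, which is consistent with very low coverage after deduplication.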
Thanks Felix, Yes. That makes sense.
Hi Felix,
Thanks again for the awesome software. I have just processed my Diagenode data according to your recommendations and called methylation. Sorry if this is a naive question, but I would like to ask your opinion on the result (thanks a lot in advance, and sorry for all the questions). I have 150 bp paired-end reads and aligned them in --pbat mode using Bismark. For the methylation extraction I then used the following command:
bismark_methylation_extractor -p ${BAM} --genome ${REF_DIR} -o ${OUTPUT_DIR} --bedGraph --counts --comprehensive --no_overlap
I am getting all the files I should get (*.cov, bedGraph, etc) and my coverage file from "deduplicated Bam files" look like this:
As you can see, the coverage is very low! I also visualized my BAM files in IGV. So using the deduplicated BAM files seems wrong, given the very low coverage.
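One way to compare coverage between the deduplicated and non-deduplicated runs is to summarise the depth per cytosine from the coverage file. The sketch below assumes the standard Bismark coverage layout (chromosome, start, end, methylation percentage, count methylated, count unmethylated, tab-separated). The function name `summarise_cov`, the 10x threshold, and the example rows are all made up for illustration.

```shell
# Summarise read depth per cytosine from a Bismark coverage file read on stdin.
# Columns: chromosome, start, end, methylation %, count methylated, count unmethylated.
summarise_cov() {
    awk -F'\t' '
        # Depth at a position is methylated plus unmethylated calls.
        { depth = $5 + $6; total += depth; if (depth >= 10) deep++ }
        END {
            if (NR == 0) { print "no positions"; exit }
            printf "positions=%d mean_depth=%.2f at_least_10x=%d\n", NR, total / NR, deep
        }'
}

# Tiny hypothetical input, only to show the expected format.
printf 'chr1\t100\t100\t50\t1\t1\nchr1\t200\t200\t100\t12\t0\nchr2\t300\t300\t0\t0\t3\n' | summarise_cov
```

For a real, gzipped coverage file the input would be piped in, for example `gunzip -c sample.bismark.cov.gz | summarise_cov`, where the file name is a placeholder.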
SRC127-N7_R1_val_1.fq.gz FastQC Report.pdf SRC161-N_R1_val_1.fq.gz FastQC Report.pdf
I used the following Bismark command for the alignment:
bismark --pbat -1 ${SAMPLE_R1} -2 ${SAMPLE_R2} --bowtie2 --bam --temp_dir ${TEMP_DIR} --genome ${REF_DIR} -o ${OUTPUT_DIR} --un
I should mention that I get at least 70% mapping efficiency for 90% of my samples, and here is one example: [SRC161-N_R1_val_1.fq.gz FastQC Report.pdf](https://github.com/FelixKrueger/Bismark/files/12865149/SRC161-N_R1_val_1.fq.gz.FastQC.Report.pdf)
I would really appreciate your opinion on this issue, just to make sure I am not missing anything in how I am using the software. Thank you.