Closed xiuru closed 6 years ago
Hi Xiuru,
I just had a look at the test data you sent, and treated it with a standard pipeline:
1) Adapter trimming with Trim Galore, command is:
trim_galore --paired test_R1.fastq test_R2.fastq
2) Mapping (the data is directional):
bismark --genome /bi/scratch/Genomes/Zea_mays/ -1 test_R1_val_1.fq -2 test_R2_val_2.fq
The mapping efficiency in default mode is ~60%, and ~68% with --score_min L,0,-0.4
.
So yea I suspect that something with your trimming step didn't work quite as expected, over here it seems just fine. Here is the MultiQC report of the run: multiqc_report.zip
Hi Felix,
Thanks for your quick reply.
I retrim the test data,and run bismark as you told me:
trim_galore --paired test_R1.fq test_R2.fq
bismark --genome /work/xiuru/genome/maize/ -1 test_R1_val_1.fq -2 test_R2_val_2.fq --score_min L,0,-0.4
and the bismark result is as follows:
The ''Sequence pairs analysed in total'' is the same with yours, but the Mapping efficiency is still ~10%.
So i am worried i did something wrong when i build the genome index.
And there are some warnings when i build the bisulfite genome:
job.13145562.out.txt
job.13145562.err.txt
But i got the bismark_genome_preparation results named "Bisulfite_Genome". I am using a masked reference, is this the key to the problem? Shouldn't i use a masked genome,right?
Yes that will be the difference! If you can just get the dna.chromsome and not .rm. or .sm. files you should get the same results.
So only chromosome 1 to 10, not include Pt、Mt and other scaffolds?
The one I used was chromosomes 1 to 10, Pt, Mt as well as nonchromosomal.
Thanks for your patient and quick reply! Thank you very much, Felix! I got the results like yours! Felix is a good guy!
Hi Felix, I met with another problem, Is this because the scaffold is too short? The job still running, can i ignore these warnings?
Yes you should just ignore those warnings, they really only come up for short scaffolds when reads align to the very end (there should also be several closed issues about these warnings). Cheers, Felix
Yes,i saw that closed issue. Thanks,Felix.
Hi Felix, I am trying to map paired-end reads from Bisulfite sequencing using Bismark. The genome i used is Zea_mays.AGPv4.dna_rm.fa download from Ensembl, i build index use the commond:
bismark_genome_preparation --verbose /path_to_genome/
I trim data with trimmomatic,using the following command:
trimmomatic PE in_f.fq.gz in_r.fq.gz out_paired_f out_unpaired_f out_paired_r out_unpaired_r ILLUMINACLIP:universal.fa:2:30:10 LEADING:15 TRAILING:15 SLIDINGWINDOW:4:20 MINLEN:36
And the code for bismark alignment is:
bismark --sam genome -1 fastq_f -2 fastq_r
I got very low mapping efficiency, about 10%-20%. I tried to test --non_directional, set --score_min to L,0,-0.6,map read1 and read2 seperately,and trim data with trim_galore... But the mapping efficiency still under 20%Could you please give me some advise? Thanks in advance.