FelixKrueger / Bismark

A tool to map bisulfite converted sequence reads and determine cytosine methylation states
http://felixkrueger.github.io/Bismark/
GNU General Public License v3.0

big genome index mapping failed #140

Closed hmyh1202 closed 6 years ago

hmyh1202 commented 6 years ago

The Bowtie 2 index of the C->T converted genome seems to be faulty or non-existant ('BS_CT.1.bt2'). Please run the bismark_genome_preparation before running Bismark
The Bowtie 2 index of the C->T converted genome seems to be faulty or non-existant ('BS_CT.2.bt2'). Please run the bismark_genome_preparation before running Bismark
The Bowtie 2 index of the C->T converted genome seems to be faulty or non-existant ('BS_CT.3.bt2'). Please run the bismark_genome_preparation before running Bismark
The Bowtie 2 index of the C->T converted genome seems to be faulty or non-existant ('BS_CT.4.bt2'). Please run the bismark_genome_preparation before running Bismark
The Bowtie 2 index of the C->T converted genome seems to be faulty or non-existant ('BS_CT.rev.1.bt2'). Please run the bismark_genome_preparation before running Bismark
The Bowtie 2 index of the C->T converted genome seems to be faulty or non-existant ('BS_CT.rev.2.bt2'). Please run the bismark_genome_preparation before running Bismark
The Bowtie 2 index of the G->A converted genome seems to be faulty or non-existant ('BS_GA.1.bt2'). Please run bismark_genome_preparation before running Bismark
The Bowtie 2 index of the G->A converted genome seems to be faulty or non-existant ('BS_GA.2.bt2'). Please run bismark_genome_preparation before running Bismark
The Bowtie 2 index of the G->A converted genome seems to be faulty or non-existant ('BS_GA.3.bt2'). Please run bismark_genome_preparation before running Bismark
The Bowtie 2 index of the G->A converted genome seems to be faulty or non-existant ('BS_GA.4.bt2'). Please run bismark_genome_preparation before running Bismark
The Bowtie 2 index of the G->A converted genome seems to be faulty or non-existant ('BS_GA.rev.1.bt2'). Please run bismark_genome_preparation before running Bismark
The Bowtie 2 index of the G->A converted genome seems to be faulty or non-existant ('BS_GA.rev.2.bt2'). Please run bismark_genome_preparation before running Bismark

Couldn't find a traditional small Bowtie 2 index for the genome specified (ending in .bt2). Now searching for a large index instead (64-bit index ending in .bt2l)...
64-bit large genome Bowtie 2 index found...
FastQ format assumed (by default)

bismark_genome_preparation was used for genome indexing, and a large index was generated. But during mapping, no reads mapped to the genome!

hmyh1202 commented 6 years ago

A genome of 6Gb

FelixKrueger commented 6 years ago

Can you please list the files in the CT_conversion and GA_conversion folders of the Bisulfite_Genome folder, and also post the error messages that Bismark produces? Are there any error messages, or do you simply not get any mapping reads? Which version of Bismark were you using?

hmyh1202 commented 6 years ago

├── Bisulfite_Genome
│   ├── CT_conversion
│   │   ├── BS_CT.1.bt2l
│   │   ├── BS_CT.2.bt2l
│   │   ├── BS_CT.3.bt2l
│   │   ├── BS_CT.4.bt2l
│   │   ├── BS_CT.rev.1.bt2l
│   │   ├── BS_CT.rev.2.bt2l
│   │   └── genome_mfa.CT_conversion.fa
│   └── GA_conversion
│       ├── BS_GA.1.bt2l
│       ├── BS_GA.2.bt2l
│       ├── BS_GA.3.bt2l
│       ├── BS_GA.4.bt2l
│       ├── BS_GA.rev.1.bt2l
│       ├── BS_GA.rev.2.bt2l
│       └── genome_mfa.GA_conversion.fa
├── genome.fa
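A quick way to confirm which kind of index is present in a layout like the one listed here is to look for a complete set of the six index files per conversion, first with the small suffix (.bt2) and then with the large one (.bt2l). The following is only a minimal sketch of that check, not Bismark's own code; the function names `index_type` and `check_bisulfite_genome` are hypothetical, and the file names are taken from the listing in this thread.

```python
from pathlib import Path

# Suffixes used for small (32-bit) and large (64-bit) Bowtie 2 indexes,
# as named in the log messages quoted in this thread.
SMALL, LARGE = ".bt2", ".bt2l"
# The six parts of one index, as they appear in the directory listing.
PARTS = ["1", "2", "3", "4", "rev.1", "rev.2"]


def index_type(folder, prefix):
    """Return 'small', 'large', or None, depending on which complete
    set of index files exists for `prefix` inside `folder`."""
    folder = Path(folder)
    for label, suffix in (("small", SMALL), ("large", LARGE)):
        if all((folder / f"{prefix}.{part}{suffix}").is_file() for part in PARTS):
            return label
    return None


def check_bisulfite_genome(genome_dir):
    """Report the index type found for each conversion (hypothetical helper)."""
    base = Path(genome_dir) / "Bisulfite_Genome"
    return {
        "CT": index_type(base / "CT_conversion", "BS_CT"),
        "GA": index_type(base / "GA_conversion", "BS_GA"),
    }
```

Calling `check_bisulfite_genome("/path/to/genome_folder")` (the path is a placeholder) returns a small dictionary; a value of `None` for either conversion would mean that neither a complete small nor a complete large index was found there.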

Bismark V0.16.3. No error was reported when building the genomic index files (large index). Note that the genome contains many scaffolds. When I selected only 100 scaffolds to build the index (small index) and aligned to that reference, it was OK! But it was quickly finished with no reads mapped when using total genome index.

FelixKrueger commented 6 years ago

Hi there, when you say: "But it was quickly finished with no reads mapped when using total genome index."

Does that mean that there are no error messages at all and you simply have a low mapping efficiency? Or are there any error messages? Which kind of library prep did you use, were the files trimmed with Trim Galore before mapping, and can you attach the mapping report? The more information you give me to work with the more likely it is that we can find help for your issue.

For the index files, can you do ls -l in the CT and GA folders to see if the index files have the same size in bytes for each of the conversions?
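The size comparison can also be scripted instead of reading the ls -l output by eye. This is a rough sketch under the assumption that the folder layout and file names match the listing earlier in the thread; `index_sizes` and `compare_conversions` are hypothetical helper names, not part of Bismark.

```python
from pathlib import Path


def index_sizes(folder, prefix):
    """Map each index file name (with the conversion prefix stripped,
    e.g. '.rev.1.bt2l') to its size in bytes."""
    sizes = {}
    for path in Path(folder).glob(f"{prefix}.*.bt2*"):
        sizes[path.name[len(prefix):]] = path.stat().st_size
    return sizes


def compare_conversions(genome_dir):
    """Return a list of (file, CT size, GA size) for every index file whose
    size differs between the two conversions or that exists in only one."""
    base = Path(genome_dir) / "Bisulfite_Genome"
    ct = index_sizes(base / "CT_conversion", "BS_CT")
    ga = index_sizes(base / "GA_conversion", "BS_GA")
    mismatches = []
    for name in sorted(set(ct) | set(ga)):
        if ct.get(name) != ga.get(name):
            mismatches.append((name, ct.get(name), ga.get(name)))
    return mismatches
```

An empty list from `compare_conversions` means every index file has a same-sized counterpart in the other conversion folder. It does not prove the index is valid, only that the two builds are consistent with each other.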

Also, please upgrade to the latest version of Bismark to avoid looking at potentially fixed issues. Cheers, Felix

hmyh1202 commented 6 years ago

No error messages were reported. Bismark just reported that there were no small index files, tried to find the large index, and then started mapping, but it finished soon after with 0 reads mapped!

Did not use Trim Galore.

The large index files do not seem to be wrong:

CT_conversion:
total 20907243
-rw-r--r-- 1 2000012511 Oct 20 18:57 BS_CT.1.bt2l
-rw-r--r-- 1 2905802292 Oct 20 18:57 BS_CT.2.bt2l
-rw-r--r-- 1 24914005 Oct 20 16:02 BS_CT.3.bt2l
-rw-r--r-- 1 1452901142 Oct 20 16:02 BS_CT.4.bt2l
-rw-r--r-- 1 2000012511 Oct 20 21:59 BS_CT.rev.1.bt2l
-rw-r--r-- 1 2905802292 Oct 20 21:59 BS_CT.rev.2.bt2l
-rw-r--r-- 1 7010713014 Oct 20 16:00 genome_mfa.CT_conversion.fa

GA_conversion:
total 20907651
-rw-r--r-- 1 2000012511 Oct 20 18:36 BS_GA.1.bt2l
-rw-r--r-- 1 2905802292 Oct 20 18:36 BS_GA.2.bt2l
-rw-r--r-- 1 24914005 Oct 20 16:02 BS_GA.3.bt2l
-rw-r--r-- 1 1452901142 Oct 20 16:02 BS_GA.4.bt2l
-rw-r--r-- 1 2000012511 Oct 20 21:24 BS_GA.rev.1.bt2l
-rw-r--r-- 1 2905802292 Oct 20 21:24 BS_GA.rev.2.bt2l
-rw-r--r-- 1 7010713014 Oct 20 16:01 genome_mfa.GA_conversion.fa

FelixKrueger commented 6 years ago

OK, they look fine. If there were no errors then you have probably done something wrong, e.g. used Read 1 and Read 2 files of paired-end sequencing that do not belong together? Could you describe in more detail what exactly you have done, i.e. which type of library prep you did (or the kit used), because this will determine which kind of alignment mode you need to use. Also, please supply the version numbers and the exact commands you used. If your sequencing was paired-end, can you try to align Read 1 in single-end mode only to see if that works? Also, can you run your data through Trim Galore first to see if poor qualities or adapter contamination are the cause? If all this doesn't work out, can you send me a couple of hundred thousand reads (untrimmed gzipped FastQ) via email, and mention which genome you were using exactly, so I can run a few tests.
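Two of the checks mentioned here are easy to script: cutting out the first couple of hundred thousand reads as a gzipped FastQ test file, and checking that Read 1 and Read 2 files list the same read names in the same order. The sketch below assumes plain four-line FastQ records and read names that differ at most by a trailing /1 or /2 or by a comment after whitespace; all function names are hypothetical and none of this is part of Bismark or Trim Galore.

```python
import gzip
from itertools import islice


def open_maybe_gzip(path, mode="rt"):
    """Open a plain or gzipped text file based on its extension."""
    return gzip.open(path, mode) if str(path).endswith(".gz") else open(path, mode)


def head_fastq(src, dest, n_reads=200_000):
    """Copy the first n_reads records (4 lines each) of src into a gzipped
    FastQ file dest; return the number of complete reads written."""
    lines_written = 0
    with open_maybe_gzip(src) as fin, gzip.open(dest, "wt") as fout:
        for line in islice(fin, 4 * n_reads):
            fout.write(line)
            lines_written += 1
    return lines_written // 4


def read_name(header):
    """Reduce a FastQ header line to a bare read name: drop the leading '@',
    any comment after whitespace, and a trailing /1 or /2."""
    name = header[1:].split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def pairs_match(r1, r2, n_reads=10_000):
    """Check that the first n_reads headers of both files name the same reads.
    Only the overlapping part is compared if one file is shorter."""
    with open_maybe_gzip(r1) as f1, open_maybe_gzip(r2) as f2:
        headers1 = islice(f1, 0, 4 * n_reads, 4)
        headers2 = islice(f2, 0, 4 * n_reads, 4)
        return all(read_name(a) == read_name(b) for a, b in zip(headers1, headers2))
```

Taking the reads from the top of the file keeps the sketch simple; it is not a random subsample, which is usually fine for a quick mapping test but may over-represent reads from the edge of a flow cell.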

Cheers, Felix

hmyh1202 commented 6 years ago

Well, there was no trimming and no other treatment of the Bismark script or the fq files. It was OK when I selected part of the chromosomes to build the index and ran Bismark. I will try again, and the wheat genome file can be used for testing.