chhylp123 / BitMapperBS

BitMapperBS: a fast and accurate read aligner for whole-genome bisulfite sequencing
Apache License 2.0

Toy example and alignment error #1

Closed supermaxiste closed 5 years ago

supermaxiste commented 6 years ago

Hi, I installed BitMapper according to all the instructions and the indexing of the genome works fine. When I try to run the aligner I get the following output and error:

Open /pathtogenome/genome.fa.index sucessfully!
Start load hash table!
refGenLength = 196243198
Open /pathtogenome/genome.fa.index.bs.pac sucessfully!
shapline=34620825
BWT has been loaded!
hash table has been loaded!
SA_length=392486397
sparse_suffix_array_length=49060800
SA_flag_iterater=7665751
high_occ_table_length=11978
Start alignment in default fast mode.
*** Error in '/pathtotools/BitMapperBS/bitmapperBS': double free or corruption (out): 0x00007f2244005fc0 ***
Aborted

Do you have any idea what the problem could be? Also, I think that adding a toy example to your repo, with a small genome file and a short indexing and alignment run, would help users test your aligner.

Thank you in advance for your answer.

chhylp123 commented 6 years ago

Actually, I have tested BitMapperBS successfully on the genomes of human, mouse, Caenorhabditis elegans and Arabidopsis thaliana. Could you please tell me which reference genome you used in your experiments?

Thanks for your advice. I will add a toy example, including example datasets, to my repo.

chhylp123 commented 6 years ago

Also, could you please send me the command you used in your experiments? If this error is a bug, I will fix it as soon as possible.

supermaxiste commented 6 years ago

Hi,

I'm using the Arabidopsis halleri assembly from here. The command I used is the following:

./bitmapperBS --search path/to/genome.fa --seq1 /path/to/reads1.fq.gz --seq2 /path/to/reads2.fq.gz --pe -o outputname.bam -t 6

chhylp123 commented 6 years ago

Thanks very much. I will try to fix this error as soon as possible.

chhylp123 commented 6 years ago

I found that there are a lot of ambiguity codes ('R', 'Y', 'S', 'W', 'M') in this .fasta file. Should I simply treat these bases as 'N', or expand them ('R' -> G/A, 'M' -> A/C, 'W' -> A/T, ...)?

chhylp123 commented 6 years ago

Hi supermaxiste,

I have tested BitMapperBS on the Arabidopsis halleri assembly, but it did not report any error.

I guess the error is caused by the reads. Could you please send me a minimal set of reads that reproduces it? I would be very grateful for your help in fixing this problem.

supermaxiste commented 6 years ago

Hi,

everything that is not A, T, C, G can be considered N for me. As for the reads, here is a link to a OneDrive folder with part of the reads I'm using. I hope this will help you identify the issue. The reads were aligned using Bismark (Bowtie 2 mode). Let me know if you manage to reproduce the error.
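
For anyone who wants to apply the "everything that is not A, T, C, G becomes N" rule to a reference before indexing, here is a minimal Python sketch. It is not part of BitMapperBS; the function name `mask_ambiguous_bases` and the file names are hypothetical, and it assumes a plain-text FASTA file.

```python
import re

# Any character other than A/C/G/T/N (upper or lower case) is treated as an
# IUPAC ambiguity code such as R, Y, S, W or M.
_NON_ACGTN = re.compile(r"[^ACGTNacgtn]")


def mask_ambiguous_bases(fasta_in, fasta_out):
    """Copy a FASTA file, replacing every non-ACGT base with 'N'.

    Header lines (starting with '>') are copied unchanged.
    Returns the number of bases that were replaced.
    """
    replaced = 0
    with open(fasta_in) as src, open(fasta_out, "w") as dst:
        for line in src:
            if line.startswith(">"):
                dst.write(line)
                continue
            # Strip the line ending first so '\r' or '\n' is never masked.
            seq = line.rstrip()
            masked, count = _NON_ACGTN.subn("N", seq)
            replaced += count
            dst.write(masked + "\n")
    return replaced
```

Calling `mask_ambiguous_bases("genome.fa", "genome.masked.fa")` writes a masked copy and leaves the original file untouched. Lower-case ambiguity codes are replaced with an upper-case 'N'.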

chhylp123 commented 6 years ago

I am sorry, I have been preparing for an interview these days and didn't see your reply. I will try to reproduce the error as soon as possible. Thanks very much.

chhylp123 commented 6 years ago

The file you shared with me is hal_SNPs_scaffold1.vcf, which is a .vcf file rather than a .fastq file. Could you please send me the .fastq file?

Thanks very much for your help.

supermaxiste commented 6 years ago

Hi, sorry, I deleted the files by accident. I have uploaded them again now.

chhylp123 commented 6 years ago

Thanks for your help! I have tested the .fq files on my desktop, but BitMapperBS didn't report any errors.

After checking your command carefully, I found that the input files are in .fq.gz format. BitMapperBS currently supports only the uncompressed .fq format. This is because, to improve alignment performance, BitMapperBS is designed to align read files after QC (e.g., QC using Trim_Galore https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). The output files of Trim_Galore are in .fq format. In fact, filtering the input reads with Trim_Galore first is the standard workflow of Bismark. Please decompress the .fq.gz files to .fq files first.
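
As a rough illustration of the decompression step, the Python sketch below detects gzip-compressed input by its magic bytes and writes an uncompressed copy. The helper names `is_gzipped` and `ensure_plain_fastq` are hypothetical and not part of BitMapperBS; running `gunzip` by hand is equally fine.

```python
import gzip
import shutil

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def ensure_plain_fastq(path):
    """Return the path of an uncompressed version of `path`.

    If the file is not gzip-compressed, the path is returned unchanged.
    Otherwise the data is decompressed next to the original, dropping a
    trailing '.gz' from the name (or appending '.fq' if there is none).
    """
    if not is_gzipped(path):
        return path
    out_path = path[:-3] if path.endswith(".gz") else path + ".fq"
    with gzip.open(path, "rb") as src, open(out_path, "wb") as dst:
        # Stream the data so large read files are never held in memory.
        shutil.copyfileobj(src, dst)
    return out_path
```

The returned paths can then be passed to `--seq1` and `--seq2` in place of the .fq.gz files. Checking the magic bytes rather than the file extension also catches compressed files that were renamed.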

supermaxiste commented 6 years ago

Thank you for your input, I'll test the uncompressed files and let you know.

supermaxiste commented 6 years ago

Hi,

so I also tried a test run on uncompressed data and I get the following error:

Open Ahal_genome/Ahal.fa.index sucessfully!
Start load hash table!
refGenLength = 196243198
Open Ahal_genome/Ahal.fa.index.bs.pac sucessfully!
shapline=34620825
BWT has been loaded!
hash table has been loaded!
SA_length=392486397
sparse_suffix_array_length=49060800
SA_flag_iterater=7665751
high_occ_table_length=11978
Start alignment in default fast mode.
Illegal instruction

When using our job queue system I get specifically: line 16: 33724 Illegal instruction (core dumped). I'm not sure the error comes from there, but I have no idea why this happens.

chhylp123 commented 6 years ago

Could you please send me the configuration of your machine (CPU/system)? Thanks very much!

supermaxiste commented 6 years ago

CPU:

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 1
Core(s) per socket: 8
Socket(s): 4
NUMA node(s): 4
Vendor ID: GenuineIntel
CPU family: 6
Model: 45
Model name: Intel(R) Xeon(R) CPU E5-4640 0 @ 2.40GHz
Stepping: 7
CPU MHz: 2507.156
CPU max MHz: 2800.0000
CPU min MHz: 1200.0000
BogoMIPS: 4791.85
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 20480K
NUMA node0 CPU(s): 0-7
NUMA node1 CPU(s): 8-15
NUMA node2 CPU(s): 16-23
NUMA node3 CPU(s): 24-31

System: Linux 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2 (2016-04-08) x86_64 GNU/Linux

chhylp123 commented 6 years ago

The reason is that your CPU (E5-4640) does not support AVX2 instructions. Please compile BitMapperBS from BitMapperBS/for_old_machine_sse4.2. You can do this as follows:

cd BitMapperBS/for_old_machine_sse4.2
make

Thanks very much. I will merge these two versions of BitMapperBS in the future. Actually, I have mentioned this problem in the README.
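
To check in advance which build a machine needs, a small Python sketch like the following can read the CPU feature flags on Linux. The function names are hypothetical, and the mapping of AVX2 to the top-level `BitMapperBS` directory is an assumption about the repository layout; only the `for_old_machine_sse4.2` directory is named in this thread.

```python
def cpu_flags(cpuinfo_path="/proc/cpuinfo"):
    """Return the set of CPU feature flags reported by the Linux kernel."""
    with open(cpuinfo_path) as fh:
        for line in fh:
            # On x86 the line looks like: "flags\t\t: fpu vme ... avx2 ..."
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()


def suggested_build_dir(flags):
    """Suggest a source directory to build from, based on CPU flags.

    Returns None if the CPU reports neither AVX2 nor SSE4.2.
    """
    if "avx2" in flags:
        # Assumption: the default (AVX2) build lives at the repository root.
        return "BitMapperBS"
    if "sse4_2" in flags:
        return "BitMapperBS/for_old_machine_sse4.2"
    return None
```

On a Linux machine, `suggested_build_dir(cpu_flags())` gives the directory to run `make` in. A CPU such as the E5-4640, which reports `avx` and `sse4_2` but not `avx2`, gets the `for_old_machine_sse4.2` directory.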

supermaxiste commented 6 years ago

Sorry about that, I'm not an expert but I should have checked this earlier.

Thank you a lot for the help!

I'll try to run the old_machine version and if it works we can close this issue. Maybe the readme can specify that fq files need to be uncompressed so that people do not have this issue in the future.

chhylp123 commented 6 years ago

Actually, it is my fault for not having considered this case. I am sorry about this.

Thanks very much for your sincere help! If you run into any problems, please contact me. I am very happy to receive your comments.

I am sorry that I haven't had time to update my README, since I have been very busy recently. I will add a detailed example and describe this problem in the README in the next few days.

chhylp123 commented 6 years ago

Hi supermaxiste, have you been able to run BitMapperBS successfully?

supermaxiste commented 6 years ago

Hi, sorry I was busy the last two days. I'll test tomorrow and update asap.

supermaxiste commented 6 years ago

Hi, BitMapperBS ran successfully and I encountered no errors. The issue can be closed. Thank you for the help!

P.S. It would be nice to mention that the output is in SAM format.
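
Since the command above used `-o outputname.bam` but the output is SAM, downstream tools that trust the file extension may be confused. The Python sketch below guesses the format from the file content instead. The function name `alignment_format` is hypothetical, and the SAM check assumes the file begins with an `@` header line.

```python
import gzip


def alignment_format(path):
    """Guess whether a file is 'BAM', 'SAM' or 'unknown' from its content."""
    with open(path, "rb") as fh:
        head = fh.read(2)
    if head == b"\x1f\x8b":
        # BAM is BGZF-compressed (gzip-compatible); the decompressed data
        # starts with the 4-byte magic "BAM\1".
        with gzip.open(path, "rb") as fh:
            if fh.read(4) == b"BAM\x01":
                return "BAM"
        return "unknown"
    if head[:1] == b"@":
        # SAM is plain text; header lines start with '@'.
        return "SAM"
    return "unknown"
```

If this returns "SAM" for a file named `.bam`, renaming it to `.sam` or converting it with a standard tool before further processing avoids surprises.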

chhylp123 commented 6 years ago

Thanks very much for your help! The methylation sites can be extracted with MethylDackel, which is very fast (https://github.com/dpryan79/methyldackel). I will update my README today.

In addition, I will update BitMapperBS so that it outputs methylation sites directly, rather than SAM/BAM.