chhylp123 / BitMapperBS

BitMapperBS: a fast and accurate read aligner for whole-genome bisulfite sequencing
Apache License 2.0

Mapping error #5

Open vanessayang0927 opened 5 years ago

vanessayang0927 commented 5 years ago

Hi, I installed BitMapperBS successfully and ran it without error. However, the output SAM file appears to be inconsistent. Data: PE100. Mapping information of the output .sam: [image] Header information of the output .sam: [image] Problem: the mapping position was 89997 and the read length was 100 bp; 89997 + 100 = 90097, which exceeds the length of chr9_gl000198_random (90085). This caused an error when removing duplicates: [image]

chhylp123 commented 5 years ago

This is a bug, and I will fix it immediately. It is only an output error. Thanks very much for your help. If you do not want to run BitMapperBS again, there are three workarounds:

  1. Use CleanSam in Picard to soft-clip these reads so they don't map off the end of the reference.
  2. For Picard, pass VALIDATION_STRINGENCY=LENIENT or SILENT to ignore this error.
  3. Just delete these two lines (read1 and read2 of CL100089207L1C010R068_396600/2) in the output SAM file.
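
The overhang check behind option 3 can be sketched in a few lines. This is only an illustration, not part of BitMapperBS or Picard: the function names are hypothetical, it works on plain SAM text, and it assumes the `/1` and `/2` mate suffixes seen in this thread. It derives each record's end coordinate from POS and the CIGAR, compares it with the LN value of the matching @SQ header line, and drops both mates of any offending pair.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def reference_span(cigar):
    """Number of reference bases covered by a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)


def base_name(qname):
    """Strip a trailing /1 or /2 so both mates share one key (assumed naming)."""
    return qname[:-2] if qname.endswith(("/1", "/2")) else qname


def drop_overhanging_pairs(sam_lines):
    """Return SAM lines without pairs in which a read ends past its reference."""
    # Reference lengths come from the @SQ header lines (SN = name, LN = length).
    ref_len = {}
    for line in sam_lines:
        if line.startswith("@SQ"):
            tags = dict(f.split(":", 1) for f in line.rstrip("\n").split("\t")[1:])
            ref_len[tags["SN"]] = int(tags["LN"])

    bad = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        qname, rname, pos, cigar = fields[0], fields[2], int(fields[3]), fields[5]
        if rname == "*" or cigar == "*":
            continue  # unmapped record, nothing to check
        end = pos + reference_span(cigar) - 1  # POS is 1-based, end is inclusive
        if end > ref_len[rname]:
            bad.add(base_name(qname))

    return [line for line in sam_lines
            if line.startswith("@") or base_name(line.split("\t")[0]) not in bad]
```

For the read reported above, POS 89997 with a 100M CIGAR ends at 89997 + 100 - 1 = 90096, which is past the 90085 bases of chr9_gl000198_random, so the pair would be dropped.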

PS: BitMapperBS (version 1.0.0.3) can now output results directly in BAM format, which saves a lot of time.

chhylp123 commented 5 years ago

I have just tested Picard successfully with VALIDATION_STRINGENCY=LENIENT or SILENT. If you do not want to run BitMapperBS again, I recommend running Picard this way. In practice, it is extremely rare for a read to hang off the end of one chromosome and onto another.

vanessayang0927 commented 5 years ago

I tried the method you suggested, but it didn't actually remove duplicates. This was the code: [image] Results: [image]

chhylp123 commented 5 years ago

I will try to reproduce your problem. I see that the input alignment file given to Picard is in SAM format. Have you sorted it? The input file of Picard.MarkDuplicates must be coordinate sorted.

I guess you are using an old version of BitMapperBS (version 1.0.0.2), whose output is an unsorted SAM file. If you have sorted it with samtools or Picard, the result may be a sorted BAM file.
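
Whether a SAM file declares coordinate order can be read from the SO tag of its @HD header line. The following is a minimal sketch with a hypothetical function name. It only inspects plain SAM header text (for a BAM file the header would first have to be printed as text), and it reports what the header declares, not whether the records really are in order.

```python
def declared_sort_order(sam_header_lines):
    """Return the SO value from the @HD line, or 'unknown' if it is absent."""
    for line in sam_header_lines:
        if line.startswith("@HD"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    # The SAM specification treats a missing SO tag as 'unknown'.
    return "unknown"
```

A value of `coordinate` is what MarkDuplicates expects; `unsorted` or `unknown` suggests the file should be sorted first.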

vanessayang0927 commented 5 years ago

I used an old version of BitMapperBS (version 1.0.0.0). The output file was a sorted BAM file after sorting with samtools, but it still didn't work. This was the code: [image] Results: [image]

chhylp123 commented 5 years ago

In my experiments, Picard reported "Marking 23291334 records as duplicates" to stdout. Did you see a message like this?

I have tested your command again on my datasets and could not reproduce this problem. Could you please re-align your dataset with the current version of BitMapperBS (version 1.0.0.3)? I note that your dataset includes only about 10 million reads. The new version of BitMapperBS needs only a few minutes to process them and can output a BAM file directly (using the "--bam" and "-o output.bam" options).

If the problem still exists, could you please send me a small part of your dataset? It would be a great help to me.

Thanks very much!

vanessayang0927 commented 5 years ago

In my run, Picard reported "Marking 0 records as duplicates": [image] I will try to install BitMapperBS (version 1.0.0.3). Thank you.

vanessayang0927 commented 5 years ago

I could only find BitMapperBS (version 1.0.0.2) on the website.

chhylp123 commented 5 years ago

Have you downloaded the source code of BitMapperBS? Building from source generates BitMapperBS (1.0.0.3). For example:

  1. git clone https://github.com/chhylp123/BitMapperBS.git (or download it using the "Clone or download" button on GitHub)
  2. cd BitMapperBS
  3. make

The index of BitMapperBS does not need to be rebuilt.

Please do not use "BitMapperBS_v1.0.0.2.zip". That file should be used only if BitMapperBS (1.0.0.3) cannot be compiled from source.

vanessayang0927 commented 5 years ago

I downloaded the source code of BitMapperBS, but could not install it successfully. [image] [image]

chhylp123 commented 5 years ago

I am surprised by this error. Could you please compile "BitMapperBS_v1.0.0.2.zip" first?

In addition, could you please send me information about your CPU and Linux system? This problem may be caused by an old CPU without AVX2.
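
One way to check for AVX2 on an x86 Linux machine is to look for `avx2` in the `flags` lines of /proc/cpuinfo. The following is a small sketch under that assumption; the function name is hypothetical, and other platforms expose CPU features differently.

```python
def has_avx2(cpuinfo_text):
    """True if any 'flags' line in /proc/cpuinfo-style text lists avx2."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Match the whole token so that 'avx' alone does not count.
        if key.strip() == "flags" and "avx2" in value.split():
            return True
    return False


# Typical use on Linux:
#     with open("/proc/cpuinfo") as handle:
#         print(has_avx2(handle.read()))
```

If this returns False, building with `make AVX2=0` (mentioned later in this thread) is the option to try.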

vanessayang0927 commented 5 years ago

I tried to compile "BitMapperBS_v1.0.0.2.zip", but could not install BitMapperBS_v1.0.0.2 either. [image]

chhylp123 commented 5 years ago

Which versions of gcc and Linux are you using? Could you please send me the error messages reported when building BitMapperBS_v1.0.0.2?

vanessayang0927 commented 5 years ago

OK, I will send it to you tomorrow. Thank you for your reply.

chhylp123 commented 5 years ago

Thanks very much.

If possible, could you please enter the BitMapperBS (1.0.0.3) directory and build it with: make AVX2=0

This builds BitMapperBS (1.0.0.3) from source without AVX2.

Thanks again for your help in debugging BitMapperBS. If you like, I would be glad to acknowledge you in my paper.

vanessayang0927 commented 5 years ago

After entering the BitMapperBS (1.0.0.3) directory and running make AVX2=0, it still didn't work. The error: [image] [image]

vanessayang0927 commented 5 years ago

These are my gcc and Linux versions: [image] [image]

chhylp123 commented 5 years ago

Have you installed the zlib, libbz2 and liblzma libraries? Could you please install these libraries first, and then try: make AVX2=0

I have searched for this problem online, and some people say it is a bug in gcc 4.8. So if you still want to build BitMapperBS with AVX2, could you please upgrade gcc? gcc 4.8 is very old now.

vanessayang0927 commented 5 years ago

Hi, I installed BitMapperBS_v1.0.0.3 successfully and tried to remove duplicates from the output BAM file with Picard. Picard still reported "Marking 0 records as duplicates" to stdout, although duplicates do exist. Duplicates in the sorted output BAM file: [image] My code: [image] Results: [image] I will try to upload test.bam to GitHub. PS: I have used Picard to remove duplicates from a BSmap output BAM file without error.

chhylp123 commented 5 years ago

Judging from the alignment results of BitMapperBS, there do not seem to be any problems ... I will wait for your test.bam. Maybe you can upload it to Baidu Cloud or Google Cloud. Thank you very much.

By the way, how did you solve the problems with building BitMapperBS from source? I would like to add a tip to the README.

vanessayang0927 commented 5 years ago

Here is the link to test.bam: https://github.com/vanessayang0927/BitMapperBS/blob/master/test.bam

vanessayang0927 commented 5 years ago

Installing BitMapperBS-1.0.0.3:

    wget https://github.com/chhylp123/BitMapperBS/archive/master.zip
    unzip master.zip
    cd BitMapperBS-master
    export PATH=/path/to/CMake/v3.13.1/bin:$PATH
    make -j 8
    cd ..
    mkdir v1.0.0.3
    cp -a BitMapperBS-master/bitmapperBS v1.0.0.3
    cp -a BitMapperBS-master/psascan v1.0.0.3
    cp -a BitMapperBS-master/htslib_aim v1.0.0.3

chhylp123 commented 5 years ago

Thanks very much for your file. I will test it as soon as possible.

So that means BitMapperBS can now be built successfully? I would like to know how you solved the problems that occurred yesterday. Did you upgrade gcc?

vanessayang0927 commented 5 years ago

BitMapperBS was built successfully with administrator access.

chhylp123 commented 5 years ago

Well, I think I know what happened ... It is a minor problem. I will fix it this evening. Thanks very much.

chhylp123 commented 5 years ago

Thanks for your data. This is a minor problem in BitMapperBS, and I have already fixed it in the new version (1.0.0.4). In addition, reads that map off the end of the reference are now removed from the alignment results in BitMapperBS (1.0.0.4), so you can run Picard.MarkDuplicates normally, without VALIDATION_STRINGENCY=LENIENT or SILENT. I highly recommend trying the new version. Please download the source code of BitMapperBS (1.0.0.4) from GitHub and build it again.

For paired-end reads, this problem was caused by inconsistent QNAMEs (read IDs) of read 1 and read 2 in the BAM file. For example, for one read pair in your dataset, the read IDs of read1 and read2 are "CL100089207L1C015R033_267765/1" and "CL100089207L1C015R033_267765/2", respectively. In the BAM file generated by previous versions of BitMapperBS, the QNAMEs of read1 and read2 were still "CL100089207L1C015R033_267765/1" and "CL100089207L1C015R033_267765/2". Because the two QNAMEs differ, Picard.MarkDuplicates does not treat read1 and read2 as a pair, and so it cannot identify any duplicate read pairs. In the new version of BitMapperBS (1.0.0.4), the QNAME of both read1 and read2 is "CL100089207L1C015R033_267765", so Picard.MarkDuplicates treats them as a pair.
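
The effect of the suffix can be illustrated with a short sketch. It is not BitMapperBS's actual code or Picard's pairing logic, and the function names are hypothetical. It only shows that grouping records by QNAME yields two separate names with the `/1` and `/2` suffixes and one shared name once they are stripped.

```python
from collections import defaultdict


def strip_mate_suffix(read_id):
    """Remove a trailing /1 or /2 from a FASTQ read ID."""
    return read_id[:-2] if read_id.endswith(("/1", "/2")) else read_id


def group_by_qname(qnames):
    """Group record indices by QNAME, as a tool pairing mates by name would."""
    groups = defaultdict(list)
    for index, name in enumerate(qnames):
        groups[name].append(index)
    return dict(groups)


old = ["CL100089207L1C015R033_267765/1", "CL100089207L1C015R033_267765/2"]
new = [strip_mate_suffix(q) for q in old]

print(len(group_by_qname(old)))  # number of distinct names with the suffixes kept
print(len(group_by_qname(new)))  # number of distinct names after stripping them
```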

Thanks again for your great help! I think all of these problems have been solved in BitMapperBS (1.0.0.4).

vanessayang0927 commented 5 years ago

Hi, I installed BitMapperBS (1.0.0.4) successfully and removed duplicates without error. However, the output log file and the output BAM file disagree. The number of uniquely mapped reads in the log file: [image] The number of uniquely mapped reads in the BAM file: [image]

chhylp123 commented 5 years ago

This is a strange problem. I have tested BitMapperBS on several Illumina datasets, and samtools flagstat showed no problem. My guess is that your datasets were not generated on an Illumina platform.

Could you please send me your BAM file or FASTQ files? I would be very grateful if you could help me debug BitMapperBS on reads produced by your platform.

Thanks again for your help!

vanessayang0927 commented 5 years ago

Thanks for your reply! Since this error did not appear in v1.0.0.0, I do not think it is caused by the sequencing platform. PS: I did not see this problem with other mapping software, including Bismark and BSmap.

chhylp123 commented 5 years ago

Do you have WeChat? Maybe we can discuss this on WeChat. I really want to know what happened.

chhylp123 commented 5 years ago

Wechat Account:yangqin0927123

I cannot find yangqin0927123 on WeChat... Is your WeChat ID correct?