OpenGene / gencore

Generate duplex/single consensus reads to reduce sequencing noises and remove duplications
MIT License

output bam problem #16

Open xin8you opened 4 years ago

xin8you commented 4 years ago

The output BAM file contains an unsorted region, although the input file is a sorted BAM. I have run dozens of samples, and only this sample shows the problem. [image]

cmd: gencore -i XYD0160643_L1.raw_data_mapped.bam -o XYD0160643_L1.consensus.bam -r human_g1k_v37.fasta -b target.bed -j HO0330-XYD0160643_L1.json -h HO0330-XYD0160643_L1.html
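To make a report like this reproducible, it helps to locate the first record that breaks coordinate order. The following is a minimal sketch, not part of gencore: it assumes the BAM has been converted to SAM text (for example with `samtools view -h`), and the function name `first_unsorted_record` is hypothetical. It ranks references by their order in the `@SQ` header lines and treats unmapped reads (`RNAME` of `*`) as sorting last.

```python
from typing import Dict, Iterable, Optional, Tuple


def first_unsorted_record(sam_lines: Iterable[str]) -> Optional[Tuple[int, str]]:
    """Return (record_number, read_name) of the first alignment record that
    breaks coordinate order, or None if every record is in order.

    record_number counts alignment records only (header lines are skipped),
    starting at 1.
    """
    ref_rank: Dict[str, int] = {}  # reference name -> rank from @SQ order
    prev_key = None
    record_no = 0

    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            # Header line: collect reference order from @SQ SN: fields.
            if line.startswith("@SQ"):
                for field in line.split("\t")[1:]:
                    if field.startswith("SN:"):
                        ref_rank.setdefault(field[3:], len(ref_rank))
            continue

        record_no += 1
        cols = line.split("\t")
        qname, rname, pos = cols[0], cols[2], int(cols[3])

        if rname == "*":
            # Assumption: unmapped reads are placed after all mapped reads.
            rank = float("inf")
        else:
            # References missing from the header are ranked by first appearance.
            rank = ref_rank.setdefault(rname, len(ref_rank))

        key = (rank, pos)
        if prev_key is not None and key < prev_key:
            return record_no, qname
        prev_key = key

    return None
```

One possible use is to pipe `samtools view -h XYD0160643_L1.consensus.bam` into a small script that passes `sys.stdin` to this function and prints the result; the reported read name can then be looked up in both the input and the output BAM.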

antonkulaga commented 4 years ago

I second that. If the input BAM is sorted, the output of gencore should also be sorted. Otherwise I will have to sort it again, which kills any runtime benefit that gencore brings compared with Picard, because sorting is very slow and having to sort once more adds many hours.

xin8you commented 4 years ago

I checked again, and the input BAM was sorted. I can share the data so you can test for bugs. The cause is probably some other problem; the input does not need to be re-sorted.

xin8you commented 4 years ago

hi:

    This is the test data, please check.

Best wishes to you!

------------------ Original message ------------------ From: "Anton Kulaga"<notifications@github.com>; Sent: Friday, April 17, 2020, 10:38 AM; To: "OpenGene/gencore"<gencore@noreply.github.com>; Cc: "zhengxin"<919178663@qq.com>;"Author"<author@noreply.github.com>; Subject: Re: [OpenGene/gencore] output bam problem (#16)

I second that. If the input BAM is sorted, the output of gencore should also be sorted. Otherwise I will have to sort it again, which kills any benefit that gencore brings compared with Picard.


Large attachments sent from QQ Mail

HO0330-XYD0160643_L1.raw_data_mapped.bam (701.69M, expires May 17, 2020, 11:14) Download page: http://mail.qq.com/cgi-bin/ftnExs_download?k=3c3033302d8c1de2c6bdfa0612650b195454520151560d524c07510600480d0e05541e080551091b0251565506000a5202030500344d397e2e0000030448616f2500020604530d053e7c021e46044e69055147516b0858461155571e560454365c&t=exs_ftn_download&code=a0304e96

HO0330-XYD0160643_L1.raw_data_mapped.bam.bai (4.16M, no expiration) Download page: http://mail.qq.com/cgi-bin/ftnExs_download?k=52376265b547dbcd94baab534339541d01560654530e07061e5404575c14520005004f5d0409521f0a550001560f5002525657506515667a7c07515655143e6b77075353550f52016c7b534b1758116d575616043a5407424352064b07580b1c51560b6558&t=exs_ftn_download&code=37bee9f2

sfchen commented 4 years ago

The output of gencore is sorted.

If your output is unsorted, please check your gencore version and confirm that it is the latest.

If it is still unsorted, upload the data here and I will test it.

If this problem is addressed, please close it.

xin8you commented 4 years ago

OK. I will check the version of gencore.

antonkulaga commented 4 years ago

@sfchen I've just checked the docs, and the README actually says "gencore accepts a sorted BAM/SAM with its corresponding reference fasta as input, and outputs an unsorted BAM/SAM." So the README itself describes unsorted output from sorted input. Judging from your comment, you updated gencore to produce sorted output in the latest version, didn't you? Right now I am using my own Docker container ( https://github.com/antonkulaga/biocontainers/blob/master/quality-control/gencore/Dockerfile ), where I use http://opengene.org/gencore/gencore . Is the version there the newest one, or should I build from the GitHub master branch instead? The version in my Docker container reports 0.14.0, while the latest official release (on both GitHub and Anaconda Cloud) is 0.13.0, so I am somewhat confused.