stschiff / msmc

Implementation of the multiple sequential markovian coalescent
GNU General Public License v3.0

MSMC error #40

Closed: kentsing closed this issue 4 years ago

kentsing commented 6 years ago

Dear @stschiff, I'm new to msmc2, and when I tried to run it I got this error:


Haplotype index exceeds number of haplotypes in datafile

and also:

error in parsing command line: object.Exception@model/data.d(189): chromosomes must all be the same within one file (sorry)

Can one file only contain data for a single scaffold/chromosome?

Thanks

stschiff commented 6 years ago

Hi,

What's not to understand about these error messages? The first one suggests that you are using an index number in the -I flag that exceeds the number of haplotypes that are actually in your file. Note that indexing here is 0-based. The second error suggests that you have put multiple chromosomes into one file, which is not allowed. You need to split the chromosomes into separate files and give all those files to MSMC on the command line.

Best, Stephan
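
For the second error, splitting a mixed file into one file per chromosome can be sketched in a few lines of Python. This is only a minimal illustration, not part of msmc or msmc-tools: the function names `split_by_chromosome` and `write_per_chromosome` are hypothetical, and the sketch assumes a whitespace-delimited input whose first column is the chromosome name.

```python
from collections import defaultdict


def split_by_chromosome(lines):
    """Group input lines by the chromosome name in their first column.

    Assumption: whitespace-delimited lines, chromosome name in column 1.
    """
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        chrom = line.split()[0]
        groups[chrom].append(line)
    return dict(groups)


def write_per_chromosome(in_path, out_prefix):
    """Write one output file per chromosome and return the paths written.

    Assumption: chromosome names are safe to use inside file names.
    """
    with open(in_path) as handle:
        groups = split_by_chromosome(handle)
    paths = []
    for chrom, rows in groups.items():
        out_path = f"{out_prefix}{chrom}.txt"
        with open(out_path, "w") as out:
            out.write("\n".join(rows) + "\n")
        paths.append(out_path)
    return paths
```

The resulting per-chromosome files would then all be listed on the MSMC command line, as described above.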

kentsing commented 6 years ago

Hi, thanks for your reply. I finally solved the problem. Thanks again!


zillurbmb51 commented 5 years ago

Hello there, I am having almost the same problem. In my case I have a separate file for each chromosome. Whenever I use the -r and -I flags, I get the following error:

core.exception.RangeError@model/data.d(188): Range violation
??:? _d_arrayboundsp [0x5b5ad2]

Any help? My sample represents a single population. Should I use -I 0,1,4,5? Whatever values I give for -r and -I, I get the same error. Without -r and -I the program runs without any error. Best regards, Zillur

kentsing commented 5 years ago

Maybe you should remove the indels and keep only the SNPs.


zillurbmb51 commented 5 years ago

Thank you very much. I have only SNPs in my data, no indels. However, if I use -I 0,1 it runs without any error. How can I identify which values I need for my dataset? Best regards, Zillur

stschiff commented 5 years ago

Hi, please send me a small dataset and a command line that reproducibly gives this error.

Thanks, Stephan


zillurbmb51 commented 5 years ago

Thank you very much. My command lines were:

samtools mpileup -q 20 -Q 20 -C 50 -u -r Pf3D7_01_v3 -f pfal.fa SRR3305693_NoDup_addrplced.bam | bcftools call -c -V indels | /gondor/zillur/thesis/psmc/msmc-tools/bamCaller.py 331.781 chr1.mask.bed.gz | gzip -c > chr1.vcf.gz
/gondor/zillur/thesis/psmc/msmc-tools/generate_multihetsep.py --mask=chr1.mask.bed.gz chr1.vcf.gz > chr1.msmc.input.txt
/gondor/zillur/msmc2/build/release/msmc2 -t 16 -r 0.0115 -I 0,1,4,5 -o test1 chr1.msmc.input.txt

If I run without the -I flag, or with -I 0,1, it works.

Just to clarify, the value of -r should be mutation rate/recombination rate, right? Best regards, Zillur

Attachments: chr1.mask.bed.gz, chr1.msmc.input.txt, chr1.vcf.gz

rillaxy commented 5 years ago


Hi Zillur, I am getting the same error as you. How did you solve this problem? Looking forward to hearing from you. Thank you. Rilla

zillurbmb51 commented 5 years ago

Hello Rilla, could you share part of your chr*.msmc.input.txt? Did you try with and without the -I flag? Running without -I was OK for me. Best, Zillur

rillaxy commented 5 years ago

Hello Zillur, I am sending you part of my chr*.msmc.input.txt. These are unphased diploid genomes. The problem is that it runs with -I 0,1 or without the -I flag, but it fails with -I 0,1,4,5. Why? Thanks for your reply. Below are my commands and the error output. Looking forward to hearing from you. Rilla

msmc2 -r 0.00157 -I 0,1 -o msmc.test msmc-input-chr* (success)
msmc2 -r 0.00157 -o msmc.test msmc-input-chr* (success)
msmc2 -r 0.00157 -I 0,1,4,5 -o msmc.test msmc-input-chr* (wrong)

core.exception.RangeError@model/data.d(188): Range violation

??:? _d_arrayboundsp [0x5cfe25]
??:? bool model.data.has_missing_data(const(char[][]), ulong[2]) [0x560f44]
??:? model.data.SegSite_t[][] model.data.readSegSites(immutable(char)[], ulong[2][], bool) [0x560804]
??:? model.data.SegSite_t[][] msmc2.readDataFromFiles(immutable(char)[][], ulong[2][], bool) [0x585295]
??:? void msmc2.parseCommandLine(immutable(char)[][]) [0x5837e2]
??:? _Dmain [0x5830f7]

Attachments: msmc-input-chr01.txt, msmc-input-chr02.txt, msmc-input-chr03.txt

zillurbmb51 commented 5 years ago

Hello Rilla, I guess you have only two characters in the SNP column (AT, CG, etc.), which means the file holds only two haplotypes. Maybe that's why it works with 0,1 but not with any index greater than 1. Best, Zillur
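
The check Zillur describes can be sketched in Python. This is a rough illustration under stated assumptions, not code from msmc: it assumes the input layout produced by generate_multihetsep.py, where the fourth whitespace-separated column holds one allele character per haplotype and alternative phasings are separated by commas. The function names `count_haplotypes` and `invalid_indices` are hypothetical.

```python
def count_haplotypes(line):
    """Return the number of haplotypes encoded in one input line.

    Assumption: column 4 is the allele string, one character per haplotype,
    with alternative phasings separated by commas (e.g. "ACCA,CAAC").
    """
    fields = line.split()
    alleles = fields[3]
    first_pattern = alleles.split(",")[0]
    return len(first_pattern)


def invalid_indices(line, indices):
    """Return the requested 0-based indices that fall outside the file's haplotypes."""
    n = count_haplotypes(line)
    return [i for i in indices if i < 0 or i >= n]


# A line with a two-character allele column, as in the case discussed here.
example = "chr1\t1000\t250\tAT"
print(count_haplotypes(example))               # number of haplotypes in the file
print(invalid_indices(example, [0, 1, 4, 5]))  # indices that would be out of range
```

With a two-character allele column only indices 0 and 1 are in range, which is consistent with -I 0,1 running and -I 0,1,4,5 failing.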

rillaxy commented 5 years ago

Hello Zillur, I get it. Thank you very much. Best, Rilla