esrud / GONE

GONE: Scripts, programs and an example data set

Unable to run GONe with >70,000 SNPs per chromosome #40

Open auNathalie opened 7 months ago

auNathalie commented 7 months ago

Dear Armando/GONe team (yet again 😊),

I'm having an issue trying to run GONe with more than 70,000 SNPs per chromosome.

GONe runs perfectly on my ped and map files when I set maxNSNP=70000 (or lower). In that case the OUTPUT_file reports sample size (individuals; corrected for zeroes) = 10.000000, which matches NIND(real sample)=10.

When the parameter maxNSNP is increased to 75000, 80000, or 90000 in the input_parameters file, the "INPUT FOR GONE" section of the OUTPUT_file is still created, but the sample size (individuals; corrected for zeroes) is set to 0.000000. Giving the job more memory does not seem to help. The run then either fails with the error "Specify a sample size larger than 1" in the .out file, or reports no errors in the .out file but produces neither an Ne nor a d2 file. I should add that the maximum number of SNPs on a chromosome in my samples is approx. 114,000.

I will put OUTPUT_file examples in at the end.

• Do you know what could be causing this?
• And/or is there potentially any way to fix it?

Otherwise, I will be satisfied with running a maximum of 70,000 SNPs.

Thank you for creating this awesome tool. And thank you for all your help.

Best regards, Nathalie Ibsen

Here are two examples:


maxNSNP=70000


CHROMOSOME 1
NIND(real sample)=10
NSNP=69757
NSNPcalculations=64152
NSNP+2alleles=0
NSNP_zeroes=0
NSNP_monomorphic=5605
NIND_corrected=10.000000
freq_MAF=0.050000
F_dev_HW (sample)=-0.054569
F_dev_HW (pop)=-0.001943
No genetic distances; using 6.100000 cM per Mb

(……) INPUT FOR GONE

2 Phase (0: pseudohaploids; 1: known phase; 2: unknown phase)
10.000000 sample size (individuals; corrected for zeroes)


maxNSNP=75000


CHROMOSOME 1
NIND(real sample)=10
NSNP=74712
NSNPcalculations=68690
NSNP+2alleles=0
NSNP_zeroes=0
NSNP_monomorphic=6022
NIND_corrected=-8.205774
freq_MAF=0.050000
F_dev_HW (sample)=-0.054474
F_dev_HW (pop)=-0.001847
No genetic distances; using 6.100000 cM per Mb

(……) INPUT FOR GONE

2 Phase (0: pseudohaploids; 1: known phase; 2: unknown phase)
0.000000 sample size (individuals; corrected for zeroes)

armando-caballero commented 7 months ago

Nathalie, it may be a problem with matrix sizes, etc. But you do not need more than 50,000 SNPs at all for Ne estimation. Be happy with 50,000 ... Armando.

auNathalie commented 7 months ago

Thank you so much for your quick reply Armando. I'll be happy with it.

Best, Nathalie
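
For anyone hitting the same limit: one way to stay at or below the suggested 50,000 SNPs per chromosome is to thin the map/ped pair before running GONE. The following is only a minimal sketch, not part of GONE. It assumes the standard PLINK text layout (a 4-column .map file with the chromosome in the first column, and a .ped file with 6 leading columns followed by two allele columns per SNP), and the function names `choose_snps`, `thin_ped_line`, and `thin` are hypothetical.

```python
import random
from collections import defaultdict


def choose_snps(map_lines, max_per_chrom, seed=0):
    """Return sorted indices of SNPs to keep, at most max_per_chrom per chromosome.

    Assumes every entry in map_lines is a non-empty PLINK .map row
    whose first whitespace-separated field is the chromosome.
    """
    by_chrom = defaultdict(list)
    for idx, line in enumerate(map_lines):
        by_chrom[line.split()[0]].append(idx)

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    keep = []
    for idxs in by_chrom.values():
        if len(idxs) > max_per_chrom:
            idxs = rng.sample(idxs, max_per_chrom)
        keep.extend(idxs)
    return sorted(keep)  # sorting restores the original SNP order


def thin_ped_line(ped_line, keep):
    """Keep the 6 leading .ped columns plus the two alleles of each kept SNP."""
    fields = ped_line.split()
    out = fields[:6]
    alleles = fields[6:]
    for i in keep:
        out.extend(alleles[2 * i:2 * i + 2])
    return " ".join(out)


def thin(map_lines, ped_lines, max_per_chrom, seed=0):
    """Thin a map/ped pair (given as lists of text lines) consistently."""
    keep = choose_snps(map_lines, max_per_chrom, seed)
    new_map = [map_lines[i] for i in keep]
    new_ped = [thin_ped_line(line, keep) for line in ped_lines]
    return new_map, new_ped
```

To use it, read the two files into lists of non-empty lines, call `thin(map_lines, ped_lines, 50000)`, and write the returned lists to new .map and .ped files. Random thinning is an assumption made here for simplicity; evenly spaced sampling along each chromosome would be an equally plausible choice.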