Open elizeng opened 1 year ago
If you show me the input files, I may be able to tell you what is going wrong. Best wishes, Armando.
Try clearing your folder of old output files from GONE. It seems to have errors when certain old files are still around.
Hi @armando-caballero. I am having a similar issue. I also changed the chromosome names, but now I am getting a similar error: GONE runs into a format error in outfileLD.
I generated my PLINK files using VCFTools, and changed the chromosome names into a sequence from 1 to the number of "chromosomes" (= scaffolds). There are ~10K SNPs from 322 individuals on 42 chromosomes.
Any suggestions?
@elizeng were you able to resolve your issue?
Here's my error information:
DIVIDE .ped AND .map FILES IN CHROMOSOMES
RUNNING ANALYSIS OF CHROMOSOMES ...
CHROMOSOME ANALYSES took 0 seconds
Running GONE
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Error opening file outfileLD_TEMP/outfileLD_1_GONE_Nebest
END OF ALL PROCESSES
GONE run took 0 seconds
END OF ANALYSES
mv: cannot stat 'outfileLD_Ne_estimates': No such file or directory
mv: cannot stat 'outfileLD_d2_sample': No such file or directory
I cannot tell you unless I see your input files. If you want to send them to me (better a small sample with the same problem) I will have a look. Armando.
Hi Armando. They're attached in the previous comment as example-data.zip. Also here: example-data.zip
There are two issues: (1) Your ped file should be: LICA026001 LICA026001 0 0 0 -9 A A .... (2) Even so, it does not run, and the reason is that you have too many zeros. If more than half of the individuals have missing data for a SNP, that SNP is not considered, and this happens with all of them. For example, in the PARAMETERS file you can see for chromosome 1 that, of 1293 SNPs, all are counted as zeroes (more than half of the individuals have missing data):
CHROMOSOME 1 NIND(real sample)=322 NSNP=1293 NSNPcalculations=0 NSNP+2alleles=0 NSNP_zeroes=1293 NSNP_monomorphic=0
etc. for all chromosomes
Armando
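A quick way to run the same check on every chromosome is to parse the summary lines from the PARAMETERS file. The following is a minimal Python sketch, assuming each line has the layout of the one quoted above. The function names are hypothetical, and the reading of `NSNPcalculations=0` as "no usable SNPs" is inferred from Armando's explanation, not from the GONE documentation.

```python
import re

def parse_chromosome_summary(line):
    """Parse one 'CHROMOSOME ...' summary line into a dict of integers.

    Assumes the layout quoted in this thread, e.g.
    'CHROMOSOME 1 NIND(real sample)=322 NSNP=1293 NSNPcalculations=0 ...'.
    The parenthetical after NIND is dropped from the key name.
    """
    fields = {}
    chrom = re.search(r"CHROMOSOME\s+(\d+)", line)
    if chrom:
        fields["CHROMOSOME"] = int(chrom.group(1))
    # key=value pairs; keys may contain '+' or '_' and an optional '(...)' suffix
    for key, value in re.findall(r"([A-Za-z0-9_+]+)(?:\([^)]*\))?=(\d+)", line):
        fields[key] = int(value)
    return fields

def all_snps_excluded(fields):
    """True when no SNP on this chromosome was used in the calculations
    (assumption: that is what NSNPcalculations=0 indicates)."""
    return fields.get("NSNPcalculations", 0) == 0
```

Applied to the line quoted above, `parse_chromosome_summary` should report `NSNP` and `NSNP_zeroes` both equal to 1293, and `all_snps_excluded` should flag the chromosome.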
Thanks Armando, that got it working! Following this post about RADseq data, I had included most of my SNPs, but I see now that I need to limit them somewhat because of missing data. After subsetting a bit, GONE seems to work!
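For anyone wanting to see which SNPs would be affected before running GONE, here is a minimal Python sketch of the rule Armando describes (a SNP is dropped when more than half of the individuals have missing data). It assumes a standard whitespace-delimited PLINK `.ped` layout with six leading columns and `0` as the missing-allele code. The function names and the `max_missing` parameter are hypothetical, and this is not GONE's own code.

```python
def snp_missing_fractions(ped_lines, n_leading=6, missing="0"):
    """Return, for each SNP, the fraction of individuals with a missing genotype.

    Assumes each line is: 6 leading columns, then two allele columns per SNP.
    A genotype counts as missing if either allele equals the `missing` code.
    """
    missing_counts = None
    n_individuals = 0
    for line in ped_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        alleles = fields[n_leading:]
        if len(alleles) % 2 != 0:
            raise ValueError("odd number of allele columns")
        n_snps = len(alleles) // 2
        if missing_counts is None:
            missing_counts = [0] * n_snps
        elif n_snps != len(missing_counts):
            raise ValueError("individuals have different numbers of SNPs")
        for i in range(n_snps):
            if alleles[2 * i] == missing or alleles[2 * i + 1] == missing:
                missing_counts[i] += 1
        n_individuals += 1
    if n_individuals == 0:
        return []
    return [count / n_individuals for count in missing_counts]

def snps_to_keep(fractions, max_missing=0.5):
    """Indices of SNPs whose missing fraction does not exceed max_missing."""
    return [i for i, frac in enumerate(fractions) if frac <= max_missing]
```

The sketch only reports indices. The actual subsetting would normally be done with a dedicated tool, for example PLINK's `--geno` missingness filter.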
However, I still seem to get the large spike in Ne around 180 or so generations. This occurs even if I decrease hc to 0.025. Do you have any suggestions for how to evaluate the efficacy of the runs?
I'm running each subpopulation of the species separately, so there shouldn't be strong population structure among the individuals. There might be some inbreeding though. Might that be causing the observed crash and then rebound in Ne? I would expect a crash without recovery sometime between 10 and ~100 generations ago.
Here's the output with hc = 0.025. It is very similar to hc = 0.05
It looks ugly. I cannot tell you. The limiting time for reliability should be no more than 200 generations, and 180 is very close to that. Perhaps you should only focus on the last 50 generations or so. But the sharp decline in the last 4 generations can be real or an artefact of sampling, not necessarily substructure in the population but also non-random sampling .... Armando.
Hi @alexkrohn @armando-caballero, I have the same issue with too many zeros in my example.ped, but I don't know how to solve this problem. Should I delete all the zeros? Here's my error information:
DIVIDE .ped AND .map FILES IN CHROMOSOMES
RUNNING ANALYSIS OF CHROMOSOMES ...
CHROMOSOME ANALYSES took 22 seconds
Running GONE
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Format error in file outfileLD
Error opening file outfileLD_TEMP/outfileLD_1_GONE_Nebest
END OF ALL PROCESSES
GONE run took 0 seconds
END OF ANALYSES
mv: cannot stat 'outfileLD_Ne_estimates': No such file or directory
mv: cannot stat 'outfileLD_d2_sample': No such file or directory
I can only reply if I see an example of your files. Best wishes, Armando.
Response to xiaocanna:
There are chromosomes with only one or two SNPs, or just a few. If you run, say, the first 10 chromosomes, it works, though the resulting Ne estimates are in the millions.
maxNCHROM=10 ### Maximum number of chromosomes to be analysed (-99 = all chromosomes)
Armando.
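To find which chromosomes carry only a handful of SNPs, one can tally the first column of the `.map` file. Below is a minimal Python sketch, assuming a standard whitespace-delimited PLINK `.map` file with the chromosome in the first column. The function names and the `min_snps` threshold are hypothetical, and the thread does not say how many SNPs per chromosome GONE needs.

```python
from collections import Counter

def snps_per_chromosome(map_lines):
    """Count SNPs per chromosome from .map lines (chromosome = first column)."""
    counts = Counter()
    for line in map_lines:
        fields = line.split()
        if fields:  # ignore blank lines
            counts[fields[0]] += 1
    return counts

def sparse_chromosomes(counts, min_snps=10):
    """Chromosome labels with fewer than min_snps SNPs.

    min_snps is an arbitrary illustrative threshold, not a GONE requirement.
    """
    return sorted(chrom for chrom, n in counts.items() if n < min_snps)
```

For example, `sparse_chromosomes(snps_per_chromosome(open("example.map")))` would list the scaffolds worth excluding or merging before a run, where `example.map` stands in for your own file.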
I am trying to run GONE with 10k and 100k SNPs for a non-human dataset, but seem to be getting the following error.
The example dataset works fine. The only difference I can find is that the example dataset has chromosome numbers with just 1 number in the map file:
While my map file has chromosome numbers in a longer string. Is this potentially a problem? Would it be possible to accommodate more values in the chromosome column?
If anyone is able to suggest how I can rename the chromosome numbers without the whole string, I would be very thankful!
Much appreciated!
Edit: I used awk to change the chromosome names to just numbers within 200, and it now does not have the error that the chromosome can't be read. Although I am met with a new error, while running it on 10k SNPs and just 25 chromosomes that the outfiles can't be read.
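The exact awk command is not shown in the post, so the following is only a Python sketch of the same idea: replacing long scaffold names in the first column of a `.map` file with consecutive integers, in order of first appearance. It assumes a whitespace-delimited `.map` file, and the function name is hypothetical.

```python
def renumber_chromosomes(map_lines):
    """Replace chromosome labels (first column) with 1, 2, 3, ...

    Labels are numbered in order of first appearance. Returns the rewritten
    lines (tab-separated) and the old-label -> new-number mapping.
    """
    mapping = {}
    renumbered = []
    for line in map_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # Assign the next integer the first time a label is seen
        new_id = mapping.setdefault(fields[0], len(mapping) + 1)
        renumbered.append("\t".join([str(new_id)] + fields[1:]))
    return renumbered, mapping
```

Keeping the returned mapping makes it possible to translate results back to the original scaffold names. Whether the number of distinct labels stays within the 200 mentioned above can be checked with `len(mapping)`.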