esrud / GONE

GONE: Scripts, programs and an example data set

Error about LD calculation #28

Open liuze955 opened 1 year ago

liuze955 commented 1 year ago

Hi, I am trying to use GONE to estimate the recent demographic history of sheep.

My input is a ped file and a map file, which contain approximately 20 million loci. The input files are shown below. For ease of reading, the lines are widely spaced here, but there are no blank lines in the actual files.

`OL.ped D141010046 D141010046 0 0 0 -9 A A T T T T C C

L11ci L11ci 0 0 0 -9 A A T T T T C C A A G G G A G

L11xiong L11xiong 0 0 0 -9 A A T T T T C C A A G ......`

`OL.map 1 1_7757 0 7757

1 1_7761 0 7761

1 1_7874 0 7874

1 1_7880 0 7880

1 1_7997 0 7997 ... `

My INPUT_PARAMETERS_FILE uses all default parameters, and the command I submitted is sh script_GONE.sh OL. Here is my operation log: `DIVIDE .ped AND .map FILES IN CHROMOSOMES

RUNNING ANALYSIS OF CHROMOSOMES ...

xargs: bash: terminated by signal 11

xargs: bash: terminated by signal 11

CHROMOSOME ANALYSES took 1 seconds

cat: outfileLD11: No such file or directory

sed: can't read outfileLD11: No such file or directory

cat: parameters11: No such file or directory

cat: outfileLD12: No such file or directory

sed: can't read outfileLD12: No such file or directory

cat: parameters12: No such file or directory`

When I look in the TEMPORARY_FILES folder, I can see ped and map files for each chromosome, but the outfileLD* files all have size 0. How can I solve this problem?

Thank you very much!

liuze955 commented 1 year ago

Additionally, when I run the example file with the same parameters, I obtain results normally.

armando-caballero commented 1 year ago

As indicated in the tutorial: Do not try to run data sets with more than, say, ten million SNPs (the maximum number of SNPs per chromosome is 1 million), as the software may crash, and the recommended maximum number of SNPs used per chromosome in each analysis is 50,000. Best wishes, Armando.
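The limits quoted in this reply can be checked before launching GONE. The following is a minimal Python sketch, not part of GONE itself: the function names are hypothetical, and it assumes a standard PLINK .map file in which the chromosome is the first whitespace-separated column.

```python
from collections import Counter

# Limits as quoted in the reply above (assumed to apply to the whole .map file).
MAX_SNPS_PER_CHROM = 1_000_000
MAX_SNPS_TOTAL = 10_000_000


def count_snps_per_chrom(map_lines):
    """Count SNPs per chromosome from PLINK .map lines (chromosome is column 1)."""
    counts = Counter()
    for line in map_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts


def check_limits(counts, max_per_chrom=MAX_SNPS_PER_CHROM, max_total=MAX_SNPS_TOTAL):
    """Return a list of problems; an empty list means the data are within limits."""
    problems = []
    total = sum(counts.values())
    if total > max_total:
        problems.append(f"total SNPs {total} exceeds {max_total}")
    for chrom, n in sorted(counts.items()):
        if n > max_per_chrom:
            problems.append(f"chromosome {chrom}: {n} SNPs exceeds {max_per_chrom}")
    return problems
```

For a file such as OL.map, one could call `check_limits(count_snps_per_chrom(open("OL.map")))` and print whatever it returns. A data set of roughly 20 million loci would be flagged on the total alone.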

liuze955 commented 1 year ago

Hi Armando, thanks for your reply. The tutorial says: "If the value is larger, however, a random sample of 50,000 SNPs will be used." I thought the software would automatically sample sites.

My data have already been filtered for missing rate and MAF. The only way I could see to substantially reduce the number of sites was LD pruning. When I used PLINK with an LD-pruning threshold of 0.2, the OL population retained 3 million sites, and GONE then ran normally.

However, GONE's estimates are based on LD, so will LD pruning affect the results? Do you have any suggestions for filtering the data? Sincerely, Liuze.

armando-caballero commented 1 year ago

GONE takes a random sample of 50,000 SNPs per chromosome, but this is independent of the fact that the number of SNPs per chromosome cannot be larger than 1M and the total number of SNPs cannot be larger than 10M. Do not prune by LD; just remove SNPs at random from your extensive data. Armando.
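One way to remove SNPs at random, as suggested here, is sketched below in Python. This is an illustration under stated assumptions, not GONE's own sampling code: the function names are hypothetical, the per-chromosome and total caps are passed in by the caller, and the .ped lines are assumed to be whitespace-separated with six leading columns followed by two alleles per SNP. PLINK 1.9 also documents `--thin` and `--thin-count` options for random thinning, which are likely more practical for files of this size.

```python
import random


def choose_snps_to_keep(chroms, max_per_chrom, max_total, seed=0):
    """Pick SNP indices to keep, sampled uniformly at random.

    chroms: chromosome label of each SNP, in .map order.
    Returns a sorted list of 0-based SNP indices.
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    by_chrom = {}
    for idx, chrom in enumerate(chroms):
        by_chrom.setdefault(chrom, []).append(idx)

    kept = []
    for indices in by_chrom.values():
        # Enforce the per-chromosome cap first.
        if len(indices) > max_per_chrom:
            indices = rng.sample(indices, max_per_chrom)
        kept.extend(indices)

    # Then enforce the genome-wide cap.
    if len(kept) > max_total:
        kept = rng.sample(kept, max_total)
    return sorted(kept)


def thin_ped_line(line, keep):
    """Keep the 6 leading .ped columns plus both alleles of each kept SNP."""
    fields = line.split()
    out = fields[:6]
    for idx in keep:
        out.extend(fields[6 + 2 * idx: 8 + 2 * idx])
    return " ".join(out)
```

The same `keep` list would be applied to every row of the .ped file and to the matching lines of the .map file, so that the two files stay consistent. Unlike LD pruning, the selection does not depend on the genotypes.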

liuze955 commented 1 year ago

Thank you for your suggestion, it is very helpful to me.