jiabowang / GAPIT

Genome Association Predict Integrate Tools
HapMap format #160

Open Lamoumni18 opened 2 weeks ago

Lamoumni18 commented 2 weeks ago

First, I would like to point out that page 6 of the GAPIT manual states that "GAPIT accepts multiple input data formats, including both numeric, hapmap, and PLINK genotype formats". However, later sections indicate that only HapMap and numeric formats are accepted. I may be missing something here, but I wanted to mention it. Next, I converted my data to HapMap format, loaded it successfully, and checked that the format matched the example shown in the manual. However, when running an initial analysis, I encountered the following error, which I have not been able to resolve:

Test_1 <- GAPIT( Y = myphenox,
                G = mygeno0,
                PCA.total = 3,
                model = "MLM",
                KI = K_WG)
[1] "--------------------- Welcome to GAPIT ----------------------------"
[1] "MLM"
[1] "--------------------Processing traits----------------------------------"
[1] "Phenotype provided!"
[1] "The 1 model in all."
[1] "MLM"
[1] "GAPIT.DP in process..."
[1] "Converting genotype..."
[1] "Converting HapMap format to numerical under model of Middle"
[1] "Perform numericalization"
[1] "Succesfuly finished converting HapMap which has bits of 1"
[1] "Converting genotype done."
[1] "GAPIT will filter marker with MAF setting !!"
[1] "The markers will be filtered by SNP.MAF: 0"
maf_index
  TRUE 
555646 
[1] "Plotting Kinship"
[1] "Creating heat map for kinship..."
[1] "Kinship heat map PDF created!"
[1] "Kinship NJ TREE PDF created!"
[1] "Calling prcomp..."
[1] "Creating PCA graphs..."
[1] "Joining taxa..."
[1] "Exporting PCs..."
[1] "PC created"
[1] "Filting marker for GAPIT.Genotype.View function ..."
[1] "The average distance between markers are ..."
[1] 0.013 0.039 0.013 0.079    NA 0.061
[1] "GAPIT.Genotype.View . pdfs generate.successfully!"
[1] 257   4
[1] "GAPIT.DP accomplished successfully for multiple traits. Results are saved"
[1] "Processing trait: FFD_adj_BLUPs"
[1] "GAPIT.Phenotype.View in press..."
[1] "GAPIT.Phenotype.View output pdf has been generated successfully!"
[1] "--------------------Phenotype and Genotype ----------------------------------"
[1] "Zhang"
[1] TRUE
[1] "There are  1  traits in phenotype data."
[1] "There are  257  individuals in phenotype data."
[1] "There are  555646  markers in genotype data."
[1] "Phenotype and Genotype are test OK !!"
[1] "--------------------GAPIT Logical Done----------------------------------"
[1] "GAPIT.IC in process..."
[1] "There is 0 Covarinces."
[1] "There are 0 common individuals in genotype , phenotype and CV files."
[1] "The dimension of total CV is "
[1] 0 4
[1] "GAPIT.IC accomplished successfully for multiple traits. Results are saved"
[1] "GAPIT.SS in process..."
[1] "GAPIT will be into GWAS approach..."
Error in data.frame(..., check.names = FALSE) : 
  arguments imply differing number of rows: 1, 0
In addition: There were 50 or more warnings (use warnings() to see the first 50)
warnings()
Warning messages:
1: 'memory.size()' is no longer supported
2: 'memory.size()' is no longer supported
3: 'memory.size()' is no longer supported
4: 'memory.size()' is no longer supported
5: 'memory.size()' is no longer supported
6: 'memory.size()' is no longer supported
7: 'memory.size()' is no longer supported
8: In order(as.numeric(as.character(chor_taxa))) : NAs introduced by coercion
9: 'memory.size()' is no longer supported
10: 'memory.size()' is no longer supported
11: 'memory.size()' is no longer supported
12: 'memory.size()' is no longer supported
13: 'memory.size()' is no longer supported
14: 'memory.size()' is no longer supported
15: 'memory.size()' is no longer supported
16: 'memory.size()' is no longer supported
17: 'memory.size()' is no longer supported
18: 'memory.size()' is no longer supported
19: In cor(a, b) : the standard deviation is zero
20: In cor(a, b) : the standard deviation is zero
21: In cor(a, b) : the standard deviation is zero
22: In cor(a, b) : the standard deviation is zero
23: In cor(a, b) : the standard deviation is zero
24: In cor(a, b) : the standard deviation is zero
25: In cor(a, b) : the standard deviation is zero
26: In cor(a, b) : the standard deviation is zero
27: In cor(a, b) : the standard deviation is zero
28: In cor(a, b) : the standard deviation is zero
29: In cor(a, b) : the standard deviation is zero
30: In cor(a, b) : the standard deviation is zero
31: In cor(a, b) : the standard deviation is zero
32: In cor(a, b) : the standard deviation is zero
33: In cor(a, b) : the standard deviation is zero
34: In cor(a, b) : the standard deviation is zero
35: In cor(a, b) : the standard deviation is zero
36: In cor(a, b) : the standard deviation is zero
37: In cor(a, b) : the standard deviation is zero
38: In cor(a, b) : the standard deviation is zero
39: In cor(a, b) : the standard deviation is zero
40: In cor(a, b) : the standard deviation is zero
41: In cor(a, b) : the standard deviation is zero
42: In cor(a, b) : the standard deviation is zero
43: In cor(a, b) : the standard deviation is zero
44: In cor(a, b) : the standard deviation is zero
45: In cor(a, b) : the standard deviation is zero
46: In cor(a, b) : the standard deviation is zero
47: In cor(a, b) : the standard deviation is zero
48: In cor(a, b) : the standard deviation is zero
49: In cor(a, b) : the standard deviation is zero
50: In cor(a, b) : the standard deviation is zero

jiabowang commented 1 week ago

GAPIT reports "There are 0 common individuals in genotype , phenotype and CV files." Please check the taxa names in your genotype and phenotype files.
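
One way to run this check by hand is sketched below in base R. It assumes that the genotype object is a HapMap table whose first row holds the column titles (11 annotation columns followed by the taxa) and that the first phenotype column holds the taxa names. The objects `myG` and `myY` and their contents are made-up stand-ins, not data from this thread.

```r
# Toy stand-ins for the real objects (hypothetical data).
# The first row of myG holds the HapMap column titles, then the taxa names.
hapmap_titles <- c("rs", "alleles", "chrom", "pos", "strand", "assembly",
                   "center", "protLSID", "assayLSID", "panelLSID", "QCcode")
myG <- data.frame(rbind(
  c(hapmap_titles, "line1", "line2", "line3"),
  c("snp1", "A/G", "1", "100", "+", NA, NA, NA, NA, NA, NA, "AA", "AG", "GG")
), stringsAsFactors = FALSE)

# Phenotype table: taxa names in the first column.
myY <- data.frame(taxa = c("line1", "line2", "line4"),
                  trait = c(1.2, 3.4, 5.6),
                  stringsAsFactors = FALSE)

# Taxa as they appear in the first row of the genotype table.
geno_taxa <- as.character(unlist(myG[1, -(1:11)]))
pheno_taxa <- as.character(myY[, 1])

# Individuals present in both tables, and those found in only one of them.
common <- intersect(geno_taxa, pheno_taxa)
only_in_pheno <- setdiff(pheno_taxa, geno_taxa)
only_in_geno <- setdiff(geno_taxa, pheno_taxa)

length(common)
only_in_pheno
only_in_geno
```

If `length(common)` is 0 on the real objects, the first row of the genotype table probably does not contain the taxa names that the phenotype file uses.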

Lamoumni18 commented 1 week ago

Well, first, I used phenotype and genotype data files that had already been used successfully with other packages. Moreover, to make sure I avoided this kind of problem, I ran the following simple assignment:

colnames(mygeno)[12:268] <- mypheno$taxa

And then I went even further...

to_check <- data.frame(pheno = mypheno$taxa, geno = colnames(mygeno)[12:268])
to_check$Check <- to_check$pheno == to_check$geno
sum(to_check$Check == FALSE)
[1] 0

So I don't know exactly why it considers that no common individuals are found.

jiabowang commented 1 week ago

Hi, the HapMap data read into GAPIT should be read without a header, so that the first row of the data holds the title of each column.
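
The difference between the two ways of reading the file can be illustrated with a small base R sketch. The file written here is a made-up three-line HapMap file, and the sketch assumes, as the reply above suggests, that the taxa names are expected in the first row of the genotype table rather than in its column names.

```r
# Write a tiny, made-up HapMap file (11 annotation columns + 3 taxa).
hmp_lines <- c(
  paste(c("rs", "alleles", "chrom", "pos", "strand", "assembly", "center",
          "protLSID", "assayLSID", "panelLSID", "QCcode",
          "line1", "line2", "line3"), collapse = "\t"),
  paste(c("snp1", "A/G", "1", "100", "+", "NA", "NA", "NA", "NA", "NA", "NA",
          "AA", "AG", "GG"), collapse = "\t"),
  paste(c("snp2", "C/T", "1", "200", "+", "NA", "NA", "NA", "NA", "NA", "NA",
          "CC", "CT", "TT"), collapse = "\t")
)
hmp_file <- tempfile(fileext = ".hmp.txt")
writeLines(hmp_lines, hmp_file)

# header = TRUE: the titles become column names, so row 1 is the first SNP.
with_header <- read.table(hmp_file, header = TRUE, sep = "\t",
                          comment.char = "", stringsAsFactors = FALSE)

# header = FALSE: the titles stay in the data as row 1.
no_header <- read.table(hmp_file, header = FALSE, sep = "\t",
                        comment.char = "", stringsAsFactors = FALSE)

# What a "first row" lookup of the taxa would see in each case.
as.character(unlist(with_header[1, 12:14]))  # genotype calls of snp1
as.character(unlist(no_header[1, 12:14]))    # the taxa names
```

Under that assumption, renaming the columns of a table read with `header = TRUE` would not help, because the first row would still hold genotype calls instead of taxa names. That would be consistent with the "0 common individuals" message in the log.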