jiabowang / GAPIT

Genome Association Predict Integrate Tools

Gapit error at "Joining tvalue and stderr" #92

Open linsson opened 9 months ago

linsson commented 9 months ago

Hi, I am trying to run GAPIT on a set of about 85k structural variants (SVs) for about 350 accessions. I load GAPIT with:

suppressMessages(source(pipe(paste0("wget -O - --no-check-certificate http://www.zzlab.net/GAPIT/gapit_functions.txt"))))

and I run GAPIT with:

GAPIT(
  Y = myY[, c(1, 2)],
  GD = myGD,
  GM = myGM,
  G = NULL,
  CV = NULL,
  Model.selection = TRUE,
  model = "Blink",
  Multiple_analysis = FALSE,
  Geno.View.output = FALSE
)

GAPIT returns:

[1] "--------------------- Welcome to GAPIT ----------------------------"
[1] "Blink"
[1] "--------------------Processing traits----------------------------------"
[1] "Phenotype provided!"
[1] "The 1 model in all."
[1] "BLINK"
[1] "GAPIT.DP in process..."
[1] "GAPIT will filter marker with MAF setting !!"
[1] "The markers will be filtered by SNP.MAF: 0"
maf_index TRUE 73696
[1] "Calculating kinship..."
[1] "Number of individuals and SNPs are 353 and 85468"
[1] "Kinship created!"
NULL
[1] "GAPIT.DP accomplished successfully for multiple traits. Results are saved"
[1] "Processing trait: Al"
[1] "GAPIT.Phenotype.View in press..."
[1] "GAPIT.Phenotype.View output pdf has been generated successfully!"
[1] "--------------------Phenotype and Genotype ----------------------------------"
[1] "BLINK"
[1] TRUE
[1] "There are 1 traits in phenotype data."
[1] "There are 350 individuals in phenotype data."
[1] "There are 85468 markers in genotype data."
[1] "Phenotype and Genotype are test OK !!"
[1] "--------------------GAPIT Logical Done----------------------------------"
[1] "GAPIT.IC in process..."
[1] "There is 0 Covarinces."
[1] "There are 345 common individuals in genotype , phenotype and CV files."
[1] "The dimension of total CV is "
[1] 345 2
[1] "GAPIT.IC accomplished successfully for multiple traits. Results are saved"
[1] "GAPIT.SS in process..."
[1] "GAPIT will be into GWAS approach..."
[1] "BLINK"
[1] "The GAPIT would go into Bus..."
[1] "----------------------Welcome to Blink----------------------"
[1] "----------------------Iteration: 1 ----------------------"
[1] "seqQTN:"
NULL
[1] "----------------------Iteration: 2 ----------------------"
[1] "Top snps have little effect, set seqQTN to NULL!"
[1] "seqQTN is:,stop here"
[1] "LD.time(sec):"
[1] 0 0
[1] "BIC.time(sec):"
[1] 0 0
[1] "GLM.time(sec):"
[1] 7.515 0.000
[1] "-------------Blink finished successfully in 10.3 seconds!-----------------"
[1] "Calculating Orignal GWAS result..."
[1] "GAPIT.RandomModel beginning..."
[1] "There is no significant marker for VE !!"
[1] "BLINK R is done !!!!!"
NULL
[1] "GAPIT.ID in process..."
[1] "Filtering SNPs with MAF..."
[1] "Calculating FDR..."
[1] "QQ plot..."
[1] "Manhattan plot (Genomewise)..."
[1] "GAPIT.Manhattan accomplished successfully!zw"
[1] "Manhattan plot (Chromosomewise)..."
[1] "GD does not mach GM in Manhattan !!!"
[1] "GD does not match GM in Manhattan !!!"
[1] "select 0 candidate significont markers in 1 chromosome "
[1] "select 0 candidate significont markers in 2 chromosome "
[1] "select 0 candidate significont markers in 3 chromosome "
[1] "select 0 candidate significont markers in 4 chromosome "
[1] "select 0 candidate significont markers in 5 chromosome "
[1] "select 0 candidate significont markers in 6 chromosome "
[1] "select 0 candidate significont markers in 7 chromosome "
[1] "select 0 candidate significont markers in 8 chromosome "
[1] "select 0 candidate significont markers in 9 chromosome "
[1] "select 0 candidate significont markers in 10 chromosome "
[1] "select 0 candidate significont markers in 11 chromosome "
[1] "select 0 candidate significont markers in 12 chromosome "
[1] "manhattan plot on chromosome finished"
[1] "GAPIT.Manhattan accomplished successfully!zw"
[1] "Association table..."
[1] "Joining tvalue and stderr"
Error in data.frame(..., check.names = FALSE) :
  arguments imply differing number of rows: 138653680, 85468
Calls: eval ... eval -> GAPIT -> GAPIT.ID -> cbind -> cbind -> data.frame
Execution halted

The following files are created:

GAPIT.Phenotype.View.Al.pdf
GAPIT.Association.QQ.BLINK.Al.pdf
GAPIT.Association.Manhattan_Geno.BLINK.Al.pdf
GAPIT.Association.Manhattan_Chro.BLINK.Al.pdf
GAPIT.Association.GWAS_Results.BLINK.Al.csv

I suspect there is an issue with the reporting step, but I am unable to pinpoint the exact cause or how to solve it. I am not sure whether this is useful information, but there are some repeated positions in the GM data frame. I gave those markers different names using an incremental index, but their Chromosome and Position values are the same. I mention this because I am also worried about the "GD does not mach GM in Manhattan !!!" message.
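A minimal sketch of how GD and GM can be cross-checked before calling GAPIT. It assumes the numeric layout in which the first column of GD holds the taxa names and the remaining columns are markers, and in which GM has one row per marker with name, chromosome and position in that order. The toy data frames below are hypothetical stand-ins, not the data from this run.

```r
# Hypothetical toy inputs (not the real data).
# GD: first column = taxa name, remaining columns = numeric markers.
# GM: one row per marker: name, chromosome, position.
myGD <- data.frame(
  taxa = c("acc1", "acc2", "acc3"),
  sv1  = c(0, 2, NA),
  sv2  = c(2, 2, 0),
  sv3  = c(0, NA, 2)
)
myGM <- data.frame(
  SNP        = c("sv1", "sv2", "sv3"),
  Chromosome = c(1, 1, 2),
  Position   = c(100, 100, 250)
)

# 1. Each marker column in GD should have exactly one row in GM, in the same order.
marker_names <- colnames(myGD)[-1]
same_count <- length(marker_names) == nrow(myGM)
same_order <- same_count && all(marker_names == myGM[[1]])

# 2. Flag markers that share chromosome and position with another marker.
pos_key <- paste(myGM[[2]], myGM[[3]], sep = "_")
dup_pos <- duplicated(pos_key) | duplicated(pos_key, fromLast = TRUE)

# 3. Count missing genotype calls per marker.
na_per_marker <- colSums(is.na(myGD[, -1, drop = FALSE]))

print(c(same_count = same_count, same_order = same_order))
print(myGM[dup_pos, ])
print(na_per_marker)
```

If the counts or the order disagree, or if many markers carry NA values, that is worth fixing before looking at the association table step.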

Thanks a lot

PS. I notice the Google group is experiencing some issues. Do you have any information about whether it will be reopened?

linsson commented 9 months ago

Hi. This is an update.

I tried removing the variants that share a position (about 10k), but the result was the same. Then, after some googling and a look at other similar issues, I replaced the NA values in my GD data frame with each marker's mean before running GAPIT. This time the run finished without errors:
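The imputation step can be sketched roughly as follows. This is an illustration under assumptions, not the exact code used for this run: `impute_marker_mean` is a hypothetical helper, the first column of GD is assumed to hold the taxa names, and all other columns are assumed to be numeric markers.

```r
# Hypothetical helper: replace each marker's NA values with the mean of that
# marker's non-missing values. Assumes column 1 is the taxa name.
impute_marker_mean <- function(gd) {
  geno <- as.matrix(gd[, -1, drop = FALSE])        # numeric marker matrix
  marker_means <- colMeans(geno, na.rm = TRUE)     # NaN if a marker is all NA
  missing <- which(is.na(geno), arr.ind = TRUE)    # (row, col) of each NA
  geno[missing] <- marker_means[missing[, "col"]]  # fill with the column mean
  data.frame(gd[, 1, drop = FALSE], geno, check.names = FALSE)
}

# Toy data, not the real GD.
myGD <- data.frame(
  taxa = c("acc1", "acc2", "acc3", "acc4"),
  sv1  = c(0, 2, NA, 2),
  sv2  = c(2, NA, 0, 0)
)

imputed <- impute_marker_mean(myGD)
print(imputed)
```

Note that the imputed entries are generally not 0, 1 or 2, and a marker with no observed calls at all would end up as NaN and should be dropped beforehand.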

[1] "--------------------- Welcome to GAPIT ----------------------------"
[1] "Blink"
[1] "--------------------Processing traits----------------------------------"
[1] "Phenotype provided!"
[1] "The 1 model in all."
[1] "BLINK"
[1] "GAPIT.DP in process..."
[1] "The data set include un-0,1,2 values !!!"
[1] "GAPIT will not perform MAF filtering !!!"
[1] "Calculating kinship..."
[1] "Number of individuals and SNPs are 353 and 85468"
[1] "Kinship created!"
NULL
[1] "GAPIT.DP accomplished successfully for multiple traits. Results are saved"
[1] "Processing trait: Al"
[1] "GAPIT.Phenotype.View in press..."
[1] "GAPIT.Phenotype.View output pdf has been generated successfully!"
[1] "--------------------Phenotype and Genotype ----------------------------------"
[1] "BLINK"
[1] TRUE
[1] "There are 1 traits in phenotype data."
[1] "There are 350 individuals in phenotype data."
[1] "There are 85468 markers in genotype data."
[1] "Phenotype and Genotype are test OK !!"
[1] "--------------------GAPIT Logical Done----------------------------------"
[1] "GAPIT.IC in process..."
[1] "There is 0 Covarinces."
[1] "There are 345 common individuals in genotype , phenotype and CV files."
[1] "The dimension of total CV is "
[1] 345 2
[1] "GAPIT.IC accomplished successfully for multiple traits. Results are saved"
[1] "GAPIT.SS in process..."
[1] "GAPIT will be into GWAS approach..."
[1] "BLINK"
[1] "The GAPIT would go into Bus..."
[1] "----------------------Welcome to Blink----------------------"
[1] "----------------------Iteration: 1 ----------------------"
[1] "seqQTN:"
NULL
[1] "----------------------Iteration: 2 ----------------------"
[1] "Top snps have little effect, set seqQTN to NULL!"
[1] "seqQTN is:,stop here"
[1] "LD.time(sec):"
[1] 0 0
[1] "BIC.time(sec):"
[1] 0 0
[1] "GLM.time(sec):"
[1] 9.296 0.000
[1] "-------------Blink finished successfully in 12.24 seconds!-----------------"
[1] "Calculating Orignal GWAS result..."
[1] "GAPIT.RandomModel beginning..."
[1] "There is no significant marker for VE !!"
[1] "BLINK R is done !!!!!"
NULL
[1] "GAPIT.ID in process..."
[1] "Filtering SNPs with MAF..."
[1] "Calculating FDR..."
[1] "QQ plot..."
[1] "Manhattan plot (Genomewise)..."
[1] "GAPIT.Manhattan accomplished successfully!zw"
[1] "Manhattan plot (Chromosomewise)..."
[1] "select 0 candidate significont markers in 1 chromosome "
[1] "select 0 candidate significont markers in 2 chromosome "
[1] "select 0 candidate significont markers in 3 chromosome "
[1] "select 0 candidate significont markers in 4 chromosome "
[1] "select 0 candidate significont markers in 5 chromosome "
[1] "select 0 candidate significont markers in 6 chromosome "
[1] "select 0 candidate significont markers in 7 chromosome "
[1] "select 0 candidate significont markers in 8 chromosome "
[1] "select 0 candidate significont markers in 9 chromosome "
[1] "select 0 candidate significont markers in 10 chromosome "
[1] "select 0 candidate significont markers in 11 chromosome "
[1] "select 0 candidate significont markers in 12 chromosome "
[1] "manhattan plot on chromosome finished"
[1] "GAPIT.Manhattan accomplished successfully!zw"
[1] "Association table..."
[1] "Joining tvalue and stderr"
[1] "GAPIT Phenotype distribution with significant markers in process..."
[1] FALSE
[1] 0 8
[1] "GAPIT.ID accomplished successfully for multiple traits. Results are saved"
[1] "GAPIT accomplished successfully for multiple traits. Result are saved"
[1] "GAPIT has done all analysis!!!"
[1] "Please find your all results in :"
[1] "/data00/g2psol/tomato/2023-06-08_gwas_refining_algorithm/g2psol-sv/LD-1/MAF-0.01/PCA-auto_TRAIT-no-outl_mean_norm_TLOC-none_COV-none-none_CLOC-none/Al"

I am still a little bit worried about some warnings:

[1] "The data set include un-0,1,2 values !!!"
[1] "GAPIT will not perform MAF filtering !!!"

My GD is already filtered by MAF (>= 0.01), so the skipped MAF filtering may not be a problem. What do you think? I would also like to know whether this mean imputation is theoretically sound. As a reminder, this GD contains only structural variants.
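One way to check both points by hand is sketched below, assuming the markers are coded as 0/1/2 allele dosages so that a marker's allele frequency is its mean divided by 2. The matrix `geno` is hypothetical toy data, and whether GAPIT's message is triggered by exactly this condition is my assumption. Since filling NA values with the marker mean leaves that mean unchanged, a MAF computed this way is the same before and after imputation.

```r
# Toy dosage matrix after mean imputation (hypothetical values).
geno <- cbind(
  sv1 = c(0, 2, 4/3, 2),
  sv2 = c(2, 2/3, 0, 0),
  sv3 = c(0, 0, 0, 0)
)

# Count the entries per marker that are not exactly 0, 1 or 2
# (assumed to be what the "un-0,1,2 values" message refers to).
non_012 <- apply(geno, 2, function(x) sum(!(x %in% c(0, 1, 2))))

# Allele frequency from dosage, then fold it to the minor allele frequency.
allele_freq <- colMeans(geno, na.rm = TRUE) / 2
maf <- pmin(allele_freq, 1 - allele_freq)

# Markers that would pass a MAF >= 0.01 filter applied by hand.
keep <- maf >= 0.01

print(non_012)
print(round(maf, 3))
print(keep)
```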

Thanks a lot