Open shameem356 opened 3 years ago
Hello,
You can simply use read.table to read your pc_scf.txt, and then use the model.matrix function to encode the covariates. You can refer to the following code:
cv <- model.matrix(~as.numeric(pc1)+as.numeric(pc2)+as.numeric(pc3)+as.numeric(pc4)+as.numeric(pc5)+as.factor(scf1)+as.factor(scf2)+as.factor(scf3), data=pc_scf)
MVP(..., CV.GLM=cv, CV.MLM=cv, CV.FarmCPU=cv, nPC.GLM=0, nPC.MLM=0, nPC.FarmCPU=0, ...)
When you have calculated the PCs yourself and passed them through the CV.<model> parameter, please set nPC.<model> to 0 to prevent MVP from automatically adding PCs. MVP.Data.PC is used for principal component analysis; its role is to obtain PCs from genotypes.
Hello @hyacz, thank you so much for your quick reply and code. I have updated my code as below based on your suggestion. I look forward to your feedback.
library(rMVP)
MVP.Data(fileBed="199sample_HF", filePhe=NULL, fileKin=TRUE, filePC=FALSE,
         out="mvp.199sample_HF")
genotype <- attach.big.matrix("mvp.199sample_HF.geno.desc")
phenotype <- read.table("179s_pheno.csv", head=TRUE)
map <- read.table("mvp.199sample_HF.geno.map", head=TRUE)
Kinship <- attach.big.matrix("mvp.199sample_HF.kin.desc")
pc_scf <- read.table("179s_PC5_scf_for_mvp.csv", head=TRUE)
cv <- model.matrix(~as.numeric(PC1)+as.numeric(PC2)+as.numeric(PC3)+as.numeric(PC4)+as.numeric(PC5)+as.factor(SCF_Red)+as.factor(SCF_Green)+as.factor(SCF_Blue), data=pc_scf)
for(i in 2:ncol(phenotype)){
  imMVP <- MVP(
    phe=phenotype[, c(1, i)],
    geno=genotype,
    map=map,
    K=Kinship,
    CV.FarmCPU=cv,
    nPC.FarmCPU=0,
    priority="speed",
    ncpus=16,
    vc.method="BRENT",
    maxLoop=10,
    method.bin="FaST-LMM",
    #permutation.rep=100,
    threshold=0.05,
    method=c("FarmCPU")
  )
  gc()
}
@hyacz,
When I run the above code, the log file shows 'Number of provided covariates of FarmCPU: 540'. 179s_PC5_scf_for_mvp.csv has 179 samples (5 PCs + 3 scaling factor values, 179 * 8 = 1432 values). I would like to know why 'Number of provided covariates of FarmCPU' shows 540.
The number of covariates mentioned in the log depends on the number of columns of the variable cv. There are 3 factors (SCF_Red, SCF_Green, SCF_Blue). Since they have multiple levels, after processing by the model.matrix function, the number of columns in cv will be 540.
I'm not sure if I understand your data correctly. If SCF is a categorical variable, this is ok. If SCF is a quantitative variable, then as.numeric(SCF) should be used instead of as.factor(SCF) in model.matrix.
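The column count can be checked on a toy data frame. The sketch below uses made-up values (not the data from this thread) to show how a factor expands into one dummy column per level beyond the first, while a numeric covariate stays a single column. By the same arithmetic, 540 would be consistent with 1 intercept + 5 PCs + 3 × 178 dummy columns, i.e. each SCF column having 179 distinct values, though that is only an inference from the numbers reported above.

```r
# Toy data (hypothetical values): 6 individuals, one PC and one scaling
# factor whose values are all distinct.
toy <- data.frame(
  PC1     = c(0.12, -0.40, 0.33, 0.05, -0.21, 0.18),
  SCF_Red = c(1.01, 0.97, 1.10, 0.88, 1.05, 0.93)
)

# SCF_Red as a factor: intercept + PC1 + (6 levels - 1) dummy columns.
cv_factor <- model.matrix(~ as.numeric(PC1) + as.factor(SCF_Red), data = toy)

# SCF_Red as a quantitative covariate: intercept + PC1 + SCF_Red.
cv_numeric <- model.matrix(~ as.numeric(PC1) + as.numeric(SCF_Red), data = toy)

ncol(cv_factor)   # 7
ncol(cv_numeric)  # 3
```

Checking ncol(cv) before calling MVP is a quick way to confirm the covariates were encoded as intended.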
In addition, it should be noted that the order of individuals in cv needs to be consistent with the phenotype and genotype.
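One way to enforce the ordering is to reorder the covariate table by sample ID before calling model.matrix. This is a minimal sketch on made-up data; it assumes both tables carry an ID column, here hypothetically named Taxa, which the thread does not confirm for the actual files.

```r
# Hypothetical tables: 'Taxa' is an assumed ID column name.
phenotype <- data.frame(Taxa  = c("s1", "s2", "s3", "s4"),
                        trait = c(1.2, 0.8, 1.5, 1.1))
pc_scf    <- data.frame(Taxa = c("s3", "s1", "s4", "s2"),
                        PC1  = c(0.3, 0.1, -0.2, -0.4))

# Row positions in pc_scf that correspond to each phenotype row, in order.
idx <- match(phenotype$Taxa, pc_scf$Taxa)
stopifnot(!anyNA(idx))  # every phenotyped individual must have covariates

# Reorder the covariates, then build the design matrix from the aligned table.
pc_scf <- pc_scf[idx, ]
cv <- model.matrix(~ as.numeric(PC1), data = pc_scf)

# Sanity check: IDs now line up row by row.
stopifnot(identical(as.character(pc_scf$Taxa), as.character(phenotype$Taxa)))
```

The same match-based check can be applied against the genotype sample order, wherever those IDs are stored.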
Hello Team rMVP,
First of all, thank you so much for your wonderful software. I would like to clarify a question regarding multiple covariates. I have five PCs in a PC.txt file (pc1, pc2, pc3, pc4, pc5) and three scaling factor values in a scf.txt file (scf1, scf2, scf3). Can I combine these two files into a single file (pc_scf.txt) and load it as the covariates file using the command below? If not, how can I use the PC.txt and scf.txt files as covariates?
Note: the pc_scf.txt file has 8 columns (pc1, pc2, pc3, pc4, pc5, scf1, scf2, scf3).
MVP.Data.PC("pc_scf.txt", out="mvp.pc_scf", sep='\t')
Covariates_PC <- bigmemory::as.matrix(attach.big.matrix("mvp.pc_scf"))