Closed jmal0403 closed 6 years ago
Hi John,
I've simulated a data set of the same size as yours and ran Matrix eQTL on it.
It ran without a problem.
library("MatrixEQTL");
nsamp = 500;
ngene = 30000;
nsnps = 600000;
### Load gene expression info
genemat = matrix(runif(ngene*nsamp), ncol = nsamp)
gene = SlicedData$new(genemat);
gene$ResliceCombined(sliceSize = 5000)
### Load covariates
cvrtmat = matrix(runif(20*nsamp), ncol = nsamp)
cvrt = SlicedData$new(cvrtmat);
### Load genotype info
snpsmat = matrix(runif(nsnps*nsamp), ncol = nsamp)
snps = SlicedData$new(snpsmat);
snps$ResliceCombined(sliceSize = 5000)
### Run Matrix eQTL
me = Matrix_eQTL_main(
snps = snps,
gene = gene,
cvrt = cvrt,
output_file_name = NULL,
pvOutputThreshold = 1e-6,
useModel = modelLINEAR,
errorCovariance = numeric(),
verbose = TRUE,
pvalue.hist = FALSE);
The output:
# Processing covariates
# Task finished in 0.02 seconds
# Processing gene expression data (imputation, residualization)
# Task finished in 0.76 seconds
# Creating output file(s)
# Task finished in 0 seconds
# Performing eQTL analysis
# 0.13% done, 25 eQTLs
# ...
# 100.00% done, 18,817 eQTLs
# Task finished in 492.86 seconds
John,
Can you show me the code and full output with the error message? How many significant eQTLs do you get in the process?
Andrey
Andrey,
First, thanks for getting back to me and I apologize for the delay. This is a busy time of year for me. Great new! It's working now. Let me show you what I was doing and what changed:
snps = SlicedData$new(SNPs); snps$fileSliceSize = 1000; snps$fileOmitCharacters = "NA";
genes = SlicedData$new(datExpr.mat); genes$fileOmitCharacters = "NA"; genes$fileSliceSize = 1000;
So, I just changed the following and reran: genes$ResliceCombined(sliceSize = 5000) snps$ResliceCombined(sliceSize = 5000)
So, was I using the wrong fileSlice() attribute?
Thanks so much! John
Writing the line
genes$fileSliceSize = 1000;
does not 'reslice' the matrix. You had to call $ResliceCombined() explicitly.
I see. Thanks so much for your help. Be well, John
Hello,
I've had a lot of success running MatrixEQTL on fairly large datasets in the past. Currently, I'm attempting to run "Matrix_eQTL_engine" for ~600k SNPs and ~30k genes with 500 subjects (CentoOS with 256GB RAM). I did read your thread with @AndrewSkelton, and my first reaction was to lower the p-value. I tried lowering to 1e-4,1e-6, and 1e-9; however, I still get the following error: long vectors not supported yet: ../../src/include/Rinlinedfuns.h:138. I also reduced my dataset down to 100k SNPs and 20k genes and that ran with no issues. So, I'm really not sure how to proceed.
Please let me know if I can provide any additional information. Any help is greatly appreciated.
Best, John
Here is the sessionInfo:
Platform: x86_64-pc-linux-gnu (64-bit) Running under: CentOS Linux 7 (Core)
Matrix products: default BLAS: /usr/local/lib64/R/lib/libRblas.so LAPACK: /usr/local/lib64/R/lib/libRlapack.so
locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets [8] methods base
other attached packages: [1] vegan_2.4-6 lattice_0.20-35
[3] permute_0.9-4 snpStats_1.26.0
[5] Matrix_1.2-9 survival_2.41-3
[7] plyr_1.8.4 VariantAnnotation_1.22.0
[9] Rsamtools_1.28.0 Biostrings_2.44.0
[11] XVector_0.16.0 SummarizedExperiment_1.6.1 [13] DelayedArray_0.2.0 matrixStats_0.53.1
[15] Biobase_2.36.1 GenomicRanges_1.28.1
[17] GenomeInfoDb_1.12.0 IRanges_2.10.0
[19] S4Vectors_0.14.0 BiocGenerics_0.22.0
[21] MatrixEQTL_2.2
loaded via a namespace (and not attached): [1] Rcpp_0.12.17 compiler_3.4.0 GenomicFeatures_1.28.0
[4] prettyunits_1.0.2 bitops_1.0-6 tools_3.4.0
[7] zlibbioc_1.22.0 progress_1.1.2 biomaRt_2.35.11
[10] digest_0.6.15 nlme_3.1-131 BSgenome_1.44.0
[13] RSQLite_1.1-2 memoise_1.1.0 mgcv_1.8-17
[16] DBI_0.6-1 GenomeInfoDbData_0.99.0 cluster_2.0.6
[19] rtracklayer_1.36.6 stringr_1.3.1 httr_1.3.1
[22] grid_3.4.0 R6_2.2.2 AnnotationDbi_1.38.2
[25] XML_3.98-1.10 BiocParallel_1.10.1 magrittr_1.5
[28] MASS_7.3-47 splines_3.4.0 GenomicAlignments_1.12.0 [31] assertthat_0.2.0 stringi_1.2.3 RCurl_1.95-4.10