andreyshabalin / MatrixEQTL

Matrix eQTL: Ultra fast eQTL analysis via large matrix operations
53 stars 16 forks source link

`noFDRsaveMemory = TRUE` no longer works with `output_file_name.cis` #8

Closed sritchie73 closed 5 years ago

sritchie73 commented 5 years ago

Minimal reproducible example:

First, load the package example data into the appropriate data structures:

library(MatrixEQTL)

# Toy data locations
base.dir = find.package('MatrixEQTL');
SNP_file_name = paste(base.dir, "/data/SNP.txt", sep="");
snps_location_file_name = paste(base.dir, "/data/snpsloc.txt", sep="");
expression_file_name = paste(base.dir, "/data/GE.txt", sep="");
gene_location_file_name = paste(base.dir, "/data/geneloc.txt", sep="");
covariates_file_name = paste(base.dir, "/data/Covariates.txt", sep="");

# Load toy data as per MatrixEQTL vignette
snps = SlicedData$new();
snps$fileDelimiter = "\t";      # the TAB character
snps$fileOmitCharacters = "NA"; # denote missing values;
snps$fileSkipRows = 1;          # one row of column labels
snps$fileSkipColumns = 1;       # one column of row labels
snps$fileSliceSize = 2000;      # read file in slices of 2,000 rows
snps$LoadFile(SNP_file_name);
snpspos = read.table(snps_location_file_name, header = TRUE, stringsAsFactors = FALSE);

gene = SlicedData$new();
gene$fileDelimiter = "\t";      # the TAB character
gene$fileOmitCharacters = "NA"; # denote missing values;
gene$fileSkipRows = 1;          # one row of column labels
gene$fileSkipColumns = 1;       # one column of row labels
gene$fileSliceSize = 2000;      # read file in slices of 2,000 rows
gene$LoadFile(expression_file_name);
genepos = read.table(gene_location_file_name, header = TRUE, stringsAsFactors = FALSE);

cvrt = SlicedData$new();
cvrt$fileDelimiter = "\t";      # the TAB character
cvrt$fileOmitCharacters = "NA"; # denote missing values;
cvrt$fileSkipRows = 1;          # one row of column labels
cvrt$fileSkipColumns = 1;       # one column of row labels
if(length(covariates_file_name)>0) {
  cvrt$LoadFile(covariates_file_name);
}

Matrix_eQTL_main parameters that cause the error:

eQTLs <- Matrix_eQTL_main(
    snps = snps,
    gene = gene,
    cvrt = cvrt,
    pvOutputThreshold = 0, # ignore Trans eQTLs 
    pvOutputThreshold.cis = 1, # but keep all cis summary stats
    snpspos = snpspos,
    genepos = genepos,
    output_file_name=NULL,
    output_file_name.cis="test.txt",
    noFDRsaveMemory=TRUE
)
Matching data files and location files
10of10 genes matched
15of15 SNPs matched

Task finished in 0.00499999999999545 seconds
Processing covariates
Task finished in 0.00200000000006639 seconds
Processing gene expression data (imputation, residualization)
Task finished in 0.000999999999976353 seconds
Creating output file(s)
Task finished in 0.0040000000000191 seconds
Performing eQTL analysis
100.00% done, 100 cis-eQTLs
Error in saver.cis$getResults(gene, snps, n.tests.cis) : 
  unused arguments (gene, snps, n.tests.cis)

The final error message is not returned if we set noFDRsaveMemory = FALSE.

andreyshabalin commented 5 years ago

The error does not occur with the currect GitHub version of Matrix eQTL, only CRAN version.

Note that the CRAN version outputs

10of10 genes matched
15of15 SNPs matched

While the GitHub

10 of 10 genes matched
15 of 15 SNPs matched