corbinq / apex

Toolkit for QTL mapping and meta-analysis.
https://corbinq.github.io/apex/

--low-mem does not work with cis/trans LMMs #7

Open hsun3163 opened 3 years ago

hsun3163 commented 3 years ago

Dear Apex author,

I have encountered an issue when using LMMs with apex cis/trans.

For the cis analysis, I ran the following command:

apex cis --vcf /mnt/mfs/statgen/neuro-apex/ROSMAP-vcf/ROSMAP_chr19.vcf.gz \
--bed /mnt/mfs/statgen/neuro-apex/pipeline_testing/cache/test.chr19.mol_phe.bed.gz \
--cov /mnt/mfs/statgen/neuro-apex/pipeline_testing/cache/test.cov.gz \
--out /mnt/mfs/statgen/neuro-apex/pipeline_testing/trans/test.19 \
--grm /mnt/mfs/statgen/neuro-apex/pipeline_testing/data/GRM_default.txt --low-mem

The following error occurred:

Using 48 threads.
19 present in both bcf and bed file.
976562 total variants on selected chromosomes.

Found 1196 samples in bcf file ... 
Found 664 samples in covariate file ... 
Found 665 samples in expression bed file ... 
Found 664 samples in common across all three files.

Processed data for 5 covariates across 664 samples.
Processed expression for 3 genes across 664 samples.
Processed variant data for 976562 variants.

Found 661 related individuals ... 
Reordering related GRM blocks ... Done.
GRM eigendecomposition ... Done.
Selected 664 eigenvectors.
Scaling expression traits ... 
Calculating partial rotations ...
apex: /mnt/c/Users/Admin/Desktop/eQTL_TOOL/CODE/eigen/Eigen/src/Core/Product.h:98: Eigen::Product<Lhs, Rhs, Option>::Product(const Lhs&, const Rhs&) [with _Lhs = Eigen::Transpose<Eigen::SparseMatrix<double> >; _Rhs = Eigen::SparseMatrix<double>; int Option = 2; Eigen::Product<Lhs, Rhs, Option>::Lhs = Eigen::Transpose<Eigen::SparseMatrix<double> >; Eigen::Product<Lhs, Rhs, Option>::Rhs = Eigen::SparseMatrix<double>]: Assertion `lhs.cols() == rhs.rows() && "invalid matrix product" && "if you wanted a coeff-wise or a dot product use the respective explicit functions"' failed.

Killed.

Similarly, for the trans analysis, I ran the following command:

apex trans --vcf /mnt/mfs/statgen/neuro-apex/ROSMAP-vcf/ROSMAP_chr19.vcf.gz \
--bed /mnt/mfs/statgen/neuro-apex/pipeline_testing/cache/test.chr19.mol_phe.bed.gz \
--cov /mnt/mfs/statgen/neuro-apex/pipeline_testing/cache/test.cov.gz \
--out /mnt/mfs/statgen/neuro-apex/pipeline_testing/trans/test.19 \
--grm /mnt/mfs/statgen/neuro-apex/pipeline_testing/data/GRM_default.txt --low-mem

The same error occurred:

Using 48 threads.
976562 total variants on selected chromosomes.

Found 1196 samples in bcf file ... 
Found 664 samples in covariate file ... 
Found 665 samples in expression bed file ... 
Found 664 samples in common across all three files.

Processed data for 5 covariates across 664 samples.
Processed expression for 3 genes across 664 samples.
Processed variant data for 976562 variants.

Found 661 related individuals ... 
Reordering related GRM blocks ... Done.
GRM eigendecomposition ... Done.
Selected 664 eigenvectors.
Scaling expression traits ... 
Calculating partial rotations ...
apex: /mnt/c/Users/Admin/Desktop/eQTL_TOOL/CODE/eigen/Eigen/src/Core/Product.h:98: Eigen::Product<Lhs, Rhs, Option>::Product(const Lhs&, const Rhs&) [with _Lhs = Eigen::Transpose<Eigen::SparseMatrix<double> >; _Rhs = Eigen::SparseMatrix<double>; int Option = 2; Eigen::Product<Lhs, Rhs, Option>::Lhs = Eigen::Transpose<Eigen::SparseMatrix<double> >; Eigen::Product<Lhs, Rhs, Option>::Rhs = Eigen::SparseMatrix<double>]: Assertion `lhs.cols() == rhs.rows() && "invalid matrix product" && "if you wanted a coeff-wise or a dot product use the respective explicit functions"' failed.

Killed.

The GRM used is shown below:

#id1 | id2 | kinship
-- | -- | --
SM-CJEL7 | SM-CJEL7 | 1.0
SM-CJEL7 | SM-CJFL7 | 0.1
SM-CJEL7 | SM-CJEKR | 0.1
SM-CJEL7 | SM-CJFLS | 0.1
SM-CJEL7 | SM-CJEKO | 0.1
SM-CJEL7 | SM-CJEKP | 0.1
SM-CJEL7 | SM-CJEL6 | 0.1
SM-CJEL7 | SM-CJFLM | 0.1

The same GRM can be used successfully with apex lmm, which produces the following output:

Using 48 threads.
976562 total variants on selected chromosomes.

Found 1196 samples in bcf file ... 
Found 664 samples in covariate file ... 
Found 665 samples in expression bed file ... 
Found 664 samples in common across all three files.

Processed data for 5 covariates across 664 samples.
Processed expression for 3 genes across 664 samples.
Processed variant data for 976562 variants.

corbinq commented 3 years ago

Hi there,

Thanks for reaching out! It looks like we have an issue using --low-mem with LMM association analysis. The --low-mem option only affects how genotypes are loaded, so it should not be an issue for fitting null models, as you noticed.

I'll make a note to patch this. In the meantime, your best bet is to avoid using --low-mem for LMM association analysis.

If you find that memory usage is prohibitive (i.e., you have many variants and many samples), then you can manually specify smaller chunks using the --region and --bed-region flags.

Best, Corbin
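
The chunking workaround suggested above could be scripted roughly as follows. This is only a dry-run sketch: it prints one `apex cis` command per chunk rather than running anything. The `chrom:start-end` value passed to `--region` is an assumption (the thread does not state the expected format, and `--bed-region` is not used here), and the file names, chunk size, chromosome upper bound, and `chunk_N` output prefixes are hypothetical placeholders. Check the apex documentation before relying on any of it.

```bash
#!/usr/bin/env bash
# Dry-run sketch: print one `apex cis` command per chunk instead of running it.
# ASSUMPTION: --region accepts "chrom:start-end"; verify against the apex docs.

# Print "chrom:start-end" windows covering [start, end] in steps of `size`.
make_regions() {
  local chrom=$1 start=$2 end=$3 size=$4
  local s e
  for (( s = start; s <= end; s += size )); do
    e=$(( s + size - 1 ))
    (( e > end )) && e=$end   # clamp the last window to the upper bound
    printf '%s:%d-%d\n' "$chrom" "$s" "$e"
  done
}

# Hypothetical placeholder paths; substitute your own files.
VCF=genotypes.vcf.gz
BED=traits.bed.gz
COV=covariates.gz
GRM=grm.txt

# 60 Mb is an illustrative round upper bound for chromosome 19,
# and 10 Mb is an arbitrary chunk size.
i=0
make_regions 19 1 60000000 10000000 | while read -r region; do
  i=$(( i + 1 ))
  # Each chunk gets its own --out prefix so results are not overwritten.
  echo apex cis --vcf "$VCF" --bed "$BED" --cov "$COV" --grm "$GRM" \
    --region "$region" --out "chunk_${i}"
done
```

Once the printed commands look right, dropping the `echo` would run them in sequence. Smaller windows should lower peak memory at the cost of more runs.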