sigven / pcgr

Personal Cancer Genome Reporter (PCGR)
https://sigven.github.io/pcgr
MIT License

Feature: use single conda environment for PCGR #183

Open pdiakumis opened 2 years ago

pdiakumis commented 2 years ago

Opening this issue to track our efforts to handle the PCGR conda installation within a single conda environment, rather than the two separate Python- and R-based environments that make up our current setup in dev (as of PCGR v0.10.15). The problem occurs when trying to install R version 4 in the current dev pcgr conda environment on Linux:

# pcgr_full.yml file

name: pcgr

channels:
  - pcgr
  - conda-forge
  - bioconda
  - defaults

dependencies:
  - pcgr ==0.10.15
  - bedtools ==2.30.0
  - cyvcf2 ==0.30.11
  - ensembl-vep ==105.0
  - htslib ==1.10.2
  - pandoc ==2.16
  - perl-bio-bigfile ==1.07
  - vcfanno ==0.3.3
  - vt
  - vcf2maf ==1.6.21
  - r-base ==4

$ mamba env create --file pcgr_full.yml

Looking for: ['pcgr==0.10.15', 'bedtools==2.30.0', 'cyvcf2==0.30.11', 'ensembl-vep==105.0', 'htslib==1.10.2', 'pandoc==2.16', 'perl-bio-bigfile==1.07', 'vcfanno==0.3.3', 'vt', 'vcf2maf==1.6.21', 'r-base==4']

Encountered problems while solving:
  - package cyvcf2-0.30.11-py27h3ce6e29_0 requires python_abi 2.7.* *_cp27mu, but none of the providers can be installed
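To make the error above easier to read: the package name reported by the solver carries a build string (`py27h3ce6e29_0`) that appears to encode the Python version the build targets. The following is only a minimal sketch for decoding such names. The helper names `split_package_spec` and `python_tag_from_build` are hypothetical, and the parsing assumes the common conda-forge/bioconda convention of a `pyXY` prefix in the build string.

```python
import re


def split_package_spec(filename):
    """Split 'name-version-build' into its parts (the name itself may contain '-')."""
    name, version, build = filename.rsplit("-", 2)
    return name, version, build


def python_tag_from_build(build):
    """Return the Python version encoded in a build string, or None if absent.

    Assumption: the build string starts with 'py' followed by the major
    digit and the minor digits, e.g. 'py27h3ce6e29_0' or 'py310h...'.
    """
    match = re.match(r"py(\d)(\d+)", build)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


# Package name copied from the solver message in this issue.
name, version, build = split_package_spec("cyvcf2-0.30.11-py27h3ce6e29_0")
print(name, version, python_tag_from_build(build))
```

Read this way, the message says the solver ended up considering a Python 2.7 build of cyvcf2 0.30.11, whose `python_abi 2.7.*` requirement could not be satisfied alongside the other pins.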

Unpinning different combinations of cyvcf2, ensembl-vep, htslib, etc. leads to different dependency issues: pulling in VEP < 90, pulling in cyvcf2 from 3 years ago, issues with r-base/perl dependencies, and so on. Combine that with trying to handle the macOS installation and you get into a pretty tricky balancing act. Mind you, if I unpin r-base, it pulls in version 3.6.3 and the environment solves fine on Linux. On macOS, it successfully pulls in 4.0.3 (at least for now).
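For anyone wanting to repeat the unpinning experiments systematically, the combinations can be enumerated rather than edited by hand. This is a rough sketch, not how the tests above were actually run: the function names and the `max_unpinned` parameter are hypothetical, and only three of the pins from `pcgr_full.yml` are included for brevity. It does not call the solver; each generated list would still need to be written into an environment file and tried with `mamba env create`.

```python
from itertools import combinations

# Pins copied from pcgr_full.yml above (a subset, for illustration).
PINS = {
    "cyvcf2": "0.30.11",
    "ensembl-vep": "105.0",
    "htslib": "1.10.2",
}


def relaxed_specs(pins, unpinned):
    """Return dependency specs, dropping the version for names in `unpinned`."""
    return [
        name if name in unpinned else f"{name} =={version}"
        for name, version in pins.items()
    ]


def unpin_scenarios(pins, max_unpinned=2):
    """Yield (unpinned_names, specs) for every combination up to max_unpinned."""
    names = list(pins)
    for k in range(1, max_unpinned + 1):
        for combo in combinations(names, k):
            yield combo, relaxed_specs(pins, set(combo))


for combo, specs in unpin_scenarios(PINS):
    print("unpinned:", ", ".join(combo), "->", specs)
```

With three pins and `max_unpinned=2`, this yields six scenarios (three with one package unpinned, three with two).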

This balancing act was the main reason we decided to split the conda dependencies into two separate conda envs. If the dependency graph above ever becomes solvable, we'll be able to streamline the installation and simplify the code considerably.