UCSF-Costello-Lab / LG3_Pipeline

The original LG3 pipeline
https://github.com/UCSF-Costello-Lab/LG3_Pipeline

SOFTWARE: Consolidate the software stack to a single shared location #17

Closed HenrikBengtsson closed 2 years ago

HenrikBengtsson commented 6 years ago

Migrating from https://github.com/UCSF-Costello-Lab/LG3_Pipeline/issues/12#issuecomment-419574111:

@ivan108 wrote: I think moving all software to public domain is a good idea. Do we need to load the corresponding module or use the path?

@HenrikBengtsson replied:

This is what I think needs to be done/path forward/long-term solution:

  1. Identify all software tools currently used, cf. Issue #10
  2. Freeze those in a single central location, e.g. /home/shared/cbc/software_frozen/20180907-LG3_Pipeline
  3. Include run-time sanity tests ("contracts") that assert that we are really running these frozen versions.
  4. For legacy versions of programming languages (e.g. Python, Java, R), migrate to latest stable versions.
  5. For legacy versions of sequencing tools, try to migrate to newer versions. For some tools we know there will be major problems, e.g. GATK.

Working through these steps requires solid testing to make sure things don't break and results are reproducible. So, testing testing testing...
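A run-time "contract" of the kind described in step 3 could be sketched roughly as below. This is only an illustration, not code from the pipeline: the helper name assert_frozen is hypothetical, and it assumes that "frozen" simply means the tool resolves to a file under the frozen directory (symbolic links are not followed).

```shell
# Hypothetical helper (not part of LG3_Pipeline): succeed only if the named
# tool resolves, via PATH, to a file located under the frozen directory.
# The body runs in a subshell so its variables do not leak into the caller.
assert_frozen() (
  tool=$1
  frozen_root=$2
  resolved=$(command -v "$tool") || {
    echo "ERROR: '$tool' not found on PATH" >&2
    return 1
  }
  case "$resolved" in
    "$frozen_root"/*)
      return 0
      ;;
    *)
      echo "ERROR: '$tool' resolves to '$resolved', expected a path under '$frozen_root'" >&2
      return 1
      ;;
  esac
)

# Example use at the top of a pipeline script, with the location from step 2:
#   assert_frozen Rscript /home/shared/cbc/software_frozen/20180907-LG3_Pipeline || exit 1
```

A check like this fails early and loudly when a job picks up a different installation, which is easier to debug than a silent version mismatch.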

More comments: Yes, we can also introduce software modules for controlling the above, e.g. module load lg3-pipeline, but let's not worry about that for now.

HenrikBengtsson commented 5 years ago

@ivan108, I had to re-add module load CBC r/3.4.2 in Recal_bigmem.pbs. Without it, Rscript was not found when running from a fresh (cbctest2) account. I also thought it was on the path, but it's not. This needs more investigation. Let's keep it there for now.

ivan108 commented 5 years ago

Interesting... That was why I added it. Lately it was working without it from the jocostello account, probably because I have a path in .bashrc: export PATH=$PATH:/home/shared/cbc/software_cbc/R/R-3.4.4-20180315:...

HenrikBengtsson commented 5 years ago

Ok. I haven't tested with cbctest2 since last release, so that could explain why I didn't notice until now.
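One way to catch this kind of account-specific PATH dependency without switching to a fresh account might be to look the tool up in a stripped-down environment. The sketch below is an assumption-laden illustration: which_in_clean_env is a hypothetical helper, and the minimal PATH of /usr/bin:/bin stands in for whatever the system default really is.

```shell
# Hypothetical helper: report where a tool resolves once the caller's
# environment (including PATH additions made in ~/.bashrc) is stripped away.
# The minimal PATH below is an assumption; adjust it to the system default.
which_in_clean_env() {
  env -i PATH=/usr/bin:/bin /bin/sh -c 'command -v "$1"' sh "$1"
}

# Prints the resolved path, or a hint when the tool is only reachable
# through per-account settings.
which_in_clean_env Rscript ||
  echo "Rscript is not on the default PATH; use 'module load' or a full path"
```

If the lookup succeeds in an interactive shell but fails here, the script depends on something in the account's dotfiles.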

HenrikBengtsson commented 2 years ago

In the 'next-release' branch, all software is now configured in ${LG_HOME}/lg3.conf. The validation that the software tools exist is done in https://github.com/UCSF-Costello-Lab/LG3_Pipeline/blob/e95824fc4ca5438ed6fd13a65b97a98c1826387d/scripts/utils.sh#L356.
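For readers who do not want to follow the link, an existence check of this kind might look roughly like the sketch below. It is not the code in scripts/utils.sh: the function name assert_tools_exist and the variable names in the usage comment are hypothetical.

```shell
# Hypothetical sketch, not the implementation in scripts/utils.sh: check that
# every path given as an argument is an existing, executable, non-directory file.
# All problems are reported before returning, rather than stopping at the first.
assert_tools_exist() (
  status=0
  for tool in "$@"; do
    if [ ! -x "$tool" ] || [ -d "$tool" ]; then
      echo "ERROR: software tool is missing or not executable: $tool" >&2
      status=1
    fi
  done
  return "$status"
)

# Example use, assuming the config file defines variables such as PYTHON and
# RSCRIPT (these variable names are assumptions):
#   . "${LG_HOME}/lg3.conf"
#   assert_tools_exist "$PYTHON" "$RSCRIPT" || exit 1
```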