Clinical-Genomics / BALSAMIC

Bioinformatic Analysis pipeLine for SomAtic Mutations In Cancer
https://balsamic.readthedocs.io/
MIT License

[User Story] Fix recurrent problems with BALSAMIC containers #1165

Open ivadym opened 1 year ago

ivadym commented 1 year ago

Need

As a BALSAMIC developer, I want the recurrent container issues and failures resolved so that testing and deployment become more efficient and reliable. This will let me work with BALSAMIC containers more effectively and confidently.

Suggested approach

  1. Each container should be atomic and tailored to a specific tool or component to ensure clarity and minimise unnecessary dependencies. Doing so would also mean changing and simplifying the Snakemake rules. IVDR compliance: deadline March 2024.

  2. Establish a unified structure and guidelines for the containers to ensure consistency across the repository. This can involve defining a common base image, adopting standard naming conventions, and providing clear documentation on container usage and deployment.

  3. Implement a comprehensive testing strategy for the containers to ensure their functionality and reliability. This can include unit tests, integration tests, and automated validation of the containerised tools.
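The automated validation mentioned in item 3 could start with a smoke test along these lines. This is a minimal sketch, not BALSAMIC's actual test suite: the `balsamic-<name>:latest` image tags, the `EXPECTED_EXECUTABLES` mapping, and the helper names are hypothetical, and it assumes the images can be run with Docker, ship a POSIX `sh`, and define no `ENTRYPOINT` that would intercept the command.

```python
import subprocess
from typing import Callable, Dict, List

# Hypothetical mapping: container name -> executable expected on PATH inside it.
EXPECTED_EXECUTABLES: Dict[str, str] = {
    "delly": "delly",
    "somalier": "somalier",
    "gatk": "gatk",
}


def image_tag(container: str) -> str:
    """Return an assumed image tag; the real naming scheme may differ."""
    return f"balsamic-{container}:latest"


def smoke_command(container: str) -> List[str]:
    """Build a docker command that exits 0 only if the tool is on PATH."""
    executable = EXPECTED_EXECUTABLES[container]
    return [
        "docker", "run", "--rm", image_tag(container),
        "sh", "-c", f"command -v {executable}",
    ]


def _run(cmd: List[str]) -> int:
    """Run a command and return its exit code without raising on failure."""
    return subprocess.run(cmd, check=False).returncode


def failing_containers(run: Callable[[List[str]], int] = _run) -> List[str]:
    """Return the names of containers whose smoke command exits non-zero."""
    return [
        name for name in EXPECTED_EXECUTABLES
        if run(smoke_command(name)) != 0
    ]
```

Passing the runner as an argument keeps the check testable without a container runtime; in CI, `failing_containers()` would be called with the default runner and the job would fail if the returned list is non-empty.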

Considered alternatives

Deviation

No response

Risk assessment

Risk assessment link

No response

System requirements assessed

Requirements affected by this story

No response

SOUPs

No response

Can be closed when

Blockers

No response

Anything else?

No response

khurrammaqbool commented 1 year ago

The following containers are single-tool/module containers:

  1. ascatNgs
  2. cadd https://github.com/Clinical-Genomics/BALSAMIC/pull/1222
  3. cnvpytor https://github.com/Clinical-Genomics/BALSAMIC/pull/1246
  4. delly
  5. somalier
  6. vcf2cytosure https://github.com/Clinical-Genomics/BALSAMIC/pull/1159
  7. htslib https://github.com/Clinical-Genomics/BALSAMIC/pull/1234
  8. cnvkit https://github.com/Clinical-Genomics/BALSAMIC/pull/1252
  9. purecn https://github.com/Clinical-Genomics/BALSAMIC/pull/1255
  10. gatk https://github.com/Clinical-Genomics/BALSAMIC/pull/1266

Some of these containers may still include one or two additional bioinformatic tools. Those tools should be removed once a separate container has been built for them, unless the container's main tool requires them as dependencies.

The following containers contain multiple tools/modules:

  1. align_qc
  2. annotate
  3. coverage_qc
  4. varcall_cnvkit (removed and replaced with purecn, cnvkit and htslib containers) https://github.com/Clinical-Genomics/BALSAMIC/pull/1278
  5. varcall_py3
  6. varcall_py27
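The two lists above can be kept as a small inventory to track which multi-tool containers still need splitting. This is only an illustrative sketch: the `Container` class and `pending_split` helper are hypothetical names, not part of BALSAMIC, and the entries simply mirror the lists in this comment.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Container:
    name: str
    multi_tool: bool
    removed: bool = False  # True once the container has been replaced


# Entries copied from the lists in this comment.
CONTAINERS: Tuple[Container, ...] = (
    *(Container(name, multi_tool=False) for name in (
        "ascatNgs", "cadd", "cnvpytor", "delly", "somalier",
        "vcf2cytosure", "htslib", "cnvkit", "purecn", "gatk",
    )),
    Container("align_qc", multi_tool=True),
    Container("annotate", multi_tool=True),
    Container("coverage_qc", multi_tool=True),
    # Replaced by the purecn, cnvkit and htslib containers.
    Container("varcall_cnvkit", multi_tool=True, removed=True),
    Container("varcall_py3", multi_tool=True),
    Container("varcall_py27", multi_tool=True),
)


def pending_split(containers: Iterable[Container]) -> List[str]:
    """Return multi-tool containers that have not been removed yet."""
    return [c.name for c in containers if c.multi_tool and not c.removed]


print(pending_split(CONTAINERS))
# ['align_qc', 'annotate', 'coverage_qc', 'varcall_py3', 'varcall_py27']
```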