Clinical-Genomics / BALSAMIC

Bioinformatic Analysis pipeLine for SomAtic Mutations In Cancer
https://balsamic.readthedocs.io/
MIT License

Reference genome hg38 #301

Closed hassanfa closed 4 years ago

hassanfa commented 4 years ago

Workflows in BALSAMIC are agnostic to which reference input is provided, but the correct reference file still needs to be provided. This also affects targeted sequencing, as all of our panels are in hg19 coordinates.

How should the correct reference genome be added? Possible solutions:

  1. Create two independent reference.json files

Pro: doesn't need any changes to the CLI; one can make separate reference.json files for different genome versions. This can even handle other organisms! Contra: pushes the bulk of the work onto reference generation, increases the complexity of the reference workflow, and not all genome versions have the same file names.

  2. Provide the genome version as a CLI option, and BALSAMIC picks the correct reference based on that

Pro: easier on the user side. Contra: can't easily handle unconventional reference files or organisms, won't be agnostic to whatever reference.json is provided, and still requires a lot of changes to reference.json.
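To make the trade-off concrete, here is a rough sketch of how the two options could sit side by side. It is not BALSAMIC's actual code: the function names, the `SUPPORTED_GENOME_VERSIONS` tuple, and the `<reference_dir>/<genome_version>/reference.json` layout are all assumptions made up for illustration.

```python
import json
from pathlib import Path
from typing import Optional

# Hypothetical: the genome versions a CLI option would accept (option 2).
SUPPORTED_GENOME_VERSIONS = ("hg19", "hg38")


def resolve_reference_json(
    reference_dir: Path,
    genome_version: Optional[str] = None,
    reference_json: Optional[Path] = None,
) -> Path:
    """Pick the reference.json to use.

    Option 1: an explicit reference.json path is given and used as-is,
    so any genome version or organism works.
    Option 2: only a genome version is given, and the path is derived
    from an assumed directory layout.
    """
    if reference_json is not None:
        return Path(reference_json)
    if genome_version not in SUPPORTED_GENOME_VERSIONS:
        raise ValueError(
            f"Unsupported genome version {genome_version!r}; "
            f"expected one of {SUPPORTED_GENOME_VERSIONS}"
        )
    # Assumed layout: <reference_dir>/<genome_version>/reference.json
    return Path(reference_dir) / genome_version / "reference.json"


def load_reference(path: Path) -> dict:
    """Read a reference.json file into a dictionary."""
    with open(path) as handle:
        return json.load(handle)
```

With something like this, `resolve_reference_json(base, genome_version="hg38")` covers the convenient case, while passing `reference_json=...` keeps the workflow agnostic for unconventional references.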

BALSAMIC version <4.2.2

Current affected rule, if Snakemake workflow related: reference generation workflow

hassanfa commented 4 years ago

@keyvanelhami Have a look at this and let's discuss it on Monday.

hassanfa commented 4 years ago

The BALSAMIC part of this is done. The rest will be done in other repositories. The PRs were #407 and #438.