Open HenrikBengtsson opened 5 years ago
What genome do they need?
A GRCh37 reference (i.e. chromosomes/sequences do not have the chr
prefix), e.g.
$ head -2 /home/shuntsman/ref/broad/Homo_sapiens_assembly19.fasta
>1 dna:chromosome chromosome:GRCh37:1:1:249250621:1
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
(the filename does not reflect GRCh37 but the file content does)
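The same check can be run over every header in the file rather than only the first lines. The following is a minimal sketch, not part of the pipeline; the function names `sequence_names` and `uses_chr_prefix` are hypothetical and introduced only for illustration.

```python
def sequence_names(fasta_lines):
    """Yield the sequence name (first word after '>') of each FASTA header line."""
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                yield fields[0]


def uses_chr_prefix(fasta_lines):
    """Return True if all sequence names start with 'chr', False if none do,
    and None if the file is empty or mixes both conventions."""
    flags = {name.startswith("chr") for name in sequence_names(fasta_lines)}
    if flags == {True}:
        return True
    if flags == {False}:
        return False
    return None


if __name__ == "__main__":
    # Header line copied from the example above; the sequence line is shortened.
    example = [
        ">1 dna:chromosome chromosome:GRCh37:1:1:249250621:1\n",
        "NNNNNNNN\n",
    ]
    print(uses_chr_prefix(example))
```

For a real file, an open file handle can be passed in place of the list, since the functions only iterate over lines.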
I am glad it is not GRCh38... :) The easiest way is to rename the chromosomes in the reference FASTA file and rebuild the BWA indexes. Otherwise we will have to rename the chromosomes in all annotations.
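A rough sketch of the renaming step, assuming it only needs to add a `chr` prefix to each FASTA header. The function names and file paths are hypothetical. It does not handle sequences whose hg19 names differ by more than the prefix (for example, GRCh37 `MT` versus hg19 `chrM`, or the unplaced contigs), so those would need an explicit mapping.

```python
def add_chr_prefix(line):
    """Prefix the sequence name on a FASTA header line with 'chr'.

    Sequence lines, and headers that already start with 'chr', are returned unchanged.
    """
    if line.startswith(">") and not line.startswith(">chr"):
        return ">chr" + line[1:]
    return line


def rename_fasta(src_path, dst_path):
    """Copy a FASTA file line by line, adding the 'chr' prefix to its headers."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(add_chr_prefix(line))


if __name__ == "__main__":
    # Header line copied from the example earlier in this thread.
    print(add_chr_prefix(">1 dna:chromosome chromosome:GRCh37:1:1:249250621:1\n"), end="")
```

After writing the renamed file with `rename_fasta()`, the indexes would be rebuilt by running `bwa index` on the new FASTA file.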
Ok, thx. We decided to stick with the current hg19 (there are too many unknowns in this pipeline and the payoff might be zero) - the rationale for using GRCh37 is not really there.
Background
The Ziv lab needs to switch the genome reference file for their needs. The first step in the pipeline where the reference comes into play is the BWA alignment step.
Task(s)
Make it possible to change the genome reference FASTA file for the alignment step. After that, look at the remaining steps and which other reference files need to be updated as well.