hammerlab / biokepi

Bioinformatics Ketrew Pipelines
Apache License 2.0

Add support for multi-sample indel realignment #125

Closed iskandr closed 8 years ago

iskandr commented 8 years ago

When doing indel realignment on multiple samples from the same patient (e.g. normal + tumors), we almost certainly want to perform indel realignment jointly on the samples (rather than independently).

This can be accomplished by providing all the BAM files as inputs to both the target creator and the indel realigner, and then specifying -nWayOut realigned.bam.
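
A rough sketch of what that could look like, assuming a GATK 3-style command line (`-T RealignerTargetCreator` and `-T IndelRealigner`, one `-I` per BAM). The helper name `joint_realignment_commands`, the file names, and the jar path are hypothetical placeholders, not anything taken from biokepi; the `-nWayOut` value is copied from the comment above.

```python
import shlex


def joint_realignment_commands(bams,
                               reference="reference.fasta",      # placeholder path
                               intervals="joint.intervals",      # placeholder path
                               n_way_out="realigned.bam",        # value from the comment above
                               gatk_jar="GenomeAnalysisTK.jar"):  # placeholder path
    """Build argument lists for a joint (multi-sample) indel realignment.

    Hypothetical helper: it only assembles the two command lines and
    does not run GATK.
    """
    if len(bams) < 2:
        raise ValueError("joint realignment needs at least two BAM files")

    # Every BAM is passed with its own -I flag, to both tools.
    inputs = [arg for bam in bams for arg in ("-I", bam)]
    base = ["java", "-jar", gatk_jar]

    # Step 1: compute realignment targets across all samples at once.
    target_creator = (base + ["-T", "RealignerTargetCreator", "-R", reference]
                      + inputs + ["-o", intervals])

    # Step 2: realign all samples together, writing one output per input.
    realigner = (base + ["-T", "IndelRealigner", "-R", reference]
                 + inputs + ["-targetIntervals", intervals,
                             "-nWayOut", n_way_out])
    return target_creator, realigner


for command in joint_realignment_commands(["normal.bam", "tumor.bam"]):
    print(" ".join(shlex.quote(arg) for arg in command))
```

Building argument lists rather than one shell string keeps the per-sample `-I` flags easy to check and avoids quoting problems with unusual file names.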

For more information see:

arahuja commented 8 years ago

@timodonnell opened an issue on this earlier: #41

iskandr commented 8 years ago

Thanks @arahuja, didn't see that.

What's the proper GitHub etiquette? Should I close this issue and copy my text over to that one?

arahuja commented 8 years ago

Not sure there is any? Up to you I guess?

iskandr commented 8 years ago

Closing duplicate, #41.

Here's the text from @timodonnell:

My understanding of https://github.com/hammerlab/biokepi/blob/77abf040aa85cfe247a21df15e98bf68fc192839/src/lib/biokepi_common_pipelines.ml#L41 is that we are running the indel realigner separately for tumor / normal. Wenyi Wang from MD Anderson says we'll get better results if we run it jointly (which I think means just giving GATK both BAMs by specifying the -I option twice). We should support this in biokepi (and test whether we in fact get different results).

smondet commented 8 years ago

Just pointing out that @iskandr's second link says that the Broad indel-realigns BAMs separately “in production.”

(I'm implementing this anyway; I just got lost in GATK's documentation ☺)