Run SamBamba + FreeBayes on a PODK node against the 234G version of NA12878

BD2KGenomics / adam-contest-eval

Run ADAM variant calling against the 1000genomes.org dataset


Run SamBamba + FreeBayes on a PODK node against the 234G version of NA12878 #4

Closed hannes-ucsc closed 9 years ago

hannes-ucsc commented 9 years ago

Beau knows where to find that BAM.

almussel commented 9 years ago

We are running these as a benchmark to compare to ADAM, yes? Which of the programs in sambamba are the ones we're concerned with, and what data on those runs do we care about?

beaunorgeot commented 9 years ago

The goal is an end-to-end germline pipeline that takes a BAM as input and outputs qualified variants. The important data points are time (for each step in the pipeline) and variant concordance (within acceptable standards). Does this make sense? I can chat this afternoon if you'd like.

On Mon, Apr 20, 2015 at 11:36 AM, mbaudrey notifications@github.com wrote:

We are running these as a benchmark to compare to ADAM, yes? Which of the programs in sambamba are the ones we're concerned with, and what data on those runs do we care about?

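
A minimal sketch of how the two data points named above (per-step time and variant concordance) could be collected. The helper names `timed`, `variant_keys`, and `concordance` are hypothetical, and so is the definition of concordance used here: exact matches on (chromosome, position, ref, alt), reported as precision and recall against a truth set. The thread does not say which concordance measure or threshold counts as acceptable.

```python
import time


def timed(label, func, timings):
    """Run func(), store its wall-clock duration in timings[label], return its result."""
    start = time.perf_counter()
    result = func()
    timings[label] = time.perf_counter() - start
    return result


def variant_keys(vcf_text):
    """Collect (chrom, pos, ref, alt) keys from the data lines of VCF text."""
    keys = set()
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        for allele in alt.split(","):  # one key per alternate allele
            keys.add((chrom, int(pos), ref, allele))
    return keys


def concordance(calls, truth):
    """Assumed definition: exact-key overlap, as precision and recall."""
    shared = calls & truth
    return {
        "shared": len(shared),
        "precision": len(shared) / len(calls) if calls else 0.0,
        "recall": len(shared) / len(truth) if truth else 0.0,
    }
```

Each pipeline step could be wrapped in `timed("sort", ...)`, `timed("markdup", ...)`, and so on, so that `timings` ends up holding one entry per step.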

fnothaft commented 9 years ago

@mbaudrey you'd want sort and markdup from sambamba. You should be able to run these through a pipe with freebayes.
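
A rough sketch of the piping idea, assuming the steps are driven from Python. `run_pipe` is a hypothetical helper that connects each command's stdout to the next command's stdin. No sambamba or freebayes command lines are hard-coded, because which invocations can stream through stdin/stdout depends on the installed versions and should be checked against their help output.

```python
import subprocess
import sys


def run_pipe(commands):
    """Chain commands so each one's stdout feeds the next one's stdin.

    commands is a list of argument lists, e.g. [["tool_a", ...], ["tool_b", ...]].
    Returns the final command's stdout (bytes) and every exit code in order.
    """
    procs = []
    prev_stdout = None
    for cmd in commands:
        proc = subprocess.Popen(cmd, stdin=prev_stdout, stdout=subprocess.PIPE)
        if prev_stdout is not None:
            # Close our copy so the upstream process gets SIGPIPE
            # if the downstream process exits early.
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)
    output, _ = procs[-1].communicate()
    codes = [p.wait() for p in procs]
    return output, codes


if __name__ == "__main__":
    # Stand-in commands; a real run would pass the duplicate-marking
    # and variant-calling command lines instead.
    producer = [sys.executable, "-c", "print('chr1')"]
    consumer = [sys.executable, "-c",
                "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    print(run_pipe([producer, consumer]))
```

Checking every exit code matters here: when only the last code is inspected, a failure in an upstream step of the pipe can go unnoticed.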

hannes-ucsc commented 9 years ago

Status: in progress. Tools are currently installed on an m3.xlarge instance; will need to restart as r3.???.

almussel commented 9 years ago

What is the most appropriate reference genome to use for this? Freebayes needs one.

beaunorgeot commented 9 years ago

Your reference genome needs to match your sample. Are you able to determine which reference your sample was aligned/mapped to? If so, that's the reference you want. Let me know if you need help determining which reference the alignment used.

On Fri, Apr 24, 2015 at 3:19 PM, mbaudrey notifications@github.com wrote:

What is the most appropriate reference genome to use for this? Freebayes needs one.

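
One way to check which reference a BAM was aligned to is to compare the contig names and lengths in its header `@SQ` lines (`SN` and `LN` tags) with a candidate reference's `.fai` index. The sketch below assumes the header text was dumped beforehand, for example with `samtools view -H`, and that a `.fai` index exists for the candidate FASTA. The function names are hypothetical. A match on names and lengths is strong evidence, not proof, that the reference is the right one.

```python
def parse_sq_lines(header_text):
    """Return {contig_name: length} from the @SQ lines of SAM/BAM header text."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields look like TAG:value and are tab-separated.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        contigs[fields["SN"]] = int(fields["LN"])
    return contigs


def parse_fai(fai_text):
    """Return {contig_name: length} from FASTA index (.fai) text."""
    contigs = {}
    for line in fai_text.splitlines():
        if line.strip():
            name, length = line.split("\t")[:2]  # first two columns: name, length
            contigs[name] = int(length)
    return contigs


def reference_matches(header_text, fai_text):
    """True when both sources list exactly the same contigs with the same lengths."""
    return parse_sq_lines(header_text) == parse_fai(fai_text)
```

A common mismatch is naming style: a header listing `1`, `2`, ... will not match an index listing `chr1`, `chr2`, ..., even when the lengths agree.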