hackseq / hackseq_projects_2016

Project 10: Develop an open-source, automated pipeline to close bacterial genomes with long read technologies #1

Open ttimbers opened 8 years ago

ttimbers commented 8 years ago

Project: I would like to make an open-source, readily available, automated pipeline that closes bacterial genomes with long-read technologies and then fixes errors by mapping short reads to the assemblies. Folks from Illumina and PacBio have already said they would be happy to help out, and I think ABySS might be one of the tools we could use to work on this.

Project Lead: Ben Busby / @DCGenomics / Genomics Outreach Coordinator / NCBI

sjackman commented 8 years ago

I'm a developer of ABySS, and I'll be at Hackseq leading up project #9. I'm happy to offer up any help with ABySS that you may need.

DCGenomics commented 8 years ago

Awesome!

sjackman commented 8 years ago

We're planning to have a Docker image with a bunch of bioinformatics software preinstalled running on machines at the BC Cancer Agency Genome Sciences Centre during the Hackathon. Which bioinformatics software do you plan to use for your project? In particular, is there any software that you plan to use that is not already listed here? http://www.bcgsc.ca/services/orca

jmicrobe commented 8 years ago

I'm interested in the automated pipeline part, are there any particular tools that are being considered for this project?

DCGenomics commented 8 years ago

Discussed tools for PacBio assembly with @sjackman the other night.

Shaun, can you throw that list of tools up here?

Also, I think one of the best things about these events is that people suggest tools, so if there are tools that people find interesting, post 'em up here.

Cheers!

Ben

sjackman commented 8 years ago

Assemble

Polish

jmicrobe commented 8 years ago

Pipeline

sjackman commented 8 years ago

Check out Circlator: http://sanger-pathogens.github.io/circlator/

A tool to circularize genome assemblies. The algorithm and benchmarks are described in the Genome Biology manuscript. Citation: "Circlator: automated circularization of genome assemblies using long sequencing reads", Hunt et al, Genome Biology 2015 Dec 29;16(1):294. doi: 10.1186/s13059-015-0849-0. PMID: 26714481.

jmicrobe commented 8 years ago

I haven't used this, but DBG2OLC is designed for hybrid short/long-read assembly, using the short reads as anchors.