Closed kubu4 closed 7 years ago
Working on this on Hyak. It's nominally functional in its current state, but I'm having difficulties with the walltime argument in sbatch, so the job kills itself after an hour. Still in progress.
Currently running with a 10-day time request. Will update when finished.
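For anyone hitting the same walltime problem: Slurm's sbatch takes the limit through its `--time` option, and one accepted format is `days-hours:minutes:seconds`. Below is a minimal sketch of a helper that builds that string. The function names are made up for illustration, and whether the one-hour kill on Hyak came from a malformed time string or from something else (e.g. a partition default) is not something this thread confirms.

```python
def slurm_walltime(days=0, hours=0, minutes=0, seconds=0):
    """Format a duration as a Slurm D-HH:MM:SS time string."""
    total = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
    if total <= 0:
        raise ValueError("walltime must be positive")
    # Normalize so that, e.g., 36 hours becomes 1 day and 12 hours.
    d, rem = divmod(total, 86400)
    h, rem = divmod(rem, 3600)
    m, s = divmod(rem, 60)
    return f"{d}-{h:02d}:{m:02d}:{s:02d}"


def sbatch_time_directive(**duration):
    """Return the '#SBATCH --time=...' line for the top of a batch script."""
    return f"#SBATCH --time={slurm_walltime(**duration)}"


if __name__ == "__main__":
    # The 10-day request mentioned above.
    print(sbatch_time_directive(days=10))
```

The resulting line goes in the header of the batch script, before any commands; the same value can instead be passed on the command line as `sbatch --time=...`.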
We have new PacBio data for Olympia oyster: http://owl.fish.washington.edu/nightingales/O_lurida/20170323_pacbio/
We sequenced 10 SMRT cells. As such, there is a subdirectory for each SMRT cell's data.
Here's the PacBio software recommendations page: https://github.com/PacificBiosciences/Bioinformatics-Training/wiki/Large-Genome-Assembly-with-PacBio-Long-Reads
I think we might as well try the Gap Filling suggestion (using PBJelly). There isn't too much documentation, but I think you can figure it out. After reading the "JellyReadme.txt" file, look at the two .xml files that are provided (one in the jellyExample folder, and TemplateProtocol.xml) to get an idea of what you'll need.
For the PacBio input data, I think you'll use the filtered_subreads.fasta files, which are found in the top level of each of the SMRT cell directories (Note: The files we have are gzipped and I don't think PBJelly will accept gzipped input files).
For the existing Olympia oyster reference, we have two options (Note: either of the following files will need to be renamed - see the PBJelly readme for explanation):
Scaffold assembly: http://owl.fish.washington.edu/O_lurida_genome_assemblies_BGI/20160314/scaffold.fa.fill
Contig assembly: http://owl.fish.washington.edu/O_lurida_genome_assemblies_BGI/20161201/cdts-hk.genomics.cn/Ostrea_lurida/Ostrea_lurida.fa
I'll let you figure out which to try it with (or, heck, try 'em both)!
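The two prep steps noted above (gunzipping the subreads and renaming the reference) could be sketched roughly like this, assuming the data has already been downloaded locally. The glob pattern, the function names, and the paths in the usage comment are hypothetical. The `.fasta` suffix check reflects my assumption about what the PBJelly readme asks for, so confirm it there before relying on it.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_subreads(root, pattern="*filtered_subreads.fasta.gz"):
    """Decompress every matching file under root, next to the original.

    Returns the list of decompressed paths. The pattern is an assumption
    about how the gzipped subread files are named.
    """
    outputs = []
    for gz_path in sorted(Path(root).rglob(pattern)):
        out_path = gz_path.with_suffix("")  # drops only the trailing ".gz"
        if not out_path.exists():
            with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
                shutil.copyfileobj(src, dst)
        outputs.append(out_path)
    return outputs


def copy_reference(reference, new_name):
    """Copy the reference to new_name, leaving the original untouched.

    Assumption: PBJelly wants the reference file name to end in ".fasta"
    (check the readme); anything else is rejected here.
    """
    dest = Path(new_name)
    if dest.suffix != ".fasta":
        raise ValueError(f"expected a .fasta file name, got {dest.name}")
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(reference, dest)
    return dest


# Example usage (hypothetical local paths):
# subreads = gunzip_subreads("20170323_pacbio")
# ref = copy_reference("scaffold.fa.fill", "pbjelly_ref/scaffold.fasta")
```

Copying rather than renaming in place keeps the original assembly files matching what's on owl, which should make it easier to try both the scaffold and contig assemblies.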