EichlerLab / smrtsv2

Structural variant caller
MIT License

How to run smrtsv2 with fastq/fasta files #24

Closed bioysu closed 5 years ago

bioysu commented 5 years ago

I have downloaded PacBio reads from NCBI SRA in sra format. I can extract fastq/fasta files from the sra files. How can I use these files as input for smrtsv2?

paudano commented 5 years ago

SMRT-SV will not be able to use data from FASTA or FASTQ files. The output from sequencers is in a PacBio BAM format, and it contains annotations needed to polish assembled contigs. Without the annotations in the original BAM files, the pipeline will fail during contig assembly.

To run this sample, you will need to obtain its PacBio subread BAM files, which are output by the Sequel sequencing platform. If the data are from the RS II platform and you can get the .bax.h5 files, use bax2bam to convert each cell to a PacBio BAM (3 .bax.h5 files per cell). bax2bam is a PacBio utility, and SMRT-SV builds it along with its other dependencies (it will appear as dep/bin/bax2bam).
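
For the RS II case, the per-cell conversion could be scripted roughly as follows. This is only a sketch under assumptions, not part of SMRT-SV: it assumes the usual RS II naming where the three files of a cell share a movie prefix and end in `.1.bax.h5`, `.2.bax.h5`, and `.3.bax.h5`, and it assumes bax2bam takes an output prefix via `-o` (check `dep/bin/bax2bam --help` for your build). The function names and the `bam` output directory are made up for illustration.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed RS II naming: <movie>.1.bax.h5, <movie>.2.bax.h5, <movie>.3.bax.h5
BAX_PATTERN = re.compile(r"^(?P<movie>.+)\.(?P<part>[123])\.bax\.h5$")


def group_bax_by_cell(paths):
    """Group .bax.h5 files by movie prefix (one group per cell)."""
    groups = defaultdict(list)
    for path in map(Path, paths):
        match = BAX_PATTERN.match(path.name)
        if match:  # skip anything that is not a .N.bax.h5 file
            groups[match.group("movie")].append(path)
    return {movie: sorted(files) for movie, files in sorted(groups.items())}


def build_bax2bam_commands(groups, bax2bam="dep/bin/bax2bam", out_dir="bam"):
    """Build one bax2bam argument list per cell; nothing is executed here."""
    commands = []
    for movie, files in groups.items():
        if len(files) != 3:
            raise ValueError(f"{movie}: expected 3 .bax.h5 files, found {len(files)}")
        # "-o <prefix>" is assumed to set the output file prefix.
        commands.append([bax2bam, *map(str, files), "-o", str(Path(out_dir) / movie)])
    return commands


if __name__ == "__main__":
    # Hypothetical input directory holding the downloaded .bax.h5 files.
    cells = group_bax_by_cell(Path("raw").glob("*.bax.h5"))
    for command in build_bax2bam_commands(cells):
        print(" ".join(command))
```

The script only prints the commands so they can be checked first. To actually run them, each list can be passed to `subprocess.run(command, check=True)` after creating the output directory.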