NCBI-Hackathons / ga4gh-ncbi-api

Access data in NCBI using GA4GH methods
MIT License

Support unaligned datasets with on-the-fly magicblast #33

Open pamelarussell opened 7 years ago

pamelarussell commented 7 years ago

To support unaligned NCBI datasets (the vast majority):

(1) Extract the target region from the reference sequence and build a BLAST database from it
(2) Align the dataset against that region with Magic-BLAST

Example:

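#!/usr/bin/env bash

# get fasta of region (extraction step not shown; the region FASTA is assumed to be saved as chunk_chr1)
# build a nucleotide blast db from the region
makeblastdb -dbtype nucl -in chunk_chr1
# blast dataset against region
./ncbi-magicblast-1.2.0/bin/magicblast -sra SRR000006 -num_threads 16 -db chunk_chr1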
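
The two steps above could be wrapped in one script, roughly as sketched below. This is a sketch under assumptions, not code from this repo: the function names `run` and `align_region`, the `DRY_RUN` switch, and the output file names are hypothetical, and `samtools faidx` is swapped in for the extraction step that the example leaves out. It also assumes `samtools`, `makeblastdb` and `magicblast` are on the PATH.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Run a command, or only print it when DRY_RUN=1 (hypothetical switch, handy for testing).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: align one SRA run against one reference region.
#   $1 reference FASTA
#   $2 region in samtools syntax, e.g. chr1:1000-2000
#   $3 SRA accession, e.g. SRR000006
#   $4 number of threads (defaults to 4)
align_region() {
    local ref="$1" region="$2" sra="$3" threads="${4:-4}"
    # Derive a file-system-friendly name: chr1:1000-2000 -> chunk_chr1_1000_2000
    local db="chunk_${region//:/_}"
    db="${db//-/_}"
    # (1) extract the region and build a BLAST database from it
    run samtools faidx -o "${db}.fa" "$ref" "$region"
    run makeblastdb -in "${db}.fa" -dbtype nucl -out "$db"
    # (2) align the SRA run against the region database
    run magicblast -sra "$sra" -db "$db" -num_threads "$threads" -out "${db}.sam"
}

# Example call (commented out so sourcing the file runs nothing):
# align_region GRCh38.fa chr1:1000-2000 SRR000006 16
```

Setting `DRY_RUN=1` prints the three commands instead of running them, which makes it easy to check the generated names before downloading any reads.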
david4096 commented 7 years ago

+1 this would be really cool, batch alignments may be a thing of the past!