BreakerLab / dimpl

DIMPL: Discovery of Intergenic Motifs PipeLine
MIT License

Nextflow Workflow/Process: Build Search Database #28

Open kenibrewer opened 10 months ago


DIMPL requires a search database: a large collection of bacterial genomes from which all protein-coding regions have been removed. The current version of DIMPL only provides a fixed search database, distributed via GLOBUS-FTP.

DIMPL v2 should support building custom search databases from a collection of genome FASTA files and their annotation files. This should be implemented as a Nextflow workflow that reads a samplesheet in which each row pairs a genome file with its annotation file, and runs each pair through a process that extracts the intergenic regions (IGRs).
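
To make the per-sample logic concrete, here is a minimal Python sketch of what the IGR extraction step might do for each samplesheet row. It is not DIMPL's implementation and it is not Nextflow code. The samplesheet column names (`sample`, `fasta`, `annotation`) and all function names are assumptions made for illustration. It also assumes CDS coordinates have already been parsed from the annotation file into 0-based half-open intervals, and it treats each sequence as linear, ignoring strand and circular wrap-around.

```python
import csv
import io


def read_samplesheet(handle):
    """Yield (sample, fasta_path, annotation_path) for each samplesheet row.

    The column names are hypothetical; the issue does not define them.
    """
    for row in csv.DictReader(handle):
        yield row["sample"], row["fasta"], row["annotation"]


def intergenic_intervals(seq_len, cds_intervals):
    """Return the 0-based half-open intervals not covered by any CDS."""
    gaps = []
    cursor = 0  # end of the furthest CDS seen so far
    for start, end in sorted(cds_intervals):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)  # handles overlapping and nested CDSs
    if cursor < seq_len:
        gaps.append((cursor, seq_len))
    return gaps


def extract_igrs(sequence, cds_intervals, min_len=1):
    """Return the intergenic subsequences that are at least min_len long."""
    return [
        sequence[start:end]
        for start, end in intergenic_intervals(len(sequence), cds_intervals)
        if end - start >= min_len
    ]


if __name__ == "__main__":
    sheet = io.StringIO("sample,fasta,annotation\necoli,ecoli.fna,ecoli.gff\n")
    print(list(read_samplesheet(sheet)))
    # Two overlapping CDSs in the middle of a toy 12-base sequence.
    print(extract_igrs("AAACCCGGGTTT", [(3, 6), (5, 9)]))
```

In a Nextflow workflow, each samplesheet row would presumably become one channel item, and a function like `extract_igrs` would run inside the process that handles that genome/annotation pair. A `min_len` filter is included because very short intergenic gaps may not be useful search targets, but whether DIMPL needs such a threshold is an open design choice.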