Run each chromosome/contig independently

pauline-ng / SIFT4G_Create_Genomic_DB

Create genomic databases with SIFT predictions. Input is an organism's genomic DNA (.fa) file and the gene annotation file (.gtf). Output will be a database that can be used with SIFT4G_Annotator.jar to annotate VCF files.

GNU General Public License v3.0

Hi Pauline,

I am trying to create a database for a mammalian genome from RefSeq. The run takes several days and occasionally fails with errors such as running out of memory. Would it be okay to parallelize the database creation by running it independently for each chromosome/contig? After all jobs have completed, I would keep the .gz, .regions, and *_SIFTDB_stats.txt files from each /. Do you think this would work?

Thanks, Jacqueline
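
The per-contig split described in the question could be prepared roughly as follows. This is only a sketch of the input-splitting step, not part of SIFT4G_Create_Genomic_DB: the function names `split_fasta` and `split_gtf` and the output layout are hypothetical, and it assumes a plain-text FASTA whose sequence name is the first token after `>` and a tab-separated GTF whose first column is that same name. Whether the tool produces valid results when run on such per-contig inputs is exactly what the question asks, so this does not settle it.

```python
from collections import defaultdict
from pathlib import Path


def split_fasta(fasta_path, out_dir):
    """Write one FASTA file per sequence; return {contig_name: output_path}."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    handle = None
    with open(fasta_path) as src:
        for line in src:
            if line.startswith(">"):
                # A new record starts: close the previous contig's file.
                if handle is not None:
                    handle.close()
                name = line[1:].split()[0]  # first token of the header
                paths[name] = out_dir / f"{name}.fa"
                handle = open(paths[name], "w")
            if handle is not None:
                handle.write(line)
    if handle is not None:
        handle.close()
    return paths


def split_gtf(gtf_path, out_dir):
    """Write one GTF file per contig; return {contig_name: output_path}.

    Records are grouped in memory for simplicity, so a very large
    annotation file may need a streaming variant instead.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    records = defaultdict(list)
    with open(gtf_path) as src:
        for line in src:
            # Skip header/comment lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            contig = line.split("\t", 1)[0]
            records[contig].append(line)
    paths = {}
    for contig, lines in records.items():
        paths[contig] = out_dir / f"{contig}.gtf"
        paths[contig].write_text("".join(lines))
    return paths
```

Each resulting `.fa`/`.gtf` pair could then be given to a separate job, which keeps the memory needed by any single run bounded by the largest contig.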

Run each chromosome/contig independently #17