sanjaynagi / rna-seq-pop

Snakemake workflow for Illumina RNA-sequencing experiments - extract population genomic signals from RNA-Seq data
https://sanjaynagi.github.io/rna-seq-pop/
MIT License

allowing custom snpEff databases #39

Closed vmkalbskopf closed 1 year ago

vmkalbskopf commented 3 years ago

A new rule for creating custom snpEff databases.

vmkalbskopf commented 3 years ago

Building a custom database requires modifying snpEff files and directories. However, snpEff is installed through conda by Snakemake, into a randomly named folder in the envs directory. I don't know how to modify the contents of that directory to build the custom DB.
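
One possible way around editing the conda folder is to keep the custom database in a separate working directory and point snpEff at it with its `-c` (config file) and `-dataDir` options. The sketch below is only an illustration under that assumption: the function name, its arguments, and the directory layout under `work_dir` are hypothetical and not part of the workflow. It only prepares files and returns the build command; it does not run snpEff.

```python
from pathlib import Path
import shutil


def prepare_custom_snpeff_db(work_dir, genome, gff, fasta, base_config):
    """Lay out a snpEff data directory outside the conda env.

    Returns the `snpEff build` command as an argument list (not executed here).
    All argument names are illustrative assumptions.
    """
    work_dir = Path(work_dir).resolve()
    data_dir = work_dir / "data"
    genome_dir = data_dir / genome
    genome_dir.mkdir(parents=True, exist_ok=True)

    # snpEff looks for these fixed file names inside data/<genome>/.
    shutil.copy(gff, genome_dir / "genes.gff")
    shutil.copy(fasta, genome_dir / "sequences.fa")

    # Copy a stock snpEff.config and register the new genome in the copy,
    # so the file inside the conda environment is never touched.
    config = work_dir / "snpEff.config"
    stock = Path(base_config).read_text()
    config.write_text(stock + f"\n{genome}.genome : {genome}\n")

    return [
        "snpEff", "build", "-gff3", "-v",
        "-c", str(config),
        "-dataDir", str(data_dir),
        genome,
    ]
```

The returned list could be passed to `subprocess.run` or joined into the `shell:` section of a Snakemake rule.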

sanjaynagi commented 3 years ago

OK. What might be easier for now is to just run your own snpEff version outside of the pipeline and produce the files required by the workflow. I think they are called annot.variants.{chrom}.vcf.gz. Then you should be able to continue running the workflow. Cheers
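
As a rough sketch of that manual step, the helper below builds one shell command per chromosome that annotates a VCF with an external snpEff and writes `annot.variants.{chrom}.vcf.gz`. The input file pattern, the genome name, and the function itself are assumptions for illustration; the thread does not state what the workflow's input VCFs are called, so adjust the pattern to match.

```python
def annotate_commands(chroms, genome, input_pattern="variants.{chrom}.vcf", config=None):
    """Build one shell command per chromosome (strings only, nothing is run).

    `input_pattern` is a hypothetical name for the unannotated VCFs.
    """
    commands = []
    for chrom in chroms:
        in_vcf = input_pattern.format(chrom=chrom)
        out_vcf = f"annot.variants.{chrom}.vcf.gz"
        # Optionally point snpEff at a custom config file.
        config_arg = f"-c {config} " if config else ""
        # snpEff writes the annotated VCF to stdout; bgzip compresses it.
        commands.append(
            f"snpEff ann {config_arg}{genome} {in_vcf} | bgzip -c > {out_vcf}"
        )
    return commands


if __name__ == "__main__":
    for cmd in annotate_commands(["2L", "2R"], "mygenome"):
        print(cmd)
```

Each command could then be run from the workflow's results directory so the outputs land where the downstream rules expect them.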

vmkalbskopf commented 3 years ago

Yeah, that's so much easier. Then perhaps we can add an option to specify the directory with the snpEff jar and config files.
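
Such an option might look roughly like the sketch below. It assumes a hypothetical `snpeff_dir` key in the workflow config pointing at a directory that contains `snpEff.jar` and `snpEff.config`; when the key is absent it falls back to the `snpEff` wrapper from the conda environment. None of these names come from the workflow itself.

```python
from pathlib import Path


def snpeff_command(config, subcommand, *args):
    """Return a snpEff command as an argument list.

    Uses a user-supplied installation when `snpeff_dir` (hypothetical key)
    is set in the config, otherwise the conda-provided wrapper script.
    """
    snpeff_dir = config.get("snpeff_dir")
    if not snpeff_dir:
        return ["snpEff", subcommand, *args]

    jar = Path(snpeff_dir) / "snpEff.jar"
    cfg = Path(snpeff_dir) / "snpEff.config"
    # Fail early with a clear message if the directory is incomplete.
    for required in (jar, cfg):
        if not required.is_file():
            raise FileNotFoundError(f"{required} not found in the snpEff directory")

    return ["java", "-jar", str(jar), subcommand, "-c", str(cfg), *args]
```

A rule could then call, for example, `snpeff_command(config, "ann", "mygenome", "in.vcf")` and join the result into its shell command.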

sanjaynagi commented 3 years ago

Yeah, that's an option. The workflow actually used to require a manual installation of snpEff 5.0.0, as there was an error when downloading the Aedes_aegypti database with snpEff 4.3.1.

But I still think the best fix is probably to replace the snpSift dependency with bcftools, so that we can use snpEff 5.0.0 with conda. Better to have fewer dependencies anyway, and bcftools is already used and almost certainly more reliable.
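
The thread does not say what snpSift is used for in the workflow. Assuming it extracts fields from the annotated VCF into a table, a bcftools-based replacement could be sketched with `bcftools query`, as below. The function name and the default field list are illustrative assumptions only.

```python
def bcftools_extract_command(vcf, fields=("CHROM", "POS", "REF", "ALT", "INFO/ANN")):
    """Return a `bcftools query` command that prints the given fields as TSV.

    The default fields are an assumption; INFO/ANN is the annotation
    tag that snpEff adds to each record.
    """
    # bcftools expands the literal \t and \n escapes in the format string itself.
    fmt = "\\t".join(f"%{field}" for field in fields) + "\\n"
    return ["bcftools", "query", "-f", fmt, vcf]


if __name__ == "__main__":
    print(" ".join(bcftools_extract_command("annot.variants.2L.vcf.gz")))
```

Filtering steps, if snpSift is used for those, would need `bcftools view` with an include or exclude expression instead.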