smith-chem-wisc / Spritz

Software for RNA-Seq analysis to create sample-specific proteoform databases from RNA-Seq data
https://smith-chem-wisc.github.io/Spritz/
MIT License

Possible to obtain the SnpEff version used in Spritz? #190

Closed: finnbenned closed this issue 3 years ago

finnbenned commented 4 years ago

Hi,

I am really impressed with Spritz version 0.2.2. It seems that you are using a special version of SnpEff (from the VCF file combined.Spritz.sneff.vcf: SnpEffVersion="4.3u (build 2020-06-25 22:57), by Pablo Cingolani"). It outputs an extremely useful and well-structured FASTA file in which, for example, the variants within a protein entry are combined depending on whether they are homo- or heterozygous. Very neat!

Would it be possible to obtain this SnpEff build, which appears quite different from the 4.3u version available on the SnpEff GitHub site?

Best, Finn

trishorts commented 4 years ago

Thanks for your question. We will try to respond soon.

acesnik commented 4 years ago

Great to hear! Yes, the customized version of SnpEff is publicly available here: https://github.com/smith-chem-wisc/SnpEff

acesnik commented 4 years ago

And this is the current release of that fork: https://github.com/smith-chem-wisc/SnpEff/releases/tag/4.3_SCW1
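For anyone who wants to confirm which SnpEff build annotated a given VCF, the version string quoted in the question lives in a `##SnpEffVersion=` meta-information line in the file header. Below is a minimal Python sketch of reading it. The helper name `read_snpeff_version` is hypothetical, and the example header simply reuses the line quoted above rather than a real Spritz output file.

```python
import io
import re


def read_snpeff_version(lines):
    """Return the value of the ##SnpEffVersion header line, or None if absent."""
    for line in lines:
        if not line.startswith("##"):
            break  # meta-information lines end where the #CHROM header begins
        match = re.match(r'##SnpEffVersion="?(.*?)"?\s*$', line)
        if match:
            return match.group(1)
    return None


# Stand-in for an open VCF file; the header content is taken from the question.
example = io.StringIO(
    '##fileformat=VCFv4.2\n'
    '##SnpEffVersion="4.3u (build 2020-06-25 22:57), by Pablo Cingolani"\n'
    '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
)
version = read_snpeff_version(example)
print(version)
```

With a real file, the same function can be called on an open file handle, for example `with open(path) as handle: read_snpeff_version(handle)`.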

acesnik commented 4 years ago

I should also clarify that the combinatorics regarding homo- and heterozygous variants is part of this Spritz module: https://github.com/smith-chem-wisc/Spritz/tree/master/Spritz/TransferUniProtModifications. The fasta file output by SnpEff does not have variants applied with any combinatorics.
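To illustrate what "combinatorics" can mean here, below is a minimal Python sketch. It is not the code in the TransferUniProtModifications module. It assumes a simplified model in which homozygous substitutions are always applied and each heterozygous substitution is either applied or not, so every combination is enumerated. The function name `variant_sequences` and the `(position, ref, alt, zygosity)` tuple layout are hypothetical.

```python
from itertools import product


def variant_sequences(protein, variants):
    """Enumerate protein sequences under a simplified zygosity model.

    variants: list of (1-based position, ref residue, alt residue, "hom" or "het").
    Homozygous variants appear in every sequence; each heterozygous variant
    is either present or absent, giving 2**n_het sequences in total.
    """
    homozygous = [v for v in variants if v[3] == "hom"]
    heterozygous = [v for v in variants if v[3] == "het"]
    sequences = []
    for choices in product([False, True], repeat=len(heterozygous)):
        residues = list(protein)
        applied = homozygous + [v for v, use in zip(heterozygous, choices) if use]
        for position, ref, alt, _ in applied:
            if residues[position - 1] != ref:
                raise ValueError(f"reference mismatch at position {position}")
            residues[position - 1] = alt
        sequences.append("".join(residues))
    return sequences


# One homozygous (K2R) and one heterozygous (A4V) substitution on a toy sequence.
seqs = variant_sequences("MKTAY", [(2, "K", "R", "hom"), (4, "A", "V", "het")])
print(seqs)  # ['MRTAY', 'MRTVY']
```

The number of sequences doubles with each heterozygous variant, so a real implementation would likely need some limit on how many combinations it emits per protein.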

acesnik commented 4 years ago

(I edited my previous comment: the fasta output by SnpEff (not Spritz) does not have variants applied with combinatorics.)

acesnik commented 3 years ago

Hi Finn,

I hope this answered your question about Spritz's custom SnpEff version and the subsequent Spritz scripts.

I also wanted to update you that we released Spritz 0.2.3 with some improvements to the types of inputs that can be used. Single-end RNA-Seq data and mixed input data types can now be used.

Spritz was also accepted for publication in J. Proteome Res. in August. You can find the paper here if it is of interest: https://pubs.acs.org/doi/abs/10.1021/acs.jproteome.0c00407.

Best regards,

Anthony