griffithlab / pVACtools

http://www.pvactools.org
BSD 3-Clause Clear License

Adding Gene and Transcript Expression to VCF for pVACseq #1159

Open biounix opened 5 days ago

biounix commented 5 days ago

Hi,

If it’s possible to add both gene and transcript expression annotations to the input VCF file for pVACseq (GX and TX fields with vcf-expression-annotator), is it advisable to include both? Or is it better to include just one of them? If the latter, which one would you recommend prioritizing? My guess would be transcript expression since it seems more specific.

Additionally, if both gene and transcript expressions are included, how does pVACseq handle or prioritize this information during processing?

I couldn’t find documentation on this, so apologies if this has been answered elsewhere.

Thank you for your time and for developing such an invaluable toolset!

chrisamiller commented 5 days ago

pVACtools doesn't use the GX/TX information for anything in the binding predictions, etc. It is passed through to the aggregate report table, where it is one of the factors used in determining the neoantigen's "Tier". It is also exposed in the pVACview interface, so that you can use that information when prioritizing peptides. To answer your question: if you have both GX and TX, there's generally no reason not to include both, because both are useful to consider when prioritizing peptides. For example, even if this particular transcript isn't well expressed, is the same short peptide part of one of the other transcripts from this well-expressed gene?

biounix commented 5 days ago

Great! Thanks for the answer.

susannasiebert commented 5 days ago

To add to @chrisamiller's answer: we do use both the TX and GX information in the coverage filter (both use the same --expn-val cutoff). This affects which neoantigen candidates end up in the .filtered.tsv. For the aggregated report, gene expression is used for the tiering as part of the Allele Expr value, which is gene expression * tumor RNA VAF. You can find more information about the tiers in the aggregate report on this page of our docs.
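As a rough illustration of the two uses described in this comment, here is a minimal Python sketch. It is not pVACtools code: the function names, the `>=` comparison, the requirement that both values meet the cutoff, and the example numbers are all assumptions made for illustration. Only the formula (gene expression * tumor RNA VAF) and the shared cutoff come from the comment itself.

```python
from typing import Optional


def allele_expression(gene_expr: Optional[float],
                      rna_vaf: Optional[float]) -> Optional[float]:
    """Allele Expr as described in the thread: gene expression * tumor RNA VAF.

    Returns None (standing in for NA) when either input is missing.
    """
    if gene_expr is None or rna_vaf is None:
        return None
    return gene_expr * rna_vaf


def passes_expression_cutoff(gene_expr: float,
                             transcript_expr: float,
                             expn_val: float) -> bool:
    """Hypothetical coverage-filter check using one shared cutoff.

    Assumption: both values must be at or above the cutoff; the real
    filter's exact comparison and NA handling may differ.
    """
    return gene_expr >= expn_val and transcript_expr >= expn_val


# Illustrative numbers only.
print(allele_expression(20.0, 0.25))             # gene expression * RNA VAF
print(passes_expression_cutoff(5.0, 0.5, 1.0))   # transcript below the cutoff
```

In this sketch a highly expressed gene with a poorly expressed transcript fails the filter because the same cutoff applies to both values, which is one reason to supply both GX and TX.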

biounix commented 5 days ago

Thanks for your explanation, @susannasiebert!

So, if expression values are present but the tumor RNA VAF is missing (NA), I understand that the expression criterion for tiering is not applied. Is that right?

susannasiebert commented 5 days ago

Correct, it would "auto-pass" that specific criterion and only consider the other criteria for each tier.
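The "auto-pass" behavior can be pictured with a small Python sketch. This is an illustration under stated assumptions, not the pVACtools implementation: `expression_criterion_passes`, `meets_tier`, the `>=` comparison, and the cutoff value are hypothetical, and the other tier criteria are reduced to plain booleans.

```python
from typing import Optional, Sequence


def expression_criterion_passes(allele_expr: Optional[float],
                                cutoff: float) -> bool:
    """Expression check for a tier, with None standing in for NA."""
    if allele_expr is None:
        # Missing value: the criterion is skipped ("auto-pass").
        return True
    # Assumption: at-or-above comparison; the real rule may differ.
    return allele_expr >= cutoff


def meets_tier(allele_expr: Optional[float],
               other_criteria: Sequence[bool],
               cutoff: float) -> bool:
    """A candidate meets the tier if every applicable criterion passes."""
    return expression_criterion_passes(allele_expr, cutoff) and all(other_criteria)


# With Allele Expr missing, only the other criteria decide the outcome.
print(meets_tier(None, [True, True], cutoff=3.0))
print(meets_tier(None, [True, False], cutoff=3.0))
```

The point of the sketch is that a missing value neither helps nor hurts a candidate: the tier is decided entirely by the remaining criteria.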