griffithlab / pVACtools

http://www.pvactools.org
BSD 3-Clause Clear License

Prefilter pVACseq and pVACsplice on a user-defined list of biotypes #1090

Open susannasiebert opened 3 months ago

susannasiebert commented 3 months ago

Currently, pVACseq does not prefilter transcripts on their biotype. Instead, we prioritize protein_coding transcripts when picking the best peptide and transcript during aggregate report creation.

In pVACsplice, by contrast, we apply a hard prefilter that keeps only protein_coding transcripts.

Instead, we should define a new parameter --biotypes with a default of ['protein_coding'] that applies a prefilter on transcript biotypes in both pVACseq and pVACsplice. The aggregate report will continue to prioritize protein_coding transcripts when selecting the best peptide and transcript. The docs should explain that "a few other, more speculative types (e.g. non_stop_decay and nonsense_mediated_decay) can give rise to neoantigens; if users want to include them, they can specify their own list of biotypes".
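
To make the proposal concrete, here is a minimal sketch of how such a prefilter could behave. It is not the pVACtools implementation: the comma-separated argument format, the helper names (`biotype_list`, `build_parser`, `prefilter_by_biotype`), and the `biotype` key on each transcript record are all assumptions made for illustration. Only the `--biotypes` name and its `['protein_coding']` default come from the proposal above.

```python
import argparse


def biotype_list(value):
    """Parse a comma-separated string into a list of biotype names.

    Assumption: the list is passed as one comma-separated value.
    """
    biotypes = [item.strip() for item in value.split(",") if item.strip()]
    if not biotypes:
        raise argparse.ArgumentTypeError("at least one biotype is required")
    return biotypes


def build_parser():
    """Build a toy parser exposing only the proposed --biotypes option."""
    parser = argparse.ArgumentParser(description="biotype prefilter sketch")
    parser.add_argument(
        "--biotypes",
        type=biotype_list,
        default=["protein_coding"],
        help="Comma-separated transcript biotypes to keep (hypothetical format).",
    )
    return parser


def prefilter_by_biotype(transcripts, biotypes):
    """Keep only transcripts whose biotype is in the allowed list.

    Assumption: each transcript is a dict with a 'biotype' key.
    """
    allowed = set(biotypes)
    return [t for t in transcripts if t.get("biotype") in allowed]


# Made-up transcript records, for illustration only.
transcripts = [
    {"transcript": "T1", "biotype": "protein_coding"},
    {"transcript": "T2", "biotype": "nonsense_mediated_decay"},
    {"transcript": "T3", "biotype": "lncRNA"},
]

# Default behaviour: only protein_coding transcripts pass the prefilter.
default_args = build_parser().parse_args([])
default_kept = prefilter_by_biotype(transcripts, default_args.biotypes)

# User-defined list: also keep a more speculative biotype.
custom_args = build_parser().parse_args(
    ["--biotypes", "protein_coding,nonsense_mediated_decay"]
)
custom_kept = prefilter_by_biotype(transcripts, custom_args.biotypes)

print([t["transcript"] for t in default_kept])
print([t["transcript"] for t in custom_kept])
```

Sharing one filter function between pVACseq and pVACsplice would keep the two tools consistent. The aggregate report's protein_coding prioritization would stay a separate, later step.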