hartwigmedical / hmftools

Various algorithms for analysing genomics data
GNU General Public License v3.0

Hotspot VCF - transcript as null #347

Closed dmaziec closed 1 year ago

dmaziec commented 1 year ago

Hello,

Thank you for your toolkit and contribution to cancer research.

I would like to ask about the hotspot VCF used by HMF (KnownHotspots.38.vcf.gz). In the INFO column, there is a field named input that stores the gene, transcript and protein mutation. For some variants, the transcript is given as null.

Example: chr1 114713941 . G A . . input=NRAS|null|p.T50I;sources=vicc_cgi
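For reference, the example line can be unpacked as in the minimal Python sketch below. It assumes the line is tab-separated in the actual file and that input holds exactly one gene|transcript|protein triple, as in the example; the function name parse_hotspot_line is hypothetical and not part of hmftools.

```python
def parse_hotspot_line(line: str) -> dict:
    """Unpack one VCF data line, including the 'input' INFO entry.

    Assumption: 'input' holds a single gene|transcript|protein triple,
    as in the example from this thread.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt = fields[:5]

    # INFO is the 8th column: semicolon-separated key=value entries.
    info = dict(
        entry.split("=", 1) for entry in fields[7].split(";") if "=" in entry
    )
    gene, transcript, protein = info["input"].split("|")

    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alt": alt,
        "gene": gene,
        # The literal string "null" is mapped to None.
        "transcript": None if transcript == "null" else transcript,
        "protein": protein,
        "sources": [s for s in info.get("sources", "").split(",") if s],
    }


record = parse_hotspot_line(
    "chr1\t114713941\t.\tG\tA\t.\t.\tinput=NRAS|null|p.T50I;sources=vicc_cgi"
)
print(record["gene"], record["transcript"], record["protein"])
```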

Could you please explain what this means and how the value should be interpreted?

Thank you. All the best, Dominika

p-priestley commented 1 year ago

Hi Dominika,

To match hotspots in our pipeline we use chromosome, position, ref & alt. So the transcript and position are for informational purposes only. Hotspot annotation primarily affects calling sensitivity in SAGE and driver annotation in PURPLE.
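As a rough illustration of matching on those four fields only, here is a minimal Python sketch. It is not the SAGE or PURPLE implementation; the names hotspot_key and is_hotspot are hypothetical, and the single hotspot entry is taken from the example earlier in this thread.

```python
def hotspot_key(chrom: str, pos: int, ref: str, alt: str) -> tuple:
    """Build the lookup key; gene, transcript and protein are not part of it."""
    return (chrom, int(pos), ref, alt)


# Hypothetical in-memory hotspot set holding the one example variant.
hotspots = {hotspot_key("chr1", 114713941, "G", "A")}


def is_hotspot(chrom: str, pos: int, ref: str, alt: str) -> bool:
    """Return True when the variant matches a known hotspot exactly."""
    return hotspot_key(chrom, pos, ref, alt) in hotspots


print(is_hotspot("chr1", 114713941, "G", "A"))
print(is_hotspot("chr1", 114713941, "G", "T"))
```

Because the key ignores the input annotation entirely, a null transcript in that field has no effect on whether a variant matches.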

I believe the null transcript itself was generated by one of the import processes, which came from vicc, but this is not relevant to our use of the data. We use PAVE (https://github.com/hartwigmedical/hmftools/tree/master/pave) to annotate transcripts and effects.

Peter

dmaziec commented 1 year ago

Hi Peter,

Thank you for your prompt response.

Best wishes, Dominika