hartwigmedical / hmftools

Various algorithms for analysing genomics data
GNU General Public License v3.0

LINX output files: how to obtain breakpoint information for each gene of a fusion? #359

Closed FrancescaMiccolis closed 1 year ago

FrancescaMiccolis commented 1 year ago

Hi. I want to use hmftools to analyse tumour and normal samples and prioritise fusions by oncogenic probability. The prioritisation tools I am using require, as mandatory input, the breakpoint coordinates for each gene of each fusion. I tried to derive this information from the fields in the LINX outputs, but I am not sure it can be done. Is there a way to obtain it directly from one of the outputs, or to derive a stable breakpoint value from that data? I considered using the chromosome band, but that gives only an approximate range rather than an exact position, which is not what I need. Thanks for your attention; I look forward to your answer.

p-priestley commented 1 year ago

Hello,

There is a 3-step process to link back to VCF coordinates:

  1. Link the fusion file to the breakend file using 'fivePrimebreakendId'/'threePrimebreakendId'
  2. Link the breakend file to the sv file using 'svId'
  3. Link the sv file to the vcf using 'vcfId'
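The three steps above can be sketched in Python as a chain of lookups. This is only an illustrative sketch, not part of hmftools: the helper names (`read_tsv`, `index_by`, `fusion_vcf_ids`, `vcf_positions`) are hypothetical, the breakend key column `id` and the fusion `name` column are assumptions, and the ID column names are copied as written above, so check them (including capitalisation) against the headers of your own LINX files. The only VCF fact relied on is that the first three columns of a record are CHROM, POS and ID.

```python
import csv
from typing import Dict, Iterable, List, Tuple


def read_tsv(path: str) -> List[Dict[str, str]]:
    """Load a tab-separated file into a list of row dictionaries."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def index_by(rows: Iterable[Dict[str, str]], key: str) -> Dict[str, Dict[str, str]]:
    """Build a lookup table from the value in column `key` to its row."""
    return {row[key]: row for row in rows}


def fusion_vcf_ids(fusions, breakends, svs,
                   five_col="fivePrimebreakendId",    # as written in the reply
                   three_col="threePrimebreakendId",  # as written in the reply
                   breakend_id_col="id",              # assumed key column
                   sv_id_col="svId",
                   vcf_id_col="vcfId"):
    """Follow fusion -> breakend -> sv and return the vcfId for each side."""
    breakend_by_id = index_by(breakends, breakend_id_col)
    sv_by_id = index_by(svs, sv_id_col)
    results = []
    for fusion in fusions:
        entry = {"name": fusion.get("name", "")}  # 'name' column is assumed
        for side, col in (("fivePrime", five_col), ("threePrime", three_col)):
            breakend = breakend_by_id[fusion[col]]   # step 1: fusion -> breakend
            sv = sv_by_id[breakend[sv_id_col]]       # step 2: breakend -> sv
            entry[side + "VcfId"] = sv[vcf_id_col]   # step 3: sv -> vcf record ID
        results.append(entry)
    return results


def vcf_positions(lines: Iterable[str]) -> Dict[str, Tuple[str, int]]:
    """Map each VCF record ID to its (CHROM, POS), skipping header lines."""
    positions = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, vcf_id = line.rstrip("\n").split("\t")[:3]
        positions[vcf_id] = (chrom, int(pos))
    return positions
```

With the three LINX tables loaded through `read_tsv` and the lines of the (uncompressed) VCF passed to `vcf_positions`, the coordinate for each side of a fusion is `positions[entry["fivePrimeVcfId"]]` and `positions[entry["threePrimeVcfId"]]`.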

Let me know if you have further questions.