rdocking / fusebench

A workbench for aggregation and interpretation of RNA-Seq gene fusions

Create merged annotation source #57

Closed rdocking closed 6 years ago

rdocking commented 6 years ago

From the existing BEDPE-formatted annotation files, create a new BEDPE-formatted output file that indicates presence/absence for a given fusion in a given annotation source.

E.g., given, say, TCGA and CIViC annotation files, produce a file that looks like either:

chrom1	start1	end1	chrom2	start2	end2	name	score	strand1	strand2	orient1	orient2	tcga	civic
chr1	1	2	chr2	1	2	FOO-BAR	100	+	+	+	+	1	0

Or:

chrom1	start1	end1	chrom2	start2	end2	name	score	strand1	strand2	orient1	orient2	anno_sources
chr1	1	2	chr2	1	2	FOO-BAR	100	+	+	+	+	tcga,civic
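
A minimal Python sketch of how both layouts could be produced from any number of annotation sources. The function name `merge_presence` and the in-memory row format are hypothetical, not part of fusebench, and the sketch assumes a fusion counts as "the same" only when all twelve BEDPE columns match exactly (so differing scores or coordinates would yield separate rows).

```python
from collections import defaultdict


def merge_presence(sources):
    """Combine BEDPE rows from several annotation sources.

    `sources` maps a source name (e.g. "tcga") to an iterable of BEDPE rows,
    each row being a list of 12 column strings. Assumption: rows are
    considered identical only if every column matches.
    """
    names = list(sources)          # column order follows insertion order
    seen = defaultdict(set)        # row -> set of sources containing it
    for src, rows in sources.items():
        for row in rows:
            seen[tuple(row)].add(src)

    # Layout 1: one 0/1 column per annotation source.
    wide = [list(row) + ["1" if n in hits else "0" for n in names]
            for row, hits in seen.items()]
    # Layout 2: a single comma-separated anno_sources column.
    compact = [list(row) + [",".join(n for n in names if n in hits)]
               for row, hits in seen.items()]
    return names, wide, compact


row = "chr1 1 2 chr2 1 2 FOO-BAR 100 + + + +".split()
names, wide, compact = merge_presence({"tcga": [row], "civic": []})
print("\t".join(wide[0]))
```

Because the sources are passed as a dictionary, the same call handles two files or twenty.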

wilcas commented 6 years ago

Partially done via merge_annotations.sh. However, this solution only works pairwise (i.e., two BEDPE files at a time). Moreover, if overlapping intervals exist, there is no function to merge them into a single set of merged coordinates (à la bedtools merge for single-interval BED files).
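
For reference, the single-interval merging behaviour mentioned above can be sketched in a few lines of Python. This is only an illustration of the idea, not what merge_annotations.sh does: `merge_intervals` is a hypothetical helper, it handles one interval per record, and a BEDPE version would additionally need both ends of a pair to overlap before merging.

```python
def merge_intervals(intervals):
    """Merge overlapping or book-ended (chrom, start, end) intervals.

    Sketch of the default `bedtools merge` behaviour for plain BED records:
    sort by chromosome and start, then extend the current interval while the
    next one starts at or before its end.
    """
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(m) for m in merged]


example = [("chr1", 5, 10), ("chr1", 1, 6), ("chr2", 1, 2),
           ("chr1", 10, 12), ("chr1", 20, 30)]
print(merge_intervals(example))
```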