arq5x / bedtools

A powerful toolset for genome arithmetic.
http://code.google.com/p/bedtools/
GNU General Public License v2.0

Multiinter for BEDPE files? #168

Open alistairhockey opened 1 year ago

alistairhockey commented 1 year ago

Hi there,

I want to find intersecting paired regions across a number of BEDPE files, similar to what multiinter does for BED files.

Say we have three BEDPE files:

file 1:
chr1 324 330 chr2 500 560 idA
chr1 424 500 chr3 200 260 idA
chr1 505 550 chr2 400 490 idA

file 2:
chr1 320 330 chr2 505 560 idB
chr1 420 480 chr3 220 260 idB
chr1 800 880 chr2 100 120 idB

file 3:
chr1 325 330 chr2 540 560 idC
chr1 120 180 chr3 20 40 idC
chr1 505 550 chr2 400 490 idC

Is there a way to merge the pairs (lines) that intersect at both ends, reporting a count and the ID names of the contributing files? By "intersect at both ends" I mean: $1_1 = $1_2, $2_1 - $3_1 intersects $2_2 - $3_2, $4_1 = $4_2, and $5_1 - $6_1 intersects $5_2 - $6_2, where the subscript is the file number. Pairs that are unique to one file should also be reported in the same format.

I want the final file to look like:

chrom1  start1  end1    chrom2  start2  end2    ids
chr1    120     180     chr3    20      40      idC
chr1    320     330     chr2    500     560     idA;idB;idC
chr1    420     500     chr3    200     260     idA;idB
chr1    505     550     chr2    400     490     idA;idC
chr1    800     880     chr2    100     120     idB
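
To make the rule concrete, here is a minimal Python sketch of the merge I have in mind. It is not a bedtools feature, and the function names `parse_bedpe`, `both_ends_overlap` and `merge_bedpe` are made up for illustration. It assumes half-open BED-style coordinates, seven whitespace-separated columns with the ID in column 7, and that both ends are listed in the same order in every file. Pairs are grouped transitively (if A overlaps B and B overlaps C, all three are merged), each merged end is the union of the overlapping intervals, and a count of distinct IDs is added before the ID list. The all-against-all comparison is quadratic, so it only suits small files.

```python
from collections import defaultdict


def parse_bedpe(lines):
    """Yield (chrom1, start1, end1, chrom2, start2, end2, id) from 7-column lines."""
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or incomplete lines
        c1, s1, e1, c2, s2, e2, name = fields[:7]
        yield (c1, int(s1), int(e1), c2, int(s2), int(e2), name)


def both_ends_overlap(a, b):
    """True if two records share both chromosomes and overlap at both ends.

    Assumes half-open intervals, so touching intervals do not overlap.
    """
    return (a[0] == b[0] and a[3] == b[3]
            and a[1] < b[2] and b[1] < a[2]
            and a[4] < b[5] and b[4] < a[5])


def merge_bedpe(records):
    """Group records that overlap at both ends and merge each group."""
    parent = list(range(len(records)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of records that overlaps at both ends (O(n^2)).
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if both_ends_overlap(records[i], records[j]):
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i, rec in enumerate(records):
        groups[find(i)].append(rec)

    merged = []
    for group in groups.values():
        ids = sorted({r[6] for r in group})
        merged.append((
            group[0][0], min(r[1] for r in group), max(r[2] for r in group),
            group[0][3], min(r[4] for r in group), max(r[5] for r in group),
            len(ids), ";".join(ids),
        ))
    return sorted(merged)
```

The records from all files would be read into one list with `parse_bedpe` and passed to `merge_bedpe`, which returns one tuple per merged pair, sorted by the first chromosome and start position.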

I have scoured the web for a solution without success, and unfortunately I am not experienced enough to write something myself.

Cheers!