alistairhockey opened this issue 10 months ago
That is a lot of BED files! The memory issue is a known problem with bedtools/pybedtools on large datasets. Can you try the latest versions of bedtools and pybedtools? Also, try sorting your BED files with 'bedtools sort' before running 'intervene upset'.
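For reference, the ordering 'bedtools sort' produces is a coordinate sort by chromosome, then start position. A minimal stdlib stand-in (illustrative only; in practice you would run 'bedtools sort -i in.bed > in.sorted.bed'):

```python
# Minimal sketch of what "bedtools sort" does to a BED file:
# sort records by chromosome, then start, then end coordinate.
# This stand-in only illustrates the expected ordering; it is not
# a replacement for bedtools on real data.

def sort_bed_lines(lines):
    """Sort BED records (chrom, start, end, ...) coordinate-wise."""
    records = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
    records.sort(key=lambda r: (r[0], int(r[1]), int(r[2])))
    return ["\t".join(r) for r in records]

bed = ["chr2\t100\t200\n", "chr1\t500\t600\n", "chr1\t50\t150\n"]
print(sort_bed_lines(bed))
# ['chr1\t50\t150', 'chr1\t500\t600', 'chr2\t100\t200']
```

Sorted input lets interval tools stream through files instead of holding everything in memory, which is why it can help here.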
We're aiming for a parallel processing option in the upcoming version of Intervene!
I haven't had any issues with 'bedtools multiinter', but maybe the sorting has played a part in that. Also, have you considered adding an option for BEDPE files? I would be interested to see whether you could modify the script to use 'pairToPair' in place of 'intersect' to get all the BEDPE paired-region combinations.
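To make the suggestion concrete, here is a rough sketch of the 'pairToPair' matching rule for BEDPE (with '-type both' semantics: two paired records match only when both ends overlap the corresponding ends). The function names and record layout are illustrative assumptions, not Intervene's or bedtools' API:

```python
# Hypothetical sketch of "pairToPair -type both" semantics for BEDPE.
# A BEDPE record is modeled as a pair of 1-D intervals (end1, end2),
# each end being (chrom, start, end). Names here are illustrative.

def overlaps(a, b):
    """Half-open 1-D interval overlap on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def pair_to_pair(a, b):
    """Two BEDPE records match when BOTH ends overlap pairwise."""
    return overlaps(a[0], b[0]) and overlaps(a[1], b[1])

rec_a = (("chr1", 100, 200), ("chr5", 900, 1000))
rec_b = (("chr1", 150, 250), ("chr5", 950, 1100))
print(pair_to_pair(rec_a, rec_b))  # True: both ends overlap
```

Swapping 'intersect' for 'pairToPair' in the script would mean applying this two-ended test per file pair instead of the single-interval one.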
Hi there,
I am trying to run 'intervene upset' on 73 BED files that have ~40,000 intervals each:
intervene upset -i /data/alistairh/projects/SV_calling/data/peaks/DiscRegions/*{1,2,3}.bed --output SV_calling/data/peaks/DiscRegions/results_RT --save-overlaps
However, Intervene uses up all of the available memory (62 GB) before being killed by the server. Is there a setting or a fix to limit Intervene's memory use so it doesn't get killed? This hasn't been a problem before when I have used Intervene on 15-20 BED files.