I have 4C experiments to process. Naturally, I have MILLIONS of anchor zone reads, and computing the intersection is unbearably slow. Each replicate contains almost 15 million reads in total. After 4 hours, bedtools intersect has processed only 3.8 million reads, and the output file is growing VERY, VERY slowly.
The command line is:
I have paired-end (PE) reads, so I don't like the idea of converting everything to bedGraph and then finding the intersections in 5 seconds.
bedtools uses only 1 CPU core, although the computer has plenty of free cores.
Ubuntu 22.04 x64, SSD Samsung 980 Pro 1TB.
I don't need a 'true' intersection. I need the complete list of unmodified reads/alignments that intersect between the replicate files, so that I end up with a BAM file of the intersecting reads that I can process further with featureCounts. Are there alternatives to bedtools for this task?
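To make the requirement concrete, here is a minimal sketch of the filter I am after, written in plain Python on toy tuples rather than real BAM records. The function names (`build_index`, `overlapping_reads`), the tuple layout, and the half-open coordinates are illustrative assumptions, not anything taken from bedtools. Each mate is treated as an independent record here, so proper pair handling is left out.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(intervals):
    """Group (chrom, start, end) intervals by chromosome, sort and merge them."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    index = {}
    for chrom, ivs in by_chrom.items():
        ivs.sort()
        merged = []
        for start, end in ivs:
            if merged and start <= merged[-1][1]:
                # Overlapping or touching: extend the previous merged interval.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        # Parallel lists of starts and ends, both sorted because intervals are disjoint.
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def overlapping_reads(reads, index):
    """Yield reads unchanged if they overlap any indexed interval.

    Reads are hypothetical (name, chrom, start, end) tuples, half-open coordinates.
    """
    for read in reads:
        _name, chrom, start, end = read
        if chrom not in index:
            continue
        starts, ends = index[chrom]
        # Last merged interval that starts before the read ends.
        i = bisect_right(starts, end - 1) - 1
        if i >= 0 and ends[i] > start:
            yield read


# Toy data standing in for the two replicates.
replicate_b = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 60)]
reads_a = [
    ("r1", "chr1", 90, 110),
    ("r2", "chr1", 300, 350),
    ("r3", "chr2", 10, 50),
    ("r4", "chr2", 59, 80),
    ("r5", "chr3", 0, 10),
]

kept = list(overlapping_reads(reads_a, build_index(replicate_b)))
print([r[0] for r in kept])
```

The reads that pass the filter come out exactly as they went in, which is the behaviour I need; only the lookup is done on merged intervals, at one binary search per read.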