Are there any means to speed up intersect?

I have 4C experiments to process, so naturally I have MILLIONS of anchor-zone reads, and the intersection step is unbearably slow. Each replicate contains almost 15 million reads in total. After 4 hours, bedtools intersect has processed only 3.8 million reads, and the output file is growing VERY, VERY slowly. The command line is:

bedtools intersect -bed -iobuf 100G -sorted -wa -wb -u -g hg38.genome.txt -a rep1.bam -b rep2.bam > intersect.bed

My reads are paired-end, so I don't like the idea of converting everything to bedGraph and then finding the intersections in 5 seconds. bedtools uses only 1 CPU core, although the computer has plenty of free cores. The system is Ubuntu 22.04 x64 with a Samsung 980 Pro 1TB SSD. I don't need a 'true' intersection; I need the complete list of unmodified reads/alignments that intersect between the replicate files, so that I can obtain a BAM file of the intersecting reads and process it further with featureCounts. Are there alternatives to bedtools for this task?

arq5x / bedtools2, issue #1091