open2c / coolpuppy

A versatile tool to perform pile-up analysis on Hi-C data in .cool format.
MIT License

ValueError: operands could not be broadcast together with shapes (125,125) (127,127) #135

Closed: NMaziak closed this issue 5 months ago

NMaziak commented 10 months ago

Hello, thank you for creating such nice tools!

I recently wanted to create aggregate plots of different regions of the genome, but I keep getting an error that has been a bit hard to figure out. It could be something very obvious, and if so, I'm sorry for not catching it.

When I run this command:

coolpup.py cooler_files/late_100bp.mcool::resolutions/1600 regions_sorted_500.bed --features_format bed --outname test_1.clpy --nshifts 100 --by_distance 50000 100000 500000 1000000 --nproc 20 --log DEBUG

The error is the following:

DEBUG:coolpuppy:Namespace(by_distance=['50000', '100000', '500000', '1000000'], by_strand=False, by_window=False, clr_weight_name='weight', cool_path='cooler_files/late_100bp.mcool::resolutions/1600', coverage_norm='', expected=None, features='regions_sorted_500.bed', features_format='bed', flank=100000, flip_negative_strand=False, groupby=None, ignore_diags=2, ignore_group_order=False, local=False, logLevel='DEBUG', maxdist=None, maxshift=1000000, mindist=None, minshift=100000, n_proc=20, nshifts=100, ooe=True, outname='test_1.clpy', post_mortem=False, rescale=False, rescale_flank=1.0, rescale_size=99, seed=None, store_stripes=False, subset=0, trans=False, view=None)
DEBUG:coolpuppy:1.60% of intervals fall within the same bin as another interval. These are all included in the pileup.
DEBUG:coolpuppy:Loading data
DEBUG:coolpuppy:Loading data
DEBUG:coolpuppy:Loading data
DEBUG:coolpuppy:Loading data
DEBUG:coolpuppy:Loading data
DEBUG:coolpuppy:Loading data
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/linuxbrew/.linuxbrew/Cellar/python@3.8/3.8.16/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/linuxbrew/.linuxbrew/Cellar/python@3.8/3.8.16/lib/python3.8/multiprocessing/pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/home/research/MicroC_stages/aggregate_analysis/agreg_env/lib/python3.8/site-packages/coolpuppy/coolpup.py", line 1342, in pileup_region
    final = self.accumulate_stream(
  File "/home/research/MicroC_stages/aggregate_analysis/agreg_env/lib/python3.8/site-packages/coolpuppy/coolpup.py", line 1265, in accumulate_stream
    outdict["ROI"]["all"] = reduce(
  File "/home/research/MicroC_stages/aggregate_analysis/agreg_env/lib/python3.8/site-packages/coolpuppy/lib/puputils.py", line 100, in sum_pups
    "data": pup1["data"] + pup2["data"],
ValueError: operands could not be broadcast together with shapes (125,125) (127,127) 
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/research/MicroC_stages/aggregate_analysis/agreg_env/bin/coolpup.py", line 8, in <module>
    sys.exit(main())
  File "/home/research/MicroC_stages/aggregate_analysis/agreg_env/lib/python3.8/site-packages/coolpuppy/CLI.py", line 533, in main
    pups = pileup(
  File "/home/research/MicroC_stages/aggregate_analysis/agreg_env/lib/python3.8/site-packages/coolpuppy/coolpup.py", line 2212, in pileup
    pups = PU.pileupsByDistanceWithControl(
  File "/home/research/MicroC_stages/aggregate_analysis/agreg_env/lib/python3.8/site-packages/coolpuppy/coolpup.py", line 1768, in pileupsByDistanceWithControl
    normalized_pileups = self.pileupsWithControl(
  File "/home/research/MicroC_stages/aggregate_analysis/agreg_env/lib/python3.8/site-packages/coolpuppy/coolpup.py", line 1480, in pileupsWithControl
    pileups = list(p.starmap(f, zip(regions1, regions2)))
  File "/home/linuxbrew/.linuxbrew/Cellar/python@3.8/3.8.16/lib/python3.8/multiprocessing/pool.py", line 372, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "/home/linuxbrew/.linuxbrew/Cellar/python@3.8/3.8.16/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
ValueError: operands could not be broadcast together with shapes (125,125) (127,127) 

Would you happen to know what is going wrong? I'm working with version 1.1.0.
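
For reference, the line that fails in the traceback adds two NumPy arrays of different sizes. The sketch below reproduces that message without coolpuppy; `sum_pups_sketch` is a hypothetical stand-in, not the library's `sum_pups`. The closing arithmetic is only an observation: the flank of 100000 bp shown in the Namespace line is 62.5 bins at 1600 bp resolution, so rounding down or up would give windows of 125 or 127 bins. That coolpuppy sizes its windows as 2 * flank_bins + 1 is an assumption, and nothing in this thread confirms it as the cause.

```python
import math
from functools import reduce

import numpy as np


def sum_pups_sketch(pup1, pup2):
    """Hypothetical stand-in for the element-wise sum seen in the traceback."""
    return {"data": pup1["data"] + pup2["data"]}


# Two pileup-like dicts whose matrices differ in size, as in the error message.
pups = [{"data": np.zeros((125, 125))}, {"data": np.zeros((127, 127))}]

try:
    reduce(sum_pups_sketch, pups)
    message = ""
except ValueError as err:
    # NumPy refuses to add arrays whose shapes cannot be broadcast.
    message = str(err)
print(message)

# Values taken from the DEBUG output: flank=100000, resolution 1600.
flank, resolution = 100_000, 1600
bins_per_flank = flank / resolution  # 62.5, not a whole number of bins

# ASSUMPTION: window size = 2 * (bins per flank) + 1 centre bin.
size_if_rounded_down = 2 * math.floor(bins_per_flank) + 1
size_if_rounded_up = 2 * math.ceil(bins_per_flank) + 1
print(bins_per_flank, size_if_rounded_down, size_if_rounded_up)
```

If that assumption held, a flank that is a whole multiple of the bin size (99200 or 100800 at 1600 bp) would remove the rounding ambiguity. This was not tested here.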

Lastly, when I run the command without --by_distance, setting --mindist and --maxdist instead:

coolpup.py cooler_files/late_100bp.mcool::resolutions/1600 regions_sorted_500.bed --features_format bed --outname test_1.clpy --nshifts 100 --mindist 25000 --maxdist 1500000 --nproc 10 --log DEBUG

everything seems to work fine:

DEBUG:coolpuppy:1.60% of intervals fall within the same bin as another interval. These are all included in the pileup.
DEBUG:coolpuppy:Loading data
DEBUG:coolpuppy:Loading data
DEBUG:coolpuppy:Loading data
DEBUG:coolpuppy:Loading data
DEBUG:coolpuppy:Loading data
DEBUG:coolpuppy:Loading data
INFO:coolpuppy:('4', '4'): 35
INFO:coolpuppy:('X', 'X'): 252
INFO:coolpuppy:('2L', '2L'): 438
INFO:coolpuppy:('3L', '3L'): 493
INFO:coolpuppy:('2R', '2R'): 599
INFO:coolpuppy:('3R', '3R'): 1015
INFO:coolpuppy:Total number of piled up windows: 2832
INFO:coolpuppy:Saved output to test_1.clpy

I appreciate any help, and thank you again!

Noura

efriman commented 10 months ago

Hi Noura,

That's strange, I'm not sure why you get this error. Do you mind sharing the region file and which dm assembly you're using? I can try to see if I can reproduce it.

Elias

NMaziak commented 10 months ago

Hi Elias,

Of course, attached is the regions bed file. The genome is dm6 (ncbi). Let me know if any other information is needed.

regions_sorted_500.txt

Best, Noura

efriman commented 10 months ago

There's nothing obviously wrong with your file and I don't get the same error. I can only think it has something to do with multiprocessing. Can you try lowering --nproc and --nshifts and see when the problem arises?
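
One way to narrow this down is to rerun the failing command over a small grid of --nproc and --nshifts values and note which combinations fail. The sketch below only builds the command lines from the command posted above. The grid values and output file names are hypothetical, and `run_sweep` is an optional helper that assumes `coolpup.py` is on the PATH.

```python
import itertools
import shlex
import subprocess

# Fixed part of the failing command, copied from the report above.
BASE = [
    "coolpup.py",
    "cooler_files/late_100bp.mcool::resolutions/1600",
    "regions_sorted_500.bed",
    "--features_format", "bed",
    "--by_distance", "50000", "100000", "500000", "1000000",
    "--log", "DEBUG",
]


def build_commands(nprocs=(1, 2, 5, 10, 20), nshifts=(1, 10, 100)):
    """Return one argument list per (nproc, nshifts) combination."""
    commands = []
    for nproc, nshift in itertools.product(nprocs, nshifts):
        # Hypothetical output name so that runs do not overwrite each other.
        outname = f"sweep_nproc{nproc}_nshifts{nshift}.clpy"
        commands.append(
            BASE
            + ["--outname", outname, "--nproc", str(nproc), "--nshifts", str(nshift)]
        )
    return commands


def run_sweep(commands):
    """Run each command and collect (command line, exit code) pairs."""
    results = []
    for cmd in commands:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results.append((shlex.join(cmd), proc.returncode))
    return results


commands = build_commands()
print(shlex.join(commands[0]))
```

Calling `run_sweep(commands)` would then show the exit code of each combination; a non-zero code marks a failing run.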

NMaziak commented 5 months ago

Hi, I just wanted to say that this got resolved once we changed clusters. I can't really tell you why, since I used the same environment, but thanks again for the help!