open2c / cooler

A cool place to store your Hi-C
https://open2c.github.io/cooler
BSD 3-Clause "New" or "Revised" License

Can't create cooler at super high resolution #286

Open Phlya opened 1 year ago

Phlya commented 1 year ago

I tried to make an mcool'er for the mouse genome at 10 bp resolution. I get this error, which indicates an integer overflow:

    File "/tungstenfs/scratch/ggiorget/ilya/condaenvs/distiller-env/lib/python3.7/site-packages/cooler/reduce.py", line 851, in zoomify_cooler
      **kwargs
    File "/tungstenfs/scratch/ggiorget/ilya/condaenvs/distiller-env/lib/python3.7/site-packages/cooler/reduce.py", line 727, in coarsen_cooler
      **kwargs
    File "/tungstenfs/scratch/ggiorget/ilya/condaenvs/distiller-env/lib/python3.7/site-packages/cooler/create/_create.py", line 644, in create
      file_path, target, meta.columns, iterable, h5opts, lock
    File "/tungstenfs/scratch/ggiorget/ilya/condaenvs/distiller-env/lib/python3.7/site-packages/cooler/create/_create.py", line 214, in write_pixels
      for i, chunk in enumerate(iterable):
    File "/tungstenfs/scratch/ggiorget/ilya/condaenvs/distiller-env/lib/python3.7/site-packages/cooler/reduce.py", line 616, in __iter__
      results = self._map(self.aggregate, spans[i : i + batchsize])
    File "/tungstenfs/scratch/ggiorget/ilya/condaenvs/distiller-env/lib/python3.7/site-packages/multiprocess/pool.py", line 268, in map
      return self._map_async(func, iterable, mapstar, chunksize).get()
    File "/tungstenfs/scratch/ggiorget/ilya/condaenvs/distiller-env/lib/python3.7/site-packages/multiprocess/pool.py", line 657, in get
      raise self._value
    File "/tungstenfs/scratch/ggiorget/ilya/condaenvs/distiller-env/lib/python3.7/site-packages/multiprocess/pool.py", line 431, in _handle_tasks
      put(task)
    File "/tungstenfs/scratch/ggiorget/ilya/condaenvs/distiller-env/lib/python3.7/site-packages/multiprocess/connection.py", line 209, in send
      self._send_bytes(_ForkingPickler.dumps(obj))
    File "/tungstenfs/scratch/ggiorget/ilya/condaenvs/distiller-env/lib/python3.7/site-packages/multiprocess/connection.py", line 396, in _send_bytes
      header = struct.pack("!i", n)
    struct.error: 'i' format requires -2147483648 <= number <= 2147483647

nvictus commented 1 year ago

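The error is coming from multiprocessing/pickle and doesn't seem to be an encoding issue for storage: per the traceback, the pickled task sent to a worker is longer than 2147483647 bytes, so its length no longer fits in the signed 32-bit header packed by `struct.pack("!i", n)`. It might be resolvable with a smaller chunk size, or with a newer Python version where multiprocessing uses pickle protocol 4 instead of 3.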
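To make the failure mode concrete, here is a minimal sketch that reproduces the same `struct.error` without cooler or multiprocessing. It only reuses the `"!i"` format string visible in the last traceback frame. The names `INT32_MAX` and `fits_in_int32_header` are hypothetical helpers for illustration, not part of cooler or the standard library.

```python
import struct

# Largest value a signed 32-bit integer can hold; this is the upper bound
# quoted in the struct.error message above.
INT32_MAX = 2**31 - 1


def fits_in_int32_header(n: int) -> bool:
    """Return True if a payload length n can be packed with the "!i" format.

    "!i" is a big-endian signed 32-bit integer, the format used for the
    length header in the traceback's final frame.
    """
    try:
        struct.pack("!i", n)
    except struct.error:
        return False
    return True


if __name__ == "__main__":
    print(fits_in_int32_header(INT32_MAX))      # length just under 2 GiB
    print(fits_in_int32_header(INT32_MAX + 1))  # one byte too many
```

If this reading is right, anything that keeps each pickled task below that limit (for example, a smaller chunk size, as suggested above) should avoid the error.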