higlass / clodius

Clodius is a tool for breaking up large data sets into smaller tiles that can subsequently be displayed using an appropriate viewer.
MIT License

clodius aggregate bigwig throws "RuntimeError: Unable to create attribute (Object header message is too large)" for large genomes #12

Open golobor opened 7 years ago

golobor commented 7 years ago

I am running clodius aggregate on a bigwig that was mapped to a poorly assembled genome with 23k contigs. I intend to show fewer than 40 chromosomes, so I created a negspy genome using the reduced set of chromsizes. However, aggregate still writes the full chromsizes from the bigwig header into the hitile, which overflows the limit set by HDF5 (see the error below, raised at aggregate.py:542). I believe this error could be fixed by writing the chromsizes from the negspy genome instead.

me@mymachine:$ clodius aggregate bigwig -a bigGenomeReducedTo40Chroms data.fc.signal.bw

Traceback (most recent call last):
  File "/home/golobor/miniconda3/bin/clodius", line 11, in <module>
    sys.exit(cli())
  File "/home/golobor/miniconda3/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/golobor/miniconda3/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/golobor/miniconda3/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/golobor/miniconda3/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/golobor/miniconda3/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/golobor/miniconda3/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home/golobor/miniconda3/lib/python3.6/site-packages/clodius/cli/aggregate.py", line 1026, in bigwig
    _bigwig(filepath, chunk_size, zoom_step, tile_size, output_file, assembly, chromosome)
  File "/home/golobor/miniconda3/lib/python3.6/site-packages/clodius/cli/aggregate.py", line 542, in _bigwig
    d.attrs['chrom-names'] = [s.encode('utf-8') for s in bwf.chroms().keys()]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028290543/work/h5py/_objects.c:2846)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028290543/work/h5py/_objects.c:2804)
  File "/home/golobor/miniconda3/lib/python3.6/site-packages/h5py/_hl/attrs.py", line 93, in __setitem__
    self.create(name, data=value, dtype=base.guess_dtype(value))
  File "/home/golobor/miniconda3/lib/python3.6/site-packages/h5py/_hl/attrs.py", line 188, in create
    attr = h5a.create(self._id, self._e(tempname), htype, space)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028290543/work/h5py/_objects.c:2846)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028290543/work/h5py/_objects.c:2804)
  File "h5py/h5a.pyx", line 47, in h5py.h5a.create (/home/ilan/minonda/conda-bld/h5py_1490028290543/work/h5py/h5a.c:2075)
RuntimeError: Unable to create attribute (Object header message is too large)
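For a sense of scale, here is a rough, stdlib-only size check. It assumes the list of encoded names ends up as a fixed-width byte-string array, so the payload is about (number of names) x (longest name), and it assumes the roughly 64 KiB ceiling that HDF5 places on attributes stored in the object header. The contig names and the helper functions are made up for illustration; they are not part of clodius or h5py.

```python
# Rough size check for an HDF5 attribute holding chromosome names.
# Assumption: names are stored as a fixed-width byte-string array,
# so the payload is roughly (number of names) x (longest name in bytes).

HDF5_ATTR_LIMIT = 64 * 1024  # assumed ceiling for object-header attributes, in bytes


def estimated_attr_bytes(names):
    """Estimate the payload size of a fixed-width string array of names."""
    encoded = [n.encode("utf-8") for n in names]
    if not encoded:
        return 0
    return len(encoded) * max(len(e) for e in encoded)


def fits_in_attribute(names, limit=HDF5_ATTR_LIMIT):
    """Return True if the estimated payload stays under the assumed limit."""
    return estimated_attr_bytes(names) < limit


# Hypothetical contig names, only to illustrate the scale of the problem
all_contigs = ["scaffold_%05d" % i for i in range(23000)]
reduced = ["chr%d" % i for i in range(1, 40)]

print(estimated_attr_bytes(all_contigs), fits_in_attribute(all_contigs))
print(estimated_attr_bytes(reduced), fits_in_attribute(reduced))
```

Under these assumptions, 23k names of 14 bytes each come to several times the limit, while a reduced set of a few dozen chromosomes is far below it.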

golobor commented 7 years ago

The following solution seems to work: in aggregate.py:542-543, replace

d.attrs['chrom-names'] = [s.encode('utf-8') for s in bwf.chroms().keys()]
d.attrs['chrom-sizes'] = list(bwf.chroms().values())

with

d.attrs['chrom-names'] = [s.encode('utf-8') for s in nc.get_chromorder(assembly)]
d.attrs['chrom-sizes'] = [chrom_info.cum_chrom_lengths[s] for s in nc.get_chromorder(assembly)]
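The idea of writing only the reduced chromosome set can be sketched in isolation, without h5py or negspy. The helper `select_chroms` and the sample data are hypothetical. Unlike the replacement suggested in this comment, the sketch takes per-chromosome sizes from the bigwig header dict (the mapping that bwf.chroms() returns) rather than from chrom_info; which of the two the hitile format expects is not settled in this thread.

```python
def select_chroms(bigwig_chroms, chrom_order):
    """Return (names, sizes) restricted to the chromosomes in chrom_order.

    bigwig_chroms: dict of name -> size, as read from the bigwig header.
    chrom_order:   list of names from the reduced assembly, in display order.
    """
    missing = [c for c in chrom_order if c not in bigwig_chroms]
    if missing:
        # Fail early rather than writing an inconsistent header
        raise KeyError("not present in the bigwig header: %s" % ", ".join(missing))
    names = [c.encode("utf-8") for c in chrom_order]
    sizes = [bigwig_chroms[c] for c in chrom_order]
    return names, sizes


# Hypothetical header with two chromosomes and one leftover scaffold
header = {"chr1": 1000, "chr2": 800, "scaffold_1": 50}
names, sizes = select_chroms(header, ["chr1", "chr2"])
print(names, sizes)
```

The two returned lists would then be what gets assigned to d.attrs['chrom-names'] and d.attrs['chrom-sizes'], keeping the attribute small as long as the reduced assembly itself is small.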

rdacemel commented 7 years ago

Hi! Thanks for looking into this issue. Your solution does not seem to work in my case, though. Note that the following line of the traceback is the line of code suggested above:

  File "/home/rafa/programs/miniconda2/lib/python2.7/site-packages/clodius/cli/aggregate.py", line 575, in _bigwig
    d.attrs['chrom-names'] = [s.encode('utf-8') for s in nc.get_chromorder(assembly)]

Here is the full thing:

Traceback (most recent call last):
  File "/home/rafa/programs/miniconda2/bin/clodius", line 11, in <module>
    load_entry_point('clodius==0.6.5', 'console_scripts', 'clodius')()
  File "/home/rafa/programs/miniconda2/lib/python2.7/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/rafa/programs/miniconda2/lib/python2.7/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/rafa/programs/miniconda2/lib/python2.7/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/rafa/programs/miniconda2/lib/python2.7/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/rafa/programs/miniconda2/lib/python2.7/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/rafa/programs/miniconda2/lib/python2.7/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home/rafa/programs/miniconda2/lib/python2.7/site-packages/clodius/cli/aggregate.py", line 1102, in bigwig
    _bigwig(filepath, chunk_size, zoom_step, tile_size, output_file, assembly, chromsizes_filename, chromosome)
  File "/home/rafa/programs/miniconda2/lib/python2.7/site-packages/clodius/cli/aggregate.py", line 575, in _bigwig
    d.attrs['chrom-names'] = [s.encode('utf-8') for s in nc.get_chromorder(assembly)]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-nCYoKW-build/h5py/_objects.c:2840)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-nCYoKW-build/h5py/_objects.c:2798)
  File "/home/rafa/programs/miniconda2/lib/python2.7/site-packages/h5py/_hl/attrs.py", line 93, in __setitem__
    self.create(name, data=value, dtype=base.guess_dtype(value))
  File "/home/rafa/programs/miniconda2/lib/python2.7/site-packages/h5py/_hl/attrs.py", line 188, in create
    attr = h5a.create(self._id, self._e(tempname), htype, space)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-nCYoKW-build/h5py/_objects.c:2840)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-nCYoKW-build/h5py/_objects.c:2798)
  File "h5py/h5a.pyx", line 47, in h5py.h5a.create (/tmp/pip-nCYoKW-build/h5py/h5a.c:2069)
RuntimeError: Unable to create attribute (Object header message is too large)

Maybe a solution based on this one for cooler files will do the trick:

https://github.com/mirnylab/cooler/issues/76

Thank you so much for your efforts!!