aertslab / pySCENIC

pySCENIC is a lightning-fast python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering) which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0

[BUG]TypeError: Object dtype dtype('O') has no native HDF5 equivalent #173

Open KabitaBaral1 opened 4 years ago

KabitaBaral1 commented 4 years ago

Hi,

I am running pySCENIC following this vignette: https://github.com/aertslab/pySCENIC/blob/master/notebooks/pySCENIC%20-%20Integration%20with%20scanpy.ipynb

Everything runs smoothly until the step that creates the loom file, that is, the following snippet of code:

export2loom(df_tpm.T, regulons, LOOM_FNAME,
            cell_annotations=adata.obs['cell_type'].to_dict(),
            tree_structure=(),
            title='Schwann',
            nomenclature="HGNC", compress=True)

I get the following error message:

Regulon name does not seem to be compatible with SCOPE. It should include a space to allow selection of the TF.
Please run:
regulons = [r.rename(r.name.replace('(+)',' ('+str(len(r))+'g)')) for r in regulons]
or:
regulons = [r.rename(r.name.replace('(',' (')) for r in regulons]

2020-05-19 18:47:22,740 - pyscenic.recovery - WARNING - Less than 80% of the genes in PHLDA2 are present in the expression matrix.
2020-05-19 18:47:22,740 - pyscenic.recovery - WARNING - Less than 80% of the genes in ETS1 are present in the expression matrix.


TypeError Traceback (most recent call last)

in
      3 tree_structure=(),
      4 title='Schwann',
----> 5 nomenclature="HGNC", compress=True)

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pyscenic/export.py in export2loom(ex_mtx, regulons, out_fname, cell_annotations, tree_structure, title, nomenclature, num_workers, embeddings, auc_mtx, auc_thresholds, compress)
    190 row_attrs=row_attrs,
    191 col_attrs=column_attrs,
--> 192 file_attrs=general_attrs)
    193
    194 #TODO: remove duplication with export2loom function!

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/loompy/loompy.py in create(filename, layers, row_attrs, col_attrs, file_attrs)
   1066
   1067 for key, vals in col_attrs.items():
-> 1068 ds.ca[key] = vals
   1069
   1070 except ValueError as ve:

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/loompy/attribute_manager.py in __setitem__(self, name, val)
    127 Set the value of a named attribute
    128 """
--> 129 return self.__setattr__(name, val)
    130
    131 def __setattr__(self, name: str, val: np.ndarray) -> None:

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/loompy/attribute_manager.py in __setattr__(self, name, val)
    157 self.ds._file.create_dataset(a + name, data=values, dtype=h5py.special_dtype(vlen=str))
    158 else:
--> 159 self.ds._file[a + name] = values
    160 self.ds._file[a + name].attrs["last_modified"] = timestamp()
    161 self.ds._file[a].attrs["last_modified"] = timestamp()

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/h5py/_hl/group.py in __setitem__(self, name, obj)
    385
    386 else:
--> 387 ds = self.create_dataset(None, data=obj, dtype=base.guess_dtype(obj))
    388 h5o.link(ds.id, self.id, name, lcpl=lcpl)
    389

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/h5py/_hl/group.py in create_dataset(self, name, shape, dtype, data, **kwds)
    134
    135 with phil:
--> 136 dsid = dataset.make_new_dset(self, shape, dtype, data, **kwds)
    137 dset = dataset.Dataset(dsid)
    138 if name is not None:

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/h5py/_hl/dataset.py in make_new_dset(parent, shape, dtype, data, chunks, compression, shuffle, fletcher32, maxshape, compression_opts, fillvalue, scaleoffset, track_times, external, track_order, dcpl)
    116 else:
    117 dtype = numpy.dtype(dtype)
--> 118 tid = h5t.py_create(dtype, logical=1)
    119
    120 # Legacy

h5py/h5t.pyx in h5py.h5t.py_create()
h5py/h5t.pyx in h5py.h5t.py_create()
h5py/h5t.pyx in h5py.h5t.py_create()
h5py/h5t.pyx in h5py.h5t._c_compound()
h5py/h5t.pyx in h5py.h5t.py_create()
h5py/h5t.pyx in h5py.h5t.py_create()

TypeError: Object dtype dtype('O') has no native HDF5 equivalent

cflerin commented 4 years ago

Please complete the following information from the issue template. It's quite difficult to provide any help without this.

KabitaBaral1 commented 4 years ago

Hi,

pySCENIC version I used: 0.10.0+9.gceaad7d
Installation method: Conda
Run environment: Jupyter notebook, except grn and ctx, which were run in the CLI
OS: macOS Catalina 10.15.2
Package versions:

Name Version Build Channel
anndata 0.7.1 py36_0 conda-forge
arboreto 0.1.5 pypi_0 pypi
attrs 19.3.0 pypi_0 pypi
blosc 1.18.1 h4a8c4bd_0 conda-forge
bokeh 2.0.1 py36h9f0ad1d_0 conda-forge
boltons 20.1.0 pypi_0 pypi
bzip2 1.0.8 h0b31af3_2 conda-forge
ca-certificates 2020.4.5.1 hecc5488_0 conda-forge
cairo 1.16.0 hec6a9b0_1003 conda-forge
certifi 2020.4.5.1 py36h9f0ad1d_0 conda-forge
cffi 1.14.0 py36h356ff06_0 conda-forge
click 7.1.1 pyh8c360ce_0 conda-forge
cloudpickle 1.4.0.dev0 pypi_0 pypi
cycler 0.10.0 py_2 conda-forge
cytoolz 0.10.1 py36h0b31af3_0 conda-forge
dask 1.0.0 pypi_0 pypi
decorator 4.4.2 py_0 conda-forge
dill 0.3.1.1 pypi_0 pypi
distributed 1.0.0 pypi_0 pypi
fontconfig 2.13.1 h6b1039f_1001 conda-forge
freetype 2.10.1 h8da9a1a_0 conda-forge
frozendict 1.2 pypi_0 pypi
fsspec 0.7.2 py_0 conda-forge
gettext 0.19.8.1 h46ab8bc_1002 conda-forge
glib 2.58.3 py36hb0ce7ff_1004 conda-forge
gmp 6.2.0 h4a8c4bd_2 conda-forge
graphviz 2.42.3 h98dfb87_0 conda-forge
h5py 2.10.0 nompi_py36h106b333_102 conda-forge
hdf5 1.10.5 nompi_h3e39495_1104 conda-forge
heapdict 1.0.1 py_0 conda-forge
icu 64.2 h6de7cb9_1 conda-forge
igraph 0.7.1 h91b20c2_1007 conda-forge
importlib-metadata 1.6.0 py36h9f0ad1d_0 conda-forge
importlib_metadata 1.6.0 0 conda-forge
interlap 0.2.6 pypi_0 pypi
jinja2 2.11.1 py_0 conda-forge
joblib 0.14.1 py_0 conda-forge
jpeg 9c h1de35cc_1001 conda-forge
kiwisolver 1.2.0 py36h863e41a_0 conda-forge
libblas 3.8.0 16_openblas conda-forge
libcblas 3.8.0 16_openblas conda-forge
libcxx 10.0.0 0 conda-forge
libffi 3.2.1 h4a8c4bd_1007 conda-forge
libgfortran 4.0.0 2 conda-forge
libiconv 1.15 h0b31af3_1006 conda-forge
liblapack 3.8.0 16_openblas conda-forge
libllvm8 8.0.1 h770b8ee_0 conda-forge
libopenblas 0.3.9 h3d69b6c_0 conda-forge
libpng 1.6.37 hbbe82c9_1 conda-forge
libtiff 4.1.0 h2ae36a8_6 conda-forge
libwebp-base 1.1.0 h0b31af3_3 conda-forge
libxml2 2.9.10 h53d96d6_0 conda-forge
llvm-openmp 9.0.1 h28b9765_2 conda-forge
llvmlite 0.31.0 py36hde82470_1 conda-forge
locket 0.2.0 py_2 conda-forge
loompy 3.0.6 pypi_0 pypi
louvain 0.6.1 py36h4a8c4bd_2 conda-forge
lz4-c 1.8.3 h6de7cb9_1001 conda-forge
markupsafe 1.1.1 py36h37b9a7d_1 conda-forge
matplotlib 3.2.1 0 conda-forge
matplotlib-base 3.2.1 py36h83d3ec1_0 conda-forge
mock 3.0.5 py36h9f0ad1d_1 conda-forge
msgpack 0.6.1 pypi_0 pypi
msgpack-python 0.5.6 pypi_0 pypi
multicore-tsne 0.1_d4ff4aab py36h3e44d54_0 conda-forge
multicoretsne 0.1 pypi_0 pypi
multiprocessing-on-dill 3.5.0a4 pypi_0 pypi
natsort 7.0.1 py_0 conda-forge
ncurses 6.1 h0a44026_1002 conda-forge
networkx 2.4 py_1 conda-forge
numba 0.48.0 py36h4f17bb1_0 conda-forge
numexpr 2.7.1 py36hcc1bba6_1 conda-forge
numpy 1.18.1 py36hdc5ca10_1 conda-forge
numpy-groupies 0+unknown pypi_0 pypi
olefile 0.46 py_0 conda-forge
openssl 1.1.1f h0b31af3_0 conda-forge
packaging 20.1 py_0 conda-forge
pandas 0.25.3 pypi_0 pypi
partd 1.1.0 py_0 conda-forge
patsy 0.5.1 py_0 conda-forge
pcre 8.44 h4a8c4bd_0 conda-forge
pillow 7.1.1 py36h2ae5dfa_0 conda-forge
pip 20.0.2 py_2 conda-forge
pixman 0.38.0 h01d97ff_1003 conda-forge
psutil 5.7.0 py36h37b9a7d_1 conda-forge
pyarrow 0.16.0 pypi_0 pypi
pycairo 1.19.1 py36h1ef2672_3 conda-forge
pycparser 2.20 py_0 conda-forge
pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge
pyscenic 0.10.0+9.gceaad7d pypi_0 pypi
pytables 3.6.1 py36h6f8395a_1 conda-forge
python 3.6.10 h4334963_1010_cpython conda-forge
python-dateutil 2.8.1 py_0 conda-forge
python-graphviz 0.13.2 py_0 conda-forge
python-igraph 0.8.0 py36hffd003b_1 conda-forge
python_abi 3.6 1_cp36m conda-forge
pytz 2019.3 py_0 conda-forge
pyyaml 5.3.1 py36h37b9a7d_0 conda-forge
readline 8.0 hcfe32e1_0 conda-forge
scanpy 1.4.4.post1 py_0 bioconda
scikit-learn 0.22.2.post1 py36h3dc85bc_0 conda-forge
scipy 1.4.1 py36h1dac7e4_2 conda-forge
seaborn 0.10.0 py_1 conda-forge
setuptools 46.1.3 py36h9f0ad1d_0 conda-forge
six 1.14.0 py_1 conda-forge
sklearn 0.0 pypi_0 pypi
snakeviz 2.0.1 py_0 conda-forge
sortedcontainers 2.1.0 py_0 conda-forge
sqlite 3.30.1 h93121df_0 conda-forge
statsmodels 0.11.1 py36h37b9a7d_1 conda-forge
tbb 2019.0 pypi_0 pypi
tblib 1.6.0 py_0 conda-forge
texttable 1.6.2 py_0 conda-forge
tk 8.6.10 hbbe82c9_0 conda-forge
toolz 0.10.0 py_0 conda-forge
tornado 6.0.3 pypi_0 pypi
tqdm 4.45.0 pyh9f0ad1d_0 conda-forge
typing 3.6.4 py36_0 conda-forge
typing_extensions 3.7.4.1 py36h9f0ad1d_3 conda-forge
umap-learn 0.4.0 py36h9f0ad1d_0 conda-forge
wheel 0.34.2 py_1 conda-forge
xz 5.2.5 h0b31af3_0 conda-forge
yaml 0.2.2 h0b31af3_1 conda-forge
zict 2.0.0 py_0 conda-forge
zipp 3.1.0 py_0 conda-forge
zlib 1.2.11 h0b31af3_1006 conda-forge
zstd 1.4.4 hed8d7c8_2 conda-forge

Thank you

cflerin commented 4 years ago

Thanks for the additional information. First, I would run one of the suggested commands from this warning (if you care about SCope viewer compatibility):

Regulon name does not seem to be compatible with SCOPE. It should include a space to allow selection of the TF.
Please run:
regulons = [r.rename(r.name.replace('(+)',' ('+str(len(r))+'g)')) for r in regulons]
or:
regulons = [r.rename(r.name.replace('(',' (')) for r in regulons]
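To make the effect of the two suggested commands concrete, here is a minimal sketch that runs them on a stand-in object. `ToyRegulon` is a hypothetical class written only for this illustration (it is not pySCENIC's regulon class); it assumes nothing beyond a `name`, a `rename` method that returns a renamed copy, and a length equal to the number of genes.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyRegulon:
    """Hypothetical stand-in for a regulon: a name plus its target genes."""
    name: str
    genes: tuple

    def rename(self, name):
        # Return a renamed copy, leaving the original untouched.
        return replace(self, name=name)

    def __len__(self):
        return len(self.genes)


regulons = [ToyRegulon("ETS1(+)", ("GENE_A", "GENE_B", "GENE_C"))]

# First suggestion: replace "(+)" with the gene count, preceded by a space.
with_counts = [r.rename(r.name.replace('(+)', ' (' + str(len(r)) + 'g)')) for r in regulons]

# Second suggestion: only insert a space before the opening parenthesis.
with_space = [r.rename(r.name.replace('(', ' (')) for r in regulons]

print(with_counts[0].name)
print(with_space[0].name)
```

Both commands build a new list, so the result has to be assigned back to `regulons` (as in the warning text) before calling export2loom for the message to go away.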

Second, try downgrading your h5py package. Yours looks to be version 2.10.0 which I recall having some issues with. Try h5py==2.9.0 instead.
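Before and after downgrading, it may be worth confirming which interpreter and package versions the notebook kernel actually uses: the traceback paths point to /Library/Frameworks/Python.framework/Versions/3.7, while the package list above reports python 3.6.10 from conda-forge, so the kernel might not be running inside the conda environment. A minimal sketch, using only the standard library (Python 3.8+ for `importlib.metadata`); the helper name `installed_versions` is made up for this example:

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if not found."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Run this in a notebook cell: it shows the interpreter behind the kernel
# and the versions that interpreter can see.
print(sys.executable)
print(installed_versions(["h5py", "loompy", "pyscenic"]))
```

If the printed interpreter path or h5py version differs from what `conda list` shows, the downgrade was applied to a different environment than the one the notebook is running in.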

KabitaBaral1 commented 4 years ago

Hi,

I tried both suggestions and still get the regulon message and the same error. Not sure if it is relevant, but along with the regulon name message I get errors like the following, for about 20 genes:

2020-05-20 15:15:15,932 - pyscenic.recovery - WARNING - Less than 80% of the genes in IRF9 are present in the expression matrix.
2020-05-20 15:15:15,936 - pyscenic.recovery - WARNING - Less than 80% of the genes in PAX3 are present in the expression matrix.

Regulon name does not seem to be compatible with SCOPE. It should include a space to allow selection of the TF.
Please run:
regulons = [r.rename(r.name.replace('(+)',' ('+str(len(r))+'g)')) for r in regulons]
or:
regulons = [r.rename(r.name.replace('(',' (')) for r in regulons]


TypeError Traceback (most recent call last)

in
      3 tree_structure=(),
      4 title='Schwann',
----> 5 nomenclature="HGNC", compress=True)

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pyscenic/export.py in export2loom(ex_mtx, regulons, out_fname, cell_annotations, tree_structure, title, nomenclature, num_workers, embeddings, auc_mtx, auc_thresholds, compress)
    190 row_attrs=row_attrs,
    191 col_attrs=column_attrs,
--> 192 file_attrs=general_attrs)
    193
    194 #TODO: remove duplication with export2loom function!

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/loompy/loompy.py in create(filename, layers, row_attrs, col_attrs, file_attrs)
   1066
   1067 for key, vals in col_attrs.items():
-> 1068 ds.ca[key] = vals
   1069
   1070 except ValueError as ve:

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/loompy/attribute_manager.py in __setitem__(self, name, val)
    127 Set the value of a named attribute
    128 """
--> 129 return self.__setattr__(name, val)
    130
    131 def __setattr__(self, name: str, val: np.ndarray) -> None:

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/loompy/attribute_manager.py in __setattr__(self, name, val)
    157 self.ds._file.create_dataset(a + name, data=values, dtype=h5py.special_dtype(vlen=str))
    158 else:
--> 159 self.ds._file[a + name] = values
    160 self.ds._file[a + name].attrs["last_modified"] = timestamp()
    161 self.ds._file[a].attrs["last_modified"] = timestamp()

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/h5py/_hl/group.py in __setitem__(self, name, obj)
    385
    386 else:
--> 387 ds = self.create_dataset(None, data=obj, dtype=base.guess_dtype(obj))
    388 h5o.link(ds.id, self.id, name, lcpl=lcpl)
    389

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/h5py/_hl/group.py in create_dataset(self, name, shape, dtype, data, **kwds)
    134
    135 with phil:
--> 136 dsid = dataset.make_new_dset(self, shape, dtype, data, **kwds)
    137 dset = dataset.Dataset(dsid)
    138 if name is not None:

/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/h5py/_hl/dataset.py in make_new_dset(parent, shape, dtype, data, chunks, compression, shuffle, fletcher32, maxshape, compression_opts, fillvalue, scaleoffset, track_times, external, track_order, dcpl)
    116 else:
    117 dtype = numpy.dtype(dtype)
--> 118 tid = h5t.py_create(dtype, logical=1)
    119
    120 # Legacy

h5py/h5t.pyx in h5py.h5t.py_create()
h5py/h5t.pyx in h5py.h5t.py_create()
h5py/h5t.pyx in h5py.h5t.py_create()
h5py/h5t.pyx in h5py.h5t._c_compound()
h5py/h5t.pyx in h5py.h5t.py_create()
h5py/h5t.pyx in h5py.h5t.py_create()

TypeError: Object dtype dtype('O') has no native HDF5 equivalent

cflerin commented 4 years ago

First, the messages about "Less than 80% of the genes in ..." are just warnings and are not errors.

I'd have to test the export2loom function a bit more thoroughly to figure out why it's giving this error. But alternatively, if you want to create a loom file, there is a section in this tutorial which covers this (although it uses a Scanpy object, which may not be helpful here).
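In the meantime, one way to narrow this down: the traceback fails while loompy writes a column attribute and passes through `h5py.h5t._c_compound()`, which suggests (as a guess, not a confirmed diagnosis) that one attribute is a structured NumPy array with an object-typed field. The sketch below is not pySCENIC or loompy code; `find_object_attrs` and the example `col_attrs` dictionary are hypothetical, and it only shows how to spot object dtypes in a dictionary of arrays.

```python
import numpy as np


def object_fields(values):
    """List the parts of an array that have object dtype.

    For a structured (compound) array, return the names of object-typed
    fields; for a plain array, return ["<array>"] if it is object-typed.
    """
    arr = np.asarray(values)
    if arr.dtype.names:  # structured dtype: inspect each field
        return [n for n in arr.dtype.names if arr.dtype[n].kind == "O"]
    return ["<array>"] if arr.dtype.kind == "O" else []


def find_object_attrs(attrs):
    """Return {attribute name: object-typed parts} for offending attributes."""
    found = {}
    for key, values in attrs.items():
        fields = object_fields(values)
        if fields:
            found[key] = fields
    return found


# Hypothetical column attributes, standing in for what gets written to the loom file.
col_attrs = {
    "CellID": np.array(["cell_1", "cell_2"]),
    "ClusterID": np.array([0, 1]),
    "Broken": np.array([("a", 1), ("b", 2)], dtype=[("label", "O"), ("n", "i4")]),
}

print(find_object_attrs(col_attrs))
```

If a check like this flags the cell annotations, converting their values to plain strings before export (for example with `str(...)` on each value of the `cell_annotations` dictionary) would be one thing to try.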