saezlab / progeny-py

PROGENY Python implementation
MIT License
12 stars, 3 forks

Mouse #3

Closed terooatt closed 3 years ago

terooatt commented 3 years ago

Thank you for this package.

I have no problem loading the human regulon but the mouse gives me the following error.

In [4]: dorothea_hs = dorothea.load_regulons(
   ...:     organism='Mouse',  # If working with mouse, set to Mouse
   ...:     commercial=False  # If non-academia, set to True
   ...: )
   ...:
   ...: dorothea_hs


ValueError                                Traceback (most recent call last)
in
      1 dorothea_hs = dorothea.load_regulons(
      2     organism='Mouse', # If working with mouse, set to Mouse
----> 3     commercial=False # If non-academia, set to True
      4 )
      5

~/miniconda3/envs/scanpy/lib/python3.7/site-packages/dorothea/dorothea.py in load_regulons(levels, organism, commercial)
     57
     58     # Transform to binary dataframe
---> 59     dorothea_df = df.pivot(index='target', columns='tf', values='mor')
     60
     61     # Set nans to 0

~/.local/lib/python3.7/site-packages/pandas/core/frame.py in pivot(self, index, columns, values)
   6672         from pandas.core.reshape.pivot import pivot
   6673
-> 6674         return pivot(self, index=index, columns=columns, values=values)
   6675
   6676     _shared_docs[

~/.local/lib/python3.7/site-packages/pandas/core/reshape/pivot.py in pivot(data, index, columns, values)
    475         else:
    476             indexed = data._constructor_sliced(data[values]._values, index=index)
--> 477     return indexed.unstack(columns)
    478
    479

~/.local/lib/python3.7/site-packages/pandas/core/series.py in unstack(self, level, fill_value)
   3888         from pandas.core.reshape.reshape import unstack
   3889
-> 3890         return unstack(self, level, fill_value)
   3891
   3892     # ----------------------------------------------------------------------

~/.local/lib/python3.7/site-packages/pandas/core/reshape/reshape.py in unstack(obj, level, fill_value)
    423             return _unstack_extension_series(obj, level, fill_value)
    424         unstacker = _Unstacker(
--> 425             obj.index, level=level, constructor=obj._constructor_expanddim,
    426         )
    427         return unstacker.get_result(

~/.local/lib/python3.7/site-packages/pandas/core/reshape/reshape.py in __init__(self, index, level, constructor)
    118             raise ValueError("Unstacked DataFrame is too big, causing int32 overflow")
    119
--> 120         self._make_selectors()
    121
    122     @cache_readonly

~/.local/lib/python3.7/site-packages/pandas/core/reshape/reshape.py in _make_selectors(self)
    167
    168         if mask.sum() < len(self.index):
--> 169             raise ValueError("Index contains duplicate entries, cannot reshape")
    170
    171         self.group_index = comp_index

ValueError: Index contains duplicate entries, cannot reshape

PauBadiaM commented 3 years ago

Hi @terooatt, thanks for trying it out! Good catch! I'm looking into it. It looks like the problem only occurs when using all the confidence levels of the Mouse regulons. In the meantime, you can restrict the levels to A through D:

import dorothea
dorothea.load_regulons(levels=['A', 'B', 'C', 'D'], organism='Mouse')

This should work. By the way, it would be better to open this issue on the dorothea GitHub page, since this repository is about progeny. I will therefore close this issue here.
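
For context, the ValueError in the traceback is raised by pandas `DataFrame.pivot`, which cannot reshape a table when the same (index, column) pair appears more than once. The following is only a minimal sketch on an invented toy table: the column names `tf`, `target` and `mor` come from the traceback, but the rows are hypothetical and are not the actual Mouse regulon data. It reproduces the error and shows one way to locate the offending rows.

```python
import pandas as pd

# Hypothetical toy regulon table. Column names match the traceback,
# but the gene names and values are invented for illustration.
df = pd.DataFrame({
    "tf":     ["Tf1", "Tf1", "Tf2", "Tf1"],
    "target": ["GeneA", "GeneB", "GeneA", "GeneA"],
    "mor":    [1, -1, 1, 1],
})

# Rows whose (target, tf) pair occurs more than once.
dupes = df[df.duplicated(subset=["target", "tf"], keep=False)]

# pivot raises ValueError while the duplicated pair is present.
try:
    df.pivot(index="target", columns="tf", values="mor")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# Dropping repeated pairs (keeping the first occurrence) lets the pivot succeed.
deduped = df.drop_duplicates(subset=["target", "tf"])
wide = deduped.pivot(index="target", columns="tf", values="mor").fillna(0)
```

Whether dropping duplicates is the right treatment for the real Mouse regulons is not stated in this thread; the sketch only illustrates why the reshape fails and how duplicated pairs can be inspected.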

terooatt commented 3 years ago

My bad! Thank you very much, it worked.