aertslab / pySCENIC

pySCENIC is a lightning-fast Python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering), which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
http://scenic.aertslab.org
GNU General Public License v3.0
445 stars 183 forks

[BUG] grnboost2 TypeError: Must supply at least one delayed object #561

Open AAA-3 opened 4 months ago

AAA-3 commented 4 months ago

Describe the bug

Running GRNBoost2 produces an error at the very last step. Reproducible with the tutorial (https://pyscenic.readthedocs.io/en/latest/tutorial.html) using the GSE60361_C1-3005-Expression.txt dataset.

Installing arboreto from source did not help.

Steps to reproduce the behavior

  1. Command run when the error occurred:

```python
import os
import glob
import pickle
import pandas as pd
import numpy as np
from dask.diagnostics import ProgressBar
from arboreto.utils import load_tf_names
from arboreto.algo import grnboost2
from ctxcore.rnkdb import FeatherRankingDatabase as RankingDatabase
from pyscenic.utils import modules_from_adjacencies, load_motifs
from pyscenic.prune import prune2df, df2regulons
from pyscenic.aucell import aucell
import seaborn as sns

DATA_FOLDER = "/home/ali/Dokumente/15d _Organoids_Multiome/Separate/SCENIC/CommittedMesoderm"
RESOURCES_FOLDER = "/home/ali/Dokumente/15d _Organoids_Multiome/Separate/SCENIC/pySCENIC"
MATRIX_FOLDER = "/home/ali/Dokumente/15d _Organoids_Multiome/Separate/SCENIC/pySCENIC/Matrix"
SCHEDULER = "123.122.8.24:8786"
DATABASES_GLOB = os.path.join(RESOURCES_FOLDER, "hg38*.genes_vs_motifs.rankings.feather")
MOTIF_ANNOTATIONS_FNAME = os.path.join(RESOURCES_FOLDER, "motifs-v10nr_clust-nr.hgnc-m0.001-o0.0.tbl")
MM_TFS_FNAME = '/home/ali/pySCENIC/resources/hs_hgnc_tfs.txt'
SC_EXP_FNAME = os.path.join(MATRIX_FOLDER, "Cardiac-Cimmitted Cells_matrix.csv")
REGULONS_FNAME = os.path.join(DATA_FOLDER, "regulons.p")
MOTIFS_FNAME = os.path.join(DATA_FOLDER, "motifs.csv")

ex_matrix = pd.read_csv(SC_EXP_FNAME, sep=',', header=0, index_col=0).T
tf_names = load_tf_names(MM_TFS_FNAME)
db_fnames = glob.glob(DATABASES_GLOB)

def name(fname):
    return os.path.splitext(os.path.basename(fname))[0]

dbs = [RankingDatabase(fname=fname, name=name(fname)) for fname in db_fnames]

adjacencies = grnboost2(ex_matrix, tf_names=tf_names, verbose=True)
...
```


2. Error encountered:
```pytb
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[17], line 2
      1 #single cell expression profiles are used to infer co-expression modules
----> 2 adjacencies = grnboost2(ex_matrix, tf_names=tf_names, verbose=True)

File ~/anaconda3/envs/pySCENIC/lib/python3.12/site-packages/arboreto/algo.py:39, in grnboost2(expression_data, gene_names, tf_names, client_or_address, early_stop_window_length, limit, seed, verbose)
     10 def grnboost2(expression_data,
     11               gene_names=None,
     12               tf_names='all',
   (...)
     16               seed=None,
     17               verbose=False):
     18     """
     19     Launch arboreto with [GRNBoost2] profile.
     20 
   (...)
     36     :return: a pandas DataFrame['TF', 'target', 'importance'] representing the inferred gene regulatory links.
     37     """
---> 39     return diy(expression_data=expression_data, regressor_type='GBM', regressor_kwargs=SGBM_KWARGS,
     40                gene_names=gene_names, tf_names=tf_names, client_or_address=client_or_address,
     41                early_stop_window_length=early_stop_window_length, limit=limit, seed=seed, verbose=verbose)

File ~/anaconda3/envs/pySCENIC/lib/python3.12/site-packages/arboreto/algo.py:120, in diy(expression_data, regressor_type, regressor_kwargs, gene_names, tf_names, client_or_address, early_stop_window_length, limit, seed, verbose)
    117 if verbose:
    118     print('creating dask graph')
--> 120 graph = create_graph(expression_matrix,
    121                      gene_names,
    122                      tf_names,
    123                      client=client,
    124                      regressor_type=regressor_type,
    125                      regressor_kwargs=regressor_kwargs,
    126                      early_stop_window_length=early_stop_window_length,
    127                      limit=limit,
    128                      seed=seed)
    130 if verbose:
    131     print('{} partitions'.format(graph.npartitions))

File ~/anaconda3/envs/pySCENIC/lib/python3.12/site-packages/arboreto/core.py:450, in create_graph(expression_matrix, gene_names, tf_names, regressor_type, regressor_kwargs, client, target_genes, limit, include_meta, early_stop_window_length, repartition_multiplier, seed)
    448 # gather the DataFrames into one distributed DataFrame
    449 all_links_df = from_delayed(delayed_link_dfs, meta=_GRN_SCHEMA)
--> 450 all_meta_df = from_delayed(delayed_meta_dfs, meta=_META_SCHEMA)
    452 # optionally limit the number of resulting regulatory links, descending by top importance
    453 if limit:

File ~/anaconda3/envs/pySCENIC/lib/python3.12/site-packages/dask_expr/io/_delayed.py:115, in from_delayed(dfs, meta, divisions, prefix, verify_meta)
    112     dfs = [dfs]
    114 if len(dfs) == 0:
--> 115     raise TypeError("Must supply at least one delayed object")
    117 if meta is None:
    118     meta = delayed(make_meta)(dfs[0]).compute()

TypeError: Must supply at least one delayed object
```

HowieJM commented 3 months ago

Hi, I'd just add that I also hit this error on a run I tried today. In my case: a Linux virtual machine, pySCENIC installed via pip, and run from the command line rather than through Jupyter.

Also note that I tried `arboreto_with_multiprocessing.py` as well (with an otherwise identical command), which does complete, though rather slowly. For clarity, the error-generating code is below:

Basic Env:

```shell
conda create -y -n pyscenic python=3.10
pip install pyscenic
pip install 'numpy<1.24'
pip install scanpy
```

Command:

```shell
pyscenic grn \
    --output adj.tsv \
    --num_workers 4 \
    --method grnboost2 \
    SG1.loom \
    allTFs_hg38.txt
```
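For reference, the multiprocessing fallback mentioned above sidesteps dask entirely. Assuming the `arboreto_with_multiprocessing.py` script is available in your pySCENIC install, the otherwise-identical invocation looks roughly like the sketch below (flag names are from memory; double-check them against the script's `--help`):

```shell
# Hedged sketch: same inputs as the pyscenic grn command above,
# but run through the dask-free multiprocessing script.
arboreto_with_multiprocessing.py \
    SG1.loom \
    allTFs_hg38.txt \
    --method grnboost2 \
    --output adj.tsv \
    --num_workers 4
```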

Error:

```pytb
2024-07-29 14:14:40,993 - pyscenic.cli.pyscenic - INFO - Loading expression matrix.

2024-07-29 14:15:09,186 - pyscenic.cli.pyscenic - INFO - Inferring regulatory networks.
preparing dask client
parsing input
creating dask graph
not shutting down client, client was created externally
finished
Traceback (most recent call last):
  File "/home/james/miniconda3/envs/pyscenic_standalone/bin/pyscenic", line 8, in <module>
    sys.exit(main())
  File "/home/james/miniconda3/envs/pyscenic_standalone/lib/python3.10/site-packages/pyscenic/cli/pyscenic.py", line 713, in main
    args.func(args)
  File "/home/james/miniconda3/envs/pyscenic_standalone/lib/python3.10/site-packages/pyscenic/cli/pyscenic.py", line 106, in find_adjacencies_command
    network = method(
  File "/home/james/miniconda3/envs/pyscenic_standalone/lib/python3.10/site-packages/arboreto/algo.py", line 39, in grnboost2
    return diy(expression_data=expression_data, regressor_type='GBM', regressor_kwargs=SGBM_KWARGS,
  File "/home/james/miniconda3/envs/pyscenic_standalone/lib/python3.10/site-packages/arboreto/algo.py", line 120, in diy
    graph = create_graph(expression_matrix,
  File "/home/james/miniconda3/envs/pyscenic_standalone/lib/python3.10/site-packages/arboreto/core.py", line 450, in create_graph
    all_meta_df = from_delayed(delayed_meta_dfs, meta=_META_SCHEMA)
  File "/home/james/miniconda3/envs/pyscenic_standalone/lib/python3.10/site-packages/dask_expr/io/_delayed.py", line 115, in from_delayed
    raise TypeError("Must supply at least one delayed object")
TypeError: Must supply at least one delayed object
```
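For what it's worth, the exception in both reports comes from a guard inside dask's `from_delayed`, which rejects an empty list of delayed objects; arboreto ends up handing it an empty list of delayed meta DataFrames. A minimal sketch of that guard, simplified from the traceback above (not dask's actual implementation):

```python
# Simplified sketch of the guard visible in the dask_expr/io/_delayed.py
# frames above: from_delayed refuses an empty list of delayed objects.
def from_delayed(dfs, meta=None):
    if not isinstance(dfs, (list, tuple)):
        dfs = [dfs]  # a single delayed object gets wrapped in a list
    if len(dfs) == 0:
        # This is the error reported in this issue: the list of delayed
        # meta DataFrames that arboreto builds is empty.
        raise TypeError("Must supply at least one delayed object")
    return dfs  # (the real implementation goes on to build a dask DataFrame)
```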
HowieJM commented 3 months ago

Solution: hi, I just thought I'd post a solution. I found it elsewhere in one of the replies to someone else's question:

```shell
pip install dask-expr==0.5.3 distributed==2024.2.1  # deal with dask issues due to recent updates to dask
```

I ran this and then was able to run the standard command rather than the alternative, and it worked. Cheers!
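If you want to confirm the pins actually took effect in your environment, a small helper (my own, hypothetical; not part of pySCENIC or dask) can compare installed versions against the expected ones:

```python
# Hypothetical helper to verify that pinned package versions are active.
from importlib.metadata import PackageNotFoundError, version

def check_pins(pins):
    """Map each package name to (installed_version, wanted_version, matches)."""
    report = {}
    for pkg, wanted in pins.items():
        try:
            installed = version(pkg)
        except PackageNotFoundError:
            installed = None  # package is missing entirely
        report[pkg] = (installed, wanted, installed == wanted)
    return report

# Example: check the pins suggested in this thread.
# check_pins({"dask-expr": "0.5.3", "distributed": "2024.2.1"})
```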

dakota-hawkins commented 3 months ago

> Solution: hi, I just thought I'd post a solution. I found it elsewhere in one of the replies to someone else's question:
>
> `pip install dask-expr==0.5.3 distributed==2024.2.1`
>
> I ran this and then was able to run the standard command rather than the alternative, and it worked. Cheers!

I ran into the same issue, and this seemed to fix it. Thanks!

LenisLin commented 2 months ago

> Solution: hi, I just thought I'd post a solution. I found it elsewhere in one of the replies to someone else's question:
>
> `pip install dask-expr==0.5.3 distributed==2024.2.1`
>
> I ran this and then was able to run the standard command rather than the alternative, and it worked. Cheers!

it works, thanks a lot !!!

anna4kaa commented 1 month ago

> Solution: hi, I just thought I'd post a solution. I found it elsewhere in one of the replies to someone else's question:
>
> `pip install dask-expr==0.5.3 distributed==2024.2.1`
>
> I ran this and then was able to run the standard command rather than the alternative, and it worked. Cheers!

Unfortunately, it didn't work for me, or at least it led to another issue when importing grnboost2:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[1], line 4
      1 import os
      2 import pandas as pd
----> 4 from arboreto.algo import grnboost2
      5 from arboreto.utils import load_tf_names

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/arboreto/algo.py:7
      5 import pandas as pd
      6 from distributed import Client, LocalCluster
----> 7 from arboreto.core import create_graph, SGBM_KWARGS, RF_KWARGS, EARLY_STOP_WINDOW_LENGTH
     10 def grnboost2(expression_data,
     11               gene_names=None,
     12               tf_names='all',
   (...)
     16               seed=None,
     17               verbose=False):
     18     """
     19     Launch arboreto with [GRNBoost2] profile.
     20 
   (...)
     36     :return: a pandas DataFrame['TF', 'target', 'importance'] representing the inferred gene regulatory links.
     37     """

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/arboreto/core.py:12
     10 from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor, ExtraTreesRegressor
     11 from dask import delayed
---> 12 from dask.dataframe import from_delayed
     13 from dask.dataframe.utils import make_meta
     15 logger = logging.getLogger(__name__)

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/dataframe/__init__.py:100
     98 import dask.dataframe._pyarrow_compat
     99 from dask.base import compute
--> 100 from dask.dataframe import backends, dispatch, rolling
    101 from dask.dataframe.core import (
    102     DataFrame,
    103     Index,
   (...)
    109     to_timedelta,
    110 )
    111 from dask.dataframe.groupby import Aggregation

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/dataframe/backends.py:15
     13 from dask.backends import CreationDispatch, DaskBackendEntrypoint
     14 from dask.dataframe._compat import PANDAS_GE_220, is_any_real_numeric_dtype
---> 15 from dask.dataframe.core import DataFrame, Index, Scalar, Series, _Frame
     16 from dask.dataframe.dispatch import (
     17     categorical_dtype_dispatch,
     18     concat,
   (...)
     35     union_categoricals_dispatch,
     36 )
     37 from dask.dataframe.extensions import make_array_nonempty, make_scalar

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/dataframe/core.py:36
     34 from dask.blockwise import Blockwise, BlockwiseDep, BlockwiseDepDict, blockwise
     35 from dask.context import globalmethod
---> 36 from dask.dataframe import methods
     37 from dask.dataframe._compat import (
     38     PANDAS_GE_140,
     39     PANDAS_GE_150,
   (...)
     47     is_string_dtype,
     48 )
     49 from dask.dataframe.accessor import CachedAccessor, DatetimeAccessor, StringAccessor

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/dataframe/methods.py:34
     22 #  preserve compatibility while moving dispatch objects
     23 from dask.dataframe.dispatch import (  # noqa: F401
     24     concat,
     25     concat_dispatch,
   (...)
     32     union_categoricals,
     33 )
---> 34 from dask.dataframe.utils import is_dataframe_like, is_index_like, is_series_like
     35 from dask.utils import _deprecated_kwarg
     37 # cuDF may try to import old dispatch functions

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/dataframe/utils.py:20
     18 from dask.base import get_scheduler, is_dask_collection
     19 from dask.core import get_deps
---> 20 from dask.dataframe import (  # noqa: F401 register pandas extension types
     21     _dtypes,
     22     methods,
     23 )
     24 from dask.dataframe._compat import PANDAS_GE_150, tm  # noqa: F401
     25 from dask.dataframe.dispatch import (  # noqa : F401
     26     make_meta,
     27     make_meta_obj,
     28     meta_nonempty,
     29 )

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/dataframe/_dtypes.py:9
      6 import pandas as pd
      8 from dask.dataframe._compat import PANDAS_GE_150
----> 9 from dask.dataframe.extensions import make_array_nonempty, make_scalar
     12 @make_array_nonempty.register(pd.DatetimeTZDtype)
     13 def _(dtype):
     14     return pd.array([pd.Timestamp(1), pd.NaT], dtype=dtype)

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/dataframe/extensions.py:8
      1 """
      2 Support for pandas ExtensionArray in dask.dataframe.
      3 
      4 See :ref:`extensionarrays` for more.
      5 """
      6 from __future__ import annotations
----> 8 from dask.dataframe.accessor import (
      9     register_dataframe_accessor,
     10     register_index_accessor,
     11     register_series_accessor,
     12 )
     13 from dask.utils import Dispatch
     15 make_array_nonempty = Dispatch("make_array_nonempty")

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/dataframe/accessor.py:126
    113         token = f"{self._accessor_name}-{attr}"
    114         return self._series.map_partitions(
    115             self._delegate_method,
    116             self._accessor_name,
   (...)
    122             token=token,
    123         )
--> 126 class DatetimeAccessor(Accessor):
    127     """Accessor object for datetimelike properties of the Series values.
    128 
    129     Examples
   (...)
    132     >>> s.dt.microsecond  # doctest: +SKIP
    133     """
    135     _accessor_name = "dt"

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/dataframe/accessor.py:81, in Accessor.__init_subclass__(cls, **kwargs)
     79 attr, min_version = item if isinstance(item, tuple) else (item, None)
     80 if not hasattr(cls, attr):
---> 81     _bind_property(cls, pd_cls, attr, min_version)

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/dataframe/accessor.py:35, in _bind_property(cls, pd_cls, attr, min_version)
     33 except Exception:
     34     pass
---> 35 setattr(cls, attr, property(derived_from(pd_cls, version=min_version)(func)))

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/utils.py:987, in derived_from.<locals>.wrapper(method)
    985 try:
    986     extra = getattr(method, "__doc__", None) or ""
--> 987     method.__doc__ = _derived_from(
    988         original_klass,
    989         method,
    990         ua_args=ua_args,
    991         extra=extra,
    992         skipblocks=skipblocks,
    993         inconsistencies=inconsistencies,
    994     )
    995     return method
    997 except AttributeError:

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/utils.py:940, in _derived_from(cls, method, ua_args, extra, skipblocks, inconsistencies)
    938 # Mark unsupported arguments
    939 try:
--> 940     method_args = get_named_args(method)
    941     original_args = get_named_args(original_method)
    942     not_supported = [m for m in original_args if m not in method_args]

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/utils.py:701, in get_named_args(func)
    699 def get_named_args(func) -> list[str]:
    700     """Get all non ``*args/**kwargs`` arguments for a function"""
--> 701     s = inspect.signature(func)
    702     return [
    703         n
    704         for n, p in s.parameters.items()
    705         if p.kind in [p.POSITIONAL_OR_KEYWORD, p.POSITIONAL_ONLY, p.KEYWORD_ONLY]
    706     ]

File ~/anaconda3/envs/arboreto/lib/python3.11/inspect.py:3263, in signature(obj, follow_wrapped, globals, locals, eval_str)
   3261 def signature(obj, *, follow_wrapped=True, globals=None, locals=None, eval_str=False):
   3262     """Get a signature object for the passed callable."""
-> 3263     return Signature.from_callable(obj, follow_wrapped=follow_wrapped,
   3264                                    globals=globals, locals=locals, eval_str=eval_str)

File ~/anaconda3/envs/arboreto/lib/python3.11/inspect.py:3011, in Signature.from_callable(cls, obj, follow_wrapped, globals, locals, eval_str)
   3007 @classmethod
   3008 def from_callable(cls, obj, *,
   3009                   follow_wrapped=True, globals=None, locals=None, eval_str=False):
   3010     """Constructs Signature for the given callable object."""
-> 3011     return _signature_from_callable(obj, sigcls=cls,
   3012                                     follow_wrapper_chains=follow_wrapped,
   3013                                     globals=globals, locals=locals, eval_str=eval_str)

File ~/anaconda3/envs/arboreto/lib/python3.11/inspect.py:2599, in _signature_from_callable(obj, follow_wrapper_chains, skip_bound_arg, globals, locals, eval_str, sigcls)
   2597     call = getattr_static(type(obj), '__call__', None)
   2598     if call is not None:
-> 2599         call = _descriptor_get(call, obj)
   2600         return _get_signature_of(call)
   2602 raise ValueError('callable {!r} is not supported by signature'.format(obj))

File ~/anaconda3/envs/arboreto/lib/python3.11/inspect.py:2432, in _descriptor_get(descriptor, obj)
   2430 if get is _sentinel:
   2431     return descriptor
-> 2432 return get(descriptor, obj, type(obj))

TypeError: descriptor '__call__' for 'type' objects doesn't apply to a 'property' object

Has anyone experienced this?

HowieJM commented 1 month ago
> TypeError: descriptor '__call__' for 'type' objects doesn't apply to a 'property' object
>
> Has anyone experienced this?

Hi Anna, I haven't come across this error, and I'm not a developer for pySCENIC. But one solution that might work for you, if you are comfortable trying a container approach, would be to use a fork I made of the VSN pipeline implementation. This has the advantage that if you want to run multiple times and aggregate, you can also do that "automatically". But you can also just do a single run. Either is possible.

In case that interests you, I made an edit to their README to explain the changes and how to run it. The fork and README are at: https://github.com/HowieJM/vsn-pipelines

In any case, good luck :)

1561-lzx commented 1 week ago

> Solution: `pip install dask-expr==0.5.3 distributed==2024.2.1` [...]
>
> Unfortunately, it didn't work for me, or at least it led to another issue when importing grnboost2: `TypeError: descriptor '__call__' for 'type' objects doesn't apply to a 'property' object`. Has anyone experienced this?

Hello, I also have this issue. Did you manage to resolve it?

HowieJM commented 14 hours ago

> Solution: `pip install dask-expr==0.5.3 distributed==2024.2.1` [...]
>
> Unfortunately, it didn't work for me, or at least it led to another issue when importing grnboost2: `TypeError: descriptor '__call__' for 'type' objects doesn't apply to a 'property' object` [...]
     12 @make_array_nonempty.register(pd.DatetimeTZDtype)
     13 def _(dtype):
     14     return pd.array([pd.Timestamp(1), pd.NaT], dtype=dtype)

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/dataframe/extensions.py:8
      1 """
      2 Support for pandas ExtensionArray in dask.dataframe.
      3 
      4 See :ref:`extensionarrays` for more.
      5 """
      6 from __future__ import annotations
----> 8 from dask.dataframe.accessor import (
      9     register_dataframe_accessor,
     10     register_index_accessor,
     11     register_series_accessor,
     12 )
     13 from dask.utils import Dispatch
     15 make_array_nonempty = Dispatch("make_array_nonempty")

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/dataframe/accessor.py:126
    113         token = f"{self._accessor_name}-{attr}"
    114         return self._series.map_partitions(
    115             self._delegate_method,
    116             self._accessor_name,
   (...)
    122             token=token,
    123         )
--> 126 class DatetimeAccessor(Accessor):
    127     """Accessor object for datetimelike properties of the Series values.
    128 
    129     Examples
   (...)
    132     >>> s.dt.microsecond  # doctest: +SKIP
    133     """
    135     _accessor_name = "dt"

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/dataframe/accessor.py:81, in Accessor.__init_subclass__(cls, **kwargs)
     79 attr, min_version = item if isinstance(item, tuple) else (item, None)
     80 if not hasattr(cls, attr):
---> 81     _bind_property(cls, pd_cls, attr, min_version)

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/dataframe/accessor.py:35, in _bind_property(cls, pd_cls, attr, min_version)
     33 except Exception:
     34     pass
---> 35 setattr(cls, attr, property(derived_from(pd_cls, version=min_version)(func)))

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/utils.py:987, in derived_from.<locals>.wrapper(method)
    985 try:
    986     extra = getattr(method, "__doc__", None) or ""
--> 987     method.__doc__ = _derived_from(
    988         original_klass,
    989         method,
    990         ua_args=ua_args,
    991         extra=extra,
    992         skipblocks=skipblocks,
    993         inconsistencies=inconsistencies,
    994     )
    995     return method
    997 except AttributeError:

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/utils.py:940, in _derived_from(cls, method, ua_args, extra, skipblocks, inconsistencies)
    938 # Mark unsupported arguments
    939 try:
--> 940     method_args = get_named_args(method)
    941     original_args = get_named_args(original_method)
    942     not_supported = [m for m in original_args if m not in method_args]

File ~/anaconda3/envs/arboreto/lib/python3.11/site-packages/dask/utils.py:701, in get_named_args(func)
    699 def get_named_args(func) -> list[str]:
    700     """Get all non ``*args/**kwargs`` arguments for a function"""
--> 701     s = inspect.signature(func)
    702     return [
    703         n
    704         for n, p in s.parameters.items()
    705         if p.kind in [p.POSITIONAL_OR_KEYWORD, p.POSITIONAL_ONLY, p.KEYWORD_ONLY]
    706     ]

File ~/anaconda3/envs/arboreto/lib/python3.11/inspect.py:3263, in signature(obj, follow_wrapped, globals, locals, eval_str)
   3261 def signature(obj, *, follow_wrapped=True, globals=None, locals=None, eval_str=False):
   3262     """Get a signature object for the passed callable."""
-> 3263     return Signature.from_callable(obj, follow_wrapped=follow_wrapped,
   3264                                    globals=globals, locals=locals, eval_str=eval_str)

File ~/anaconda3/envs/arboreto/lib/python3.11/inspect.py:3011, in Signature.from_callable(cls, obj, follow_wrapped, globals, locals, eval_str)
   3007 @classmethod
   3008 def from_callable(cls, obj, *,
   3009                   follow_wrapped=True, globals=None, locals=None, eval_str=False):
   3010     """Constructs Signature for the given callable object."""
-> 3011     return _signature_from_callable(obj, sigcls=cls,
   3012                                     follow_wrapper_chains=follow_wrapped,
   3013                                     globals=globals, locals=locals, eval_str=eval_str)

File ~/anaconda3/envs/arboreto/lib/python3.11/inspect.py:2599, in _signature_from_callable(obj, follow_wrapper_chains, skip_bound_arg, globals, locals, eval_str, sigcls)
   2597     call = getattr_static(type(obj), '__call__', None)
   2598     if call is not None:
-> 2599         call = _descriptor_get(call, obj)
   2600         return _get_signature_of(call)
   2602 raise ValueError('callable {!r} is not supported by signature'.format(obj))

File ~/anaconda3/envs/arboreto/lib/python3.11/inspect.py:2432, in _descriptor_get(descriptor, obj)
   2430 if get is _sentinel:
   2431     return descriptor
-> 2432 return get(descriptor, obj, type(obj))

TypeError: descriptor '__call__' for 'type' objects doesn't apply to a 'property' object

Has anyone experienced this?

Hello, I'm running into the same error. Did you manage to resolve it?

See the comments thread above :)
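For anyone landing here from search: the traceback bottoms out in `inspect.signature()` while dask is building docstrings at import time, before any GRNBoost2 code runs. In similar reports, this `TypeError: descriptor '__call__' for 'type' objects doesn't apply to a 'property' object` comes from pairing an older dask release with a newer Python patch release (3.11.9+/3.12.3+) that tightened how `inspect.signature()` handles property descriptors. Below is a minimal, dependency-free sketch of the version check; the dask 2024.4 threshold and the affected Python patch versions are assumptions based on similar reports, not confirmed by the maintainers here.

```python
def likely_affected(py_version, dask_version):
    """Heuristic check for the dask import-time TypeError.

    Assumption: Python 3.11.9+ (or 3.12.3+) tightened inspect.signature()
    on property descriptors, which dask releases before ~2024.4 trip over.
    """
    py = tuple(py_version)
    # Major.minor of the installed dask, e.g. "2023.5.0" -> (2023, 5)
    dv = tuple(int(p) for p in dask_version.split(".")[:2])
    # Patch releases believed to carry the stricter inspect behavior
    tightened = py >= (3, 11, 9) if py[:2] == (3, 11) else py >= (3, 12, 3)
    return tightened and dv < (2024, 4)

print(likely_affected((3, 11, 9), "2023.5.0"))  # True: old dask, patched Python
print(likely_affected((3, 11, 8), "2023.5.0"))  # False: older Python is fine
```

If the combination looks affected, upgrading dask/distributed (`pip install --upgrade dask distributed`) or pinning the environment to an earlier Python patch release (e.g. 3.11.8) has resolved the import error in similar reports.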