morris-lab / CellOracle

This is the alpha version of the CellOracle package

TypeError: BaggingRegressor 'base_estimator' #178

Closed (iichelhadi closed this issue 8 months ago)

iichelhadi commented 8 months ago

Hello, I get the following error when I run oracle.get_links(). I hope someone can help me with this.

Regards

%%time
# Calculate GRN for each population in the "clusters" clustering unit.
# This step may take a long time.
links = oracle.get_links(cluster_name_for_GRN_unit="clusters", alpha=10,
                         verbose_level=10, test_mode=False, n_jobs=5)
  0%|          | 0/13 [00:00<?, ?it/s]

Inferring GRN for 1...

  0%|          | 0/2931 [00:00<?, ?it/s]

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File <timed exec>:3

File ~/miniconda3/envs/bio/lib/python3.10/site-packages/celloracle/trajectory/oracle_core.py:1467, in Oracle.get_links(self, cluster_name_for_GRN_unit, alpha, bagging_number, verbose_level, test_mode, model_method, ignore_warning, n_jobs)
   1463     if info["n_target_genes_both_TFdict_and_scRNA-seq"] == '0 genes':
   1464         raise ValueError("Found No overlap between TF info (base GRN) and your scRNA-seq data. Please check your data format and species.")
-> 1467 links = get_links(oracle_object=self,
   1468                   cluster_name_for_GRN_unit=cluster_name_for_GRN_unit,
   1469                   alpha=alpha, bagging_number=bagging_number,
   1470                   verbose_level=verbose_level, test_mode=test_mode,
   1471                   model_method=model_method,
   1472                   n_jobs=n_jobs)
   1473 return links

File ~/miniconda3/envs/bio/lib/python3.10/site-packages/celloracle/network_analysis/network_construction.py:74, in get_links(oracle_object, cluster_name_for_GRN_unit, alpha, bagging_number, verbose_level, test_mode, model_method, n_jobs)
     71     cluster_name_for_GRN_unit = oracle_object.cluster_column_name
     73 # calculate GRN for each cluster
---> 74 linkLists = _fit_GRN_for_network_analysis(oracle_object, cluster_name_for_GRN_unit=cluster_name_for_GRN_unit,
     75                               alpha=alpha, bagging_number=bagging_number,  verbose_level=verbose_level, test_mode=test_mode,
     76                               model_method=model_method, n_jobs=n_jobs)
     78 # initiate links object
     79 links = Links(name=cluster_name_for_GRN_unit,
     80              links_dict=linkLists)

File ~/miniconda3/envs/bio/lib/python3.10/site-packages/celloracle/network_analysis/network_construction.py:138, in _fit_GRN_for_network_analysis(oracle_object, cluster_name_for_GRN_unit, alpha, bagging_number, verbose_level, test_mode, model_method, n_jobs)
    131 gem_std = gem_imputed_std[cells_in_the_cluster_bool]
    134 tn_ = Net(gene_expression_matrix=gem_,
    135              gem_standerdized=gem_std,
    136              TFinfo_dic=oracle_object.TFdict,
    137              verbose=False)
--> 138 tn_.fit_All_genes(bagging_number=bagging_number,
    139                   model_method=model_method,
    140                   alpha=alpha,
    141                   verbose=verbose,
    142                   n_jobs=n_jobs)
    145 #oracle_object.linkMat[cluster] = tn_.returnResultAs_TGxTFs("coef_abs")
    146 tn_.updateLinkList(verbose=False)

File ~/miniconda3/envs/bio/lib/python3.10/site-packages/celloracle/network/net_core.py:312, in Net.fit_All_genes(self, bagging_number, scaling, model_method, command_line_mode, log, alpha, verbose, n_jobs)
    295 def fit_All_genes(self, bagging_number=200, scaling=True, model_method="bagging_ridge",
    296                   command_line_mode=False, log=None, alpha=1, verbose=True, n_jobs=-1):
    297     """
    298     Make ML models for all genes.
    299     The calculation will be performed in parallel using scikit-learn bagging function.
   (...)
    310         n_jobs (int): Number of cpu cores for parallel calculation. -1 means using all available cores.
    311     """
--> 312     self.fit_genes(target_genes=self.all_genes,
    313                    bagging_number=bagging_number,
    314                    scaling=scaling,
    315                    model_method=model_method,
    316                    save_coefs=False,
    317                    command_line_mode=command_line_mode,
    318                    log=log,
    319                    alpha=alpha,
    320                    verbose=verbose,
    321                    n_jobs=n_jobs)

File ~/miniconda3/envs/bio/lib/python3.10/site-packages/celloracle/network/net_core.py:422, in Net.fit_genes(self, target_genes, bagging_number, scaling, model_method, save_coefs, command_line_mode, log, alpha, verbose, n_jobs)
    419     loop = genes
    421 for target_gene in loop:
--> 422     coefs = _get_bagging_ridge_coefs(target_gene=target_gene,
    423                                      gem=self.gem,
    424                                      gem_scaled=self.gem_standerdized,
    425                                      TFdict=self.TFdict,
    426                                      cellstate=self.cellstate,
    427                                      bagging_number=bagging_number,
    428                                      scaling=scaling,
    429                                      n_jobs=n_jobs,
    430                                      alpha=alpha,
    431                                      solver=RIDGE_SOLVER)
    433     if isinstance(coefs, int):
    434         self.failed_genes.append(target_gene)

File ~/miniconda3/envs/bio/lib/python3.10/site-packages/celloracle/network/regression_models.py:118, in get_bagging_ridge_coefs(target_gene, gem, gem_scaled, TFdict, cellstate, bagging_number, scaling, n_jobs, alpha, solver)
    114 label = gem[target_gene]
    116 #print(n_jobs)
    117 # bagging model
--> 118 model = BaggingRegressor(base_estimator=Ridge(alpha=alpha,
    119                                               solver=solver,
    120                                               random_state=123),
    121                          n_estimators=bagging_number,
    122                          bootstrap=True,
    123                          max_features=0.8,
    124                          n_jobs=n_jobs,
    125                          verbose=False,
    126                          random_state=123)
    127 model.fit(data, label)
    129 # get results

TypeError: BaggingRegressor.__init__() got an unexpected keyword argument 'base_estimator'

iichelhadi commented 8 months ago

Just as an FYI, here is the output of co.check_python_requirements():

    package_name  installed_version  required_version  requirement_satisfied
0   numpy         1.26.3             auto              True
1   scipy         1.11.4             auto              True
2   cython        3.0.8              auto              True
3   numba         0.58.1             0.50.1            True
4   matplotlib    3.6.3              auto              True
5   seaborn       0.13.1             auto              True
6   scikit-learn  1.4.0              auto              True
7   h5py          3.10.0             3.1.0             True
8   pandas        1.5.3              1.0.3             True
9   velocyto      0.17.17            0.17              True
10  umap-learn    0.5.5              auto              True
11  pyarrow       14.0.2             0.17              True
12  tqdm          4.66.1             4.45              True
13  igraph        0.10.8             0.10.1            True
14  louvain       0.8.1              auto              True
15  jupyter       1.0.0              auto              True
16  anndata       0.10.4             0.7.5             True
17  scanpy        1.9.6              1.6               True
18  joblib        1.3.2              auto              True
19  goatools      1.3.11             auto              True
20  genomepy      0.16.1             0.8.4             True
21  gimmemotifs   0.17.0             0.14.4            True

iichelhadi commented 8 months ago

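I found the cause of the error. From the scikit-learn documentation for BaggingRegressor:

New in version 1.2: base_estimator was renamed to estimator. https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingRegressor.html

The scikit-learn 1.4.0 installed in my environment no longer accepts the old base_estimator keyword, which is why the BaggingRegressor call in regression_models.py fails.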
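
For anyone who needs to patch regression_models.py by hand, a version-tolerant construction might look roughly like the sketch below. It mirrors the BaggingRegressor call from the traceback, but it is not necessarily how the maintainers fixed it. The helper names estimator_keyword and make_bagging_ridge are hypothetical, and solver="auto" is an assumption standing in for CellOracle's RIDGE_SOLVER constant, whose value is not shown in this thread.

```python
import inspect


def estimator_keyword(ensemble_cls):
    """Return the keyword an ensemble class uses for its wrapped model.

    Releases that know the new name accept `estimator`; older ones only
    accept `base_estimator`. (Hypothetical helper, not part of CellOracle.)
    """
    params = inspect.signature(ensemble_cls.__init__).parameters
    return "estimator" if "estimator" in params else "base_estimator"


def make_bagging_ridge(alpha=10, bagging_number=200, n_jobs=-1, solver="auto"):
    """Build a bagging ridge model like the one in the traceback."""
    # Imported lazily so estimator_keyword() works without scikit-learn.
    from sklearn.ensemble import BaggingRegressor
    from sklearn.linear_model import Ridge

    ridge = Ridge(alpha=alpha, solver=solver, random_state=123)
    # Pass the wrapped model under whichever name this release understands.
    wrapped = {estimator_keyword(BaggingRegressor): ridge}
    return BaggingRegressor(n_estimators=bagging_number,
                            bootstrap=True,
                            max_features=0.8,
                            n_jobs=n_jobs,
                            verbose=0,
                            random_state=123,
                            **wrapped)
```

If you only ever run scikit-learn 1.2 or later, simply replacing base_estimator= with estimator= in the original call should be enough; the signature check is only useful when the same code has to run on older releases too.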

varsha090597 commented 6 months ago

Hi! This is still an issue in the most recent release. The code still has to be changed in regression_models.py to fix the issue.

KenjiKamimoto-ac commented 6 months ago

Hi @iichelhadi @varsha090597

Thank you for reporting this. I have fixed the issue; it should work in the current version, celloracle>=0.17.0.
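
To check whether an environment is affected, something like the following sketch may help. It assumes only what is stated in this thread (the fix is in celloracle>=0.17.0) plus the fact that scikit-learn removed base_estimator in 1.4. The function names are hypothetical, and the version parsing is deliberately simple (leading numeric components only).

```python
import re
from importlib import metadata


def release_tuple(version):
    """Leading numeric part of a version string, padded to three fields."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    parts = tuple(int(p) for p in match.group().split("."))
    return parts + (0,) * (3 - len(parts))


def hits_base_estimator_error(celloracle_version, sklearn_version):
    """True when a pre-fix CellOracle meets a scikit-learn without base_estimator."""
    return (release_tuple(sklearn_version) >= (1, 4, 0)
            and release_tuple(celloracle_version) < (0, 17, 0))


def installed_version(distribution):
    """Installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None
```

For example, calling hits_base_estimator_error(installed_version("celloracle"), installed_version("scikit-learn")) in an environment where both packages are installed indicates whether upgrading CellOracle (or pinning scikit-learn below 1.4) is needed.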