earmingol / cell2cell

User-friendly tool to infer cell-cell interactions and communication from gene expression of interacting proteins
BSD 3-Clause "New" or "Revised" License

The way to address TypeError: Passing a set as an indexer is not supported. Use a list instead. #52

Open Pangjing-Wu opened 6 months ago

Pangjing-Wu commented 6 months ago

Hi! I ran into a TypeError caused by an invalid indexer type being passed to pandas while running the tutorial code at https://earmingol.github.io/cell2cell/tutorials/Toy-Example-SingleCellPipeline/.

My cell2cell version is 0.7.3 and Pandas version is 2.2.1.

After debugging, I traced the error to an invalid pandas indexer type at cell2cell/preprocessing/rnaseq.py:179 (the line df = tmp_rna.loc[v, :] in the function below):

def add_complexes_to_expression(rnaseq_data, complexes, agg_method='min'):
    tmp_rna = rnaseq_data.copy()
    for k, v in complexes.items():
        if all(g in tmp_rna.index for g in v):
            df = tmp_rna.loc[v, :]
            if agg_method == 'min':
                tmp_rna.loc[k] = df.min().values.tolist()
            elif agg_method == 'mean':
                tmp_rna.loc[k] = df.mean().values.tolist()
            elif agg_method == 'gmean':
                tmp_rna.loc[k] = df.apply(lambda x: np.exp(np.mean(np.log(x)))).values.tolist()
            else:
                ValueError("{} is not a valid agg_method".format(agg_method))
        else:
            tmp_rna.loc[k] = [0] * tmp_rna.shape[1]
    return tmp_rna

Here v is a set, which pandas does not support as an indexer.

Tracing v back to its source, the complexes argument, I found that around /cell2cell/preprocessing/ppi.py:370 several set operations are performed and the results are not converted back to a list or array-like type before complexes is returned. As a result, the values of complexes are sets rather than lists, which triggers the TypeError.

[!IMPORTANT] So I changed cell2cell/preprocessing/rnaseq.py:179 from df = tmp_rna.loc[v, :] to df = tmp_rna.loc[list(v), :], and it now works as expected.
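
For reference, here is a minimal sketch that reproduces the error and the workaround outside cell2cell. The toy DataFrame, gene names, and values are made up for illustration; only the .loc behaviour matters, and it assumes a pandas version that rejects set indexers (such as the 2.2.1 I am using).

```python
import pandas as pd

# Toy expression matrix: genes as rows, cell types as columns (hypothetical values).
rnaseq = pd.DataFrame(
    {"T cell": [1.0, 4.0, 2.0], "B cell": [3.0, 0.5, 2.5]},
    index=["GeneA", "GeneB", "GeneC"],
)

# Subunits of one complex stored as a set, as in the complexes dict described above.
subunits = {"GeneA", "GeneB"}

try:
    rnaseq.loc[subunits, :]  # a set used directly as the row indexer
    set_indexing_failed = False
except TypeError as err:
    set_indexing_failed = True
    print(err)

# Workaround: convert the set to a list before indexing.
df = rnaseq.loc[list(subunits), :]

# Same aggregation as agg_method='min': the minimum over subunits per column.
complex_min = df.min().values.tolist()
print(complex_min)
```

The row order produced by list(subunits) is arbitrary, but that does not affect the min, mean, or gmean aggregations, since each one reduces over all subunit rows.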

I am not sure whether this is the correct way to fix the error, so I hope you can do a thorough check. Thanks!
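
An alternative might be to convert the values to lists at the point where complexes is built, so that every consumer receives lists. Below is a rough sketch of that idea. Note that complexes_as_lists is a hypothetical helper and not a cell2cell function, and the complex names are made-up examples that use the '&' separator from the tutorial call.

```python
def complexes_as_lists(complexes):
    """Return a copy of `complexes` whose values are sorted lists instead of sets.

    Hypothetical helper for illustration; not part of cell2cell.
    """
    return {name: sorted(members) for name, members in complexes.items()}


# Made-up complexes mapping: complex name -> set of subunit gene names.
complexes = {
    "GeneA&GeneB": {"GeneA", "GeneB"},
    "GeneC&GeneD": {"GeneD", "GeneC"},
}

normalized = complexes_as_lists(complexes)
print(normalized)
```

Using sorted() rather than list() also makes the subunit order deterministic across runs, which could help if any later step depends on row order.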

The following is the complete traceback output FYI:

TypeError                                 Traceback (most recent call last)
Cell In[7], line 1
----> 1 interactions = c2c.analysis.SingleCellInteractions(rnaseq_data=adata.to_df().T,
      2                                                    ppi_data=lr_pairs,
      3                                                    metadata=meta,
      4                                                    interaction_columns=('ligand_symbol', 'receptor_symbol'),
      5                                                    communication_score='expression_thresholding',
      6                                                    expression_threshold=0.1, # values after aggregation
      7                                                    cci_score='bray_curtis',
      8                                                    cci_type='undirected',
      9                                                    aggregation_method='nn_cell_fraction',
     10                                                    barcode_col='index',
     11                                                    celltype_col='cell type',
     12                                                    complex_sep='&',
     13                                                    verbose=False)

File ~/miniconda3/envs/scvi-env2/lib/python3.9/site-packages/cell2cell/analysis/cell2cell_pipelines.py:693, in SingleCellInteractions.__init__(self, rnaseq_data, ppi_data, metadata, interaction_columns, communication_score, cci_score, cci_type, expression_threshold, aggregation_method, barcode_col, celltype_col, complex_sep, complex_agg_method, verbose)
    685 self.aggregated_expression = rnaseq.aggregate_single_cells(rnaseq_data=self.rnaseq_data,
    686                                                            metadata=self.metadata,
    687                                                            barcode_col=self.index_col,
    688                                                            celltype_col=self.group_col,
    689                                                            method=self.aggregation_method,
    690                                                            transposed=self.__adata)
    692 # Interaction Space
--> 693 self.interaction_space = initialize_interaction_space(rnaseq_data=self.aggregated_expression,
    694                                                       ppi_data=self.ppi_data,
    695                                                       cutoff_setup=self.cutoff_setup,
    696                                                       analysis_setup=self.analysis_setup,
    697                                                       complex_sep=self.complex_sep,
    698                                                       complex_agg_method=self.complex_agg_method,
    699                                                       interaction_columns=self.interaction_columns,
    700                                                       verbose=verbose)

File ~/miniconda3/envs/scvi-env2/lib/python3.9/site-packages/cell2cell/analysis/cell2cell_pipelines.py:940, in initialize_interaction_space(rnaseq_data, ppi_data, cutoff_setup, analysis_setup, excluded_cells, complex_sep, complex_agg_method, interaction_columns, verbose)
    936     excluded_cells = []
    938 included_cells = sorted(list((set(rnaseq_data.columns) - set(excluded_cells))))
--> 940 interaction_space = ispace.InteractionSpace(rnaseq_data=rnaseq_data[included_cells],
    941                                             ppi_data=ppi_data,
    942                                             gene_cutoffs=cutoff_setup,
    943                                             communication_score=analysis_setup['communication_score'],
    944                                             cci_score=analysis_setup['cci_score'],
    945                                             cci_type=analysis_setup['cci_type'],
    946                                             complex_sep=complex_sep,
    947                                             complex_agg_method=complex_agg_method,
    948                                             interaction_columns=interaction_columns,
    949                                             verbose=verbose)
    950 return interaction_space

File ~/miniconda3/envs/scvi-env2/lib/python3.9/site-packages/cell2cell/core/interaction_space.py:381, in InteractionSpace.__init__(self, rnaseq_data, ppi_data, gene_cutoffs, communication_score, cci_score, cci_type, cci_matrix_template, complex_sep, complex_agg_method, interaction_columns, verbose)
    374     self.ppi_data = self.ppi_data.assign(score=1.0)
    376 self.modified_rnaseq = integrate_data.get_modified_rnaseq(rnaseq_data=rnaseq_data,
    377                                                           cutoffs=cutoff_values,
    378                                                           communication_score=self.communication_score,
    379                                                           )
--> 381 self.interaction_elements = generate_interaction_elements(modified_rnaseq=self.modified_rnaseq,
    382                                                           ppi_data=self.ppi_data,
    383                                                           cci_matrix_template=cci_matrix_template,
    384                                                           cci_type=self.cci_type,
    385                                                           complex_sep=complex_sep,
    386                                                           complex_agg_method=complex_agg_method,
    387                                                           verbose=verbose)
    389 self.interaction_elements['ppi_score'] = self.ppi_data['score'].values

File ~/miniconda3/envs/scvi-env2/lib/python3.9/site-packages/cell2cell/core/interaction_space.py:146, in generate_interaction_elements(modified_rnaseq, ppi_data, cci_type, cci_matrix_template, complex_sep, complex_agg_method, interaction_columns, verbose)
    141 if complex_sep is not None:
    142     col_a_genes, complex_a, col_b_genes, complex_b, complexes = get_genes_from_complexes(ppi_data=ppi_data,
    143                                                                                          complex_sep=complex_sep,
    144                                                                                          interaction_columns=interaction_columns
    145                                                                                          )
--> 146     modified_rnaseq = add_complexes_to_expression(rnaseq_data=modified_rnaseq,
    147                                                   complexes=complexes,
    148                                                   agg_method=complex_agg_method
    149                                                   )
    151 # Cells
    152 cell_instances = list(modified_rnaseq.columns)  # @Erick, check if position 0 of columns contain index header.

File ~/miniconda3/envs/scvi-env2/lib/python3.9/site-packages/cell2cell/preprocessing/rnaseq.py:179, in add_complexes_to_expression(rnaseq_data, complexes, agg_method)
    177 for k, v in complexes.items():
    178     if all(g in tmp_rna.index for g in v):
--> 179         df = tmp_rna.loc[v, :]
    180         if agg_method == 'min':
    181             tmp_rna.loc[k] = df.min().values.tolist()

File ~/miniconda3/envs/scvi-env2/lib/python3.9/site-packages/pandas/core/indexing.py:1178, in _LocationIndexer.__getitem__(self, key)
   1176 @final
   1177 def __getitem__(self, key):
-> 1178     check_dict_or_set_indexers(key)
   1179     if type(key) is tuple:
   1180         key = tuple(list(x) if is_iterator(x) else x for x in key)

File ~/miniconda3/envs/scvi-env2/lib/python3.9/site-packages/pandas/core/indexing.py:2774, in check_dict_or_set_indexers(key)
   2766 """
   2767 Check if the indexer is or contains a dict or set, which is no longer allowed.
   2768 """
   2769 if (
   2770     isinstance(key, set)
   2771     or isinstance(key, tuple)
   2772     and any(isinstance(x, set) for x in key)
   2773 ):
-> 2774     raise TypeError(
   2775         "Passing a set as an indexer is not supported. Use a list instead."
   2776     )
   2778 if (
   2779     isinstance(key, dict)
   2780     or isinstance(key, tuple)
   2781     and any(isinstance(x, dict) for x in key)
   2782 ):
   2783     raise TypeError(
   2784         "Passing a dict as an indexer is not supported. Use a list instead."
   2785     )

TypeError: Passing a set as an indexer is not supported. Use a list instead.

earmingol commented 6 months ago

Hi @Pangjing-Wu

Thank you so much for the detailed report. I had tracked the issue before, but I forgot to implement the quick fix.

This was caused by newer versions of pandas, which no longer accept sets as indexers for DataFrames. I just published a new version that fixes this issue. Please update to v0.7.4 with pip install -U cell2cell.

Let me know if this works properly.