nadeemlab / SPT

Spatial profiling toolbox for spatial characterization of tumor immune microenvironment in multiplex images
https://oncopathtk.org

Bug in squidpy metrics computation #197

Closed jimmymathews closed 11 months ago

jimmymathews commented 1 year ago

There is a possibility that the number of "clusters" in the AnnData object we create and pass to squidpy functions is sometimes 1 and not 2, as expected. Here is a relevant error log:

08-24 00:28:59 [  INFO   ] spt ondemand start: Request: b'neighborhood enrichment\x1dMelanoma CyTOF ICI - measurement\x1dCD8A\x1eCD3\x1eCD45RA\x1d\x1dSOX10\x1dCD3\x1eMS4A1\x1ePECAM1\x1ePTPRC'
08-24 00:28:59 [  INFO   ] spt ondemand start: Request: b'neighborhood enrichment\x1dMelanoma CyTOF ICI - measurement\x1dCD8A\x1eCD3\x1eCD45RA\x1d\x1dSOX10\x1dCD3\x1eMS4A1\x1ePECAM1\x1ePTPRC'
08-24 00:28:59 [  DEBUG  ] spt ondemand start:149: ['Melanoma CyTOF ICI - measurement', 'CD8A\x1eCD3\x1eCD45RA', '', 'SOX10', 'CD3\x1eMS4A1\x1ePECAM1\x1ePTPRC']
08-24 00:28:59 [  DEBUG  ] spt ondemand start:149: ['Melanoma CyTOF ICI - measurement', 'CD8A\x1eCD3\x1eCD45RA', '', 'SOX10', 'CD3\x1eMS4A1\x1ePECAM1\x1ePTPRC']
08-24 00:28:59 [  DEBUG  ] ondemand.providers.pending_provider:27: Requesting computation.
08-24 00:28:59 [  DEBUG  ] ondemand.providers.squidpy_provider:64: Creating feature with specifiers: (Melanoma CyTOF ICI - measurement) ["(('CD8A', 'CD3', 'CD45RA'), ())", "(('SOX10',), ('CD3', 'MS4A1', 'PECAM1', 'PTPRC'))"]
08-24 00:28:59 [  DEBUG  ] workflow.common.export_features:302: Inserting specification 276, data_analysis_study Melanoma CyTOF ICI - ondemand computed features
08-24 00:28:59 [  DEBUG  ] workflow.common.export_features:316: Inserting specifier: ('276', "(('CD8A', 'CD3', 'CD45RA'), ())", '1')
08-24 00:28:59 [  DEBUG  ] workflow.common.export_features:316: Inserting specifier: ('276', "(('SOX10',), ('CD3', 'MS4A1', 'PECAM1', 'PTPRC'))", '2')
08-24 00:28:59 [  DEBUG  ] ondemand.providers.pending_provider:85: Number of values possible to be computed: 72
08-24 00:28:59 [  DEBUG  ] ondemand.providers.pending_provider:111: Actual number computed: 0
08-24 00:28:59 [  DEBUG  ] ondemand.providers.pending_provider:37: Not already pending.
08-24 00:28:59 [  DEBUG  ] ondemand.providers.pending_provider:39: Starting background task.
08-24 00:28:59 [  DEBUG  ] ondemand.providers.pending_provider:42: Background task just started, is pending.
08-24 00:29:00 [  DEBUG  ] ondemand.providers.squidpy_provider:158: Computed feature value of 276: Mold_14_0, 0.9999999966019605
08-24 00:29:01 [  DEBUG  ] ondemand.providers.squidpy_provider:158: Computed feature value of 276: Mold_62_0, 0.9999999999983203
/usr/local/lib/python3.11/site-packages/squidpy/gr/_nhood.py:188: RuntimeWarning: invalid value encountered in divide
  zscore = (count - perms.mean(axis=0)) / perms.std(axis=0)
08-24 00:29:02 [  DEBUG  ] ondemand.providers.squidpy_provider:158: Computed feature value of 276: Mold_48_0, 0.9997325365091907
08-24 00:29:02 [  DEBUG  ] ondemand.providers.squidpy_provider:158: Computed feature value of 276: Mold_67_0, 0.9999999965373167
08-24 00:29:02 [  DEBUG  ] workflow.common.cell_df_indexer:36: Some KeyError. (1, 1, 1)
08-24 00:29:02 [  DEBUG  ] workflow.common.cell_df_indexer:36: Some KeyError. (1, 0, 0, 0, 0)
Exception in thread Thread-16 (have_feature_computed):
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.11/threading.py", line 975, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.11/site-packages/spatialprofilingtoolbox/ondemand/providers/squidpy_provider.py", line 151, in have_feature_computed
    value = compute_squidpy_metric_for_one_sample(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/spatialprofilingtoolbox/workflow/common/squidpy.py", line 45, in compute_squidpy_metric_for_one_sample
    return _summarize_neighborhood_enrichment(_nhood_enrichment(adata))
                                              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/spatialprofilingtoolbox/workflow/common/squidpy.py", line 133, in _nhood_enrichment
    result = nhood_enrichment(adata, 'cluster', copy=True, seed=128, show_progress_bar=False)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/squidpy/gr/_nhood.py", line 174, in nhood_enrichment
    _test = _create_function(n_cls, parallel=numba_parallel)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/squidpy/gr/_nhood.py", line 86, in _create_function
    raise ValueError(f"Expected at least `2` clusters, found `{n_cls}`.")
ValueError: Expected at least `2` clusters, found `1`.

The usage that triggered this was a request for the neighborhood enrichment metric with a certain pair of phenotypes on the Moldoveanu dataset. Here is the HTTP request:

https://<apiserver>/request-spatial-metrics-computation-custom-phenotypes/?study=Melanoma%20CyTOF%20ICI&feature_class=neighborhood%20enrichment&positive_marker=CD8A&positive_marker=CD3&positive_marker=CD45RA&negative_marker=&positive_marker2=SOX10&negative_marker2=CD3&negative_marker2=MS4A1&negative_marker2=PECAM1&negative_marker2=PTPRC
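
For readers following the log, here is a minimal sketch of how the raw on-demand request line above could be unpacked. It assumes the fields are delimited by the ASCII group separator (`\x1d`) and the markers within a field by the ASCII record separator (`\x1e`), as the logged bytes suggest. `parse_request` is a hypothetical helper, not a function from SPT.

```python
GROUP_SEP = '\x1d'   # ASCII group separator, appears between fields in the logged request
RECORD_SEP = '\x1e'  # ASCII record separator, appears between markers within one field


def parse_request(raw: bytes):
    """Hypothetical helper: split a logged request into feature class, study, and marker tuples."""
    fields = raw.decode('utf-8').split(GROUP_SEP)
    feature_class, study, *marker_fields = fields
    # An empty field means "no markers", so map it to an empty tuple.
    markers = [tuple(field.split(RECORD_SEP)) if field else () for field in marker_fields]
    return feature_class, study, markers


raw = (
    b'neighborhood enrichment\x1dMelanoma CyTOF ICI - measurement'
    b'\x1dCD8A\x1eCD3\x1eCD45RA\x1d\x1dSOX10\x1dCD3\x1eMS4A1\x1ePECAM1\x1ePTPRC'
)
feature_class, study, markers = parse_request(raw)
print(feature_class, study, markers)
```

The four marker tuples correspond to the positive and negative markers of the first phenotype followed by those of the second, matching the specifiers shown in the `squidpy_provider:64` log line.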

CarlinLiao commented 1 year ago

This will happen if both of the phenotypes you pass have identical values for all cells in the sample. I've updated squidpy.py to account for this by raising a Python warning when only one cluster could be made and returning None when co_occurence errors because of this.

https://github.com/nadeemlab/SPT/blob/06a0f8528b9b82bf6d4ba42d7df477fe6c63d8de/spatialprofilingtoolbox/db/squidpy_metrics.py#L50-L61
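
As a rough illustration of the guard described above (not the code at the linked lines), the sketch below counts the distinct cluster labels in a sample and returns None with a warning when fewer than two are present. The names `count_clusters` and `metric_or_none` and the `compute` callable are hypothetical stand-ins, and the labels are plain Python values rather than an AnnData column.

```python
import warnings
from typing import Callable, Optional, Sequence


def count_clusters(labels: Sequence) -> int:
    """Number of distinct cluster labels actually present among the cells."""
    return len(set(labels))


def metric_or_none(labels: Sequence, compute: Callable[[Sequence], float]) -> Optional[float]:
    """Call `compute` only if at least two clusters are present; otherwise warn and return None."""
    n_clusters = count_clusters(labels)
    if n_clusters < 2:
        warnings.warn(f'Only {n_clusters} cluster(s) could be made; the metric is undefined.')
        return None
    return compute(labels)


# Hypothetical usage: every cell carries the same label, so no value can be computed.
print(metric_or_none(['neither', 'neither', 'neither'], lambda labels: 0.5))
```

Note that a single cluster can also arise with two distinct phenotype definitions, for example if no cell in a given sample matches either one. That reading is an assumption, not something confirmed in this thread.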

Note that the db function skips uploading a record when no co_occurence value is returned. We should consider replacing it with something like a NaN value to indicate that this value cannot be computed. Either way, this will need to be handled on the frontend.
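
To make the NaN suggestion concrete, here is a small sketch of one possible convention, assuming values are eventually sent to the frontend as JSON. The record layout and the helper names `to_record` and `to_json` are hypothetical. One caveat worth knowing: Python's `json.dumps` emits a bare `NaN` by default, which is not valid JSON, so the sketch converts NaN to `null` before serializing.

```python
import json
import math
from typing import Optional


def to_record(sample: str, value: Optional[float]) -> dict:
    """Store an uncomputable value as NaN instead of skipping the record."""
    return {'sample': sample, 'value': math.nan if value is None else value}


def to_json(records: list) -> str:
    """Serialize records, mapping NaN to null so the output is strict JSON."""
    cleaned = [
        {**record, 'value': None if math.isnan(record['value']) else record['value']}
        for record in records
    ]
    # allow_nan=False makes json.dumps raise if a NaN slips through.
    return json.dumps(cleaned, allow_nan=False)


records = [to_record('Mold_62_0', 0.9999999999983203), to_record('Mold_64_0', None)]
print(to_json(records))
```

With this convention the frontend can distinguish "not computable" (`null`) from "not yet computed" (record absent).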

jimmymathews commented 1 year ago

This bug was happening with neighborhood enrichment, not co-occurrence, and it happened when 2 distinct phenotypes were used.

jimmymathews commented 1 year ago

Also I would like to check that the production instance no longer exhibits this issue before we close.

jimmymathews commented 11 months ago

This was partly resolved by changes made since the original issue (the warning does appear), but the current behavior is still much the same. If this 1-cluster issue is encountered during computation of a feature, the error prevents any further computations from proceeding, and the feature remains permanently in a pending computation state.

jimmymathews commented 11 months ago

Fixed by issue197 71584726daf33e00956e2a37e911eb2be8febe25.

Now when this issue is encountered it is logged, None is returned, and the computation proceeds to the next sample.

...
11-16 21:52:00 [  DEBUG  ] ondemand.providers.squidpy_provider:172: Computed feature value of 2: Mold_61_0, 0.08425675935957183
11-16 21:52:01 [  DEBUG  ] ondemand.providers.squidpy_provider:172: Computed feature value of 2: Mold_62_0, 0.9999999999983203
11-16 21:52:01 [  DEBUG  ] ondemand.providers.squidpy_provider:172: Computed feature value of 2: Mold_63_0, 1.9636129412073801e-90
/usr/local/lib/python3.11/site-packages/spatialprofilingtoolbox/workflow/common/squidpy.py:158: UserWarning: All phenotypes provided had identical values. Only one cluster could be made.
  warn('All phenotypes provided had identical values. Only one cluster could be made.')
11-16 21:52:01 [  ERROR  ] workflow.common.squidpy:57: Got 1 cluster, need 2 to compute neighborhood enrichment. Presuming null.
11-16 21:52:01 [  DEBUG  ] ondemand.providers.squidpy_provider:172: Computed feature value of 2: Mold_64_0, None
11-16 21:52:01 [  DEBUG  ] ondemand.providers.squidpy_provider:172: Computed feature value of 2: Mold_65_0, 0.9993631557868359
11-16 21:52:01 [  DEBUG  ] ondemand.providers.squidpy_provider:172: Computed feature value of 2: Mold_66_0, 0.9282621463508228
...
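
The behavior shown in this log can be sketched as a per-sample loop that logs the failure, records None, and moves on, so that one bad sample no longer kills the background thread and leaves the feature pending. This is only an illustration under assumptions: `compute_all` and `fake_compute` are hypothetical, and the actual fix returns None from inside the squidpy wrapper rather than catching the exception in the loop.

```python
import logging
from typing import Callable, Dict, Iterable, Optional

logger = logging.getLogger('sketch')


def compute_all(
    samples: Iterable[str],
    compute_one: Callable[[str], float],
) -> Dict[str, Optional[float]]:
    """Compute one value per sample; a ValueError on one sample does not stop the rest."""
    values: Dict[str, Optional[float]] = {}
    for sample in samples:
        try:
            values[sample] = compute_one(sample)
        except ValueError as error:
            logger.error('Could not compute value for %s: %s Presuming null.', sample, error)
            values[sample] = None
    return values


def fake_compute(sample: str) -> float:
    """Hypothetical stand-in for the per-sample metric; fails on one sample."""
    if sample == 'Mold_64_0':
        raise ValueError('Expected at least `2` clusters, found `1`.')
    return 0.5


print(compute_all(['Mold_63_0', 'Mold_64_0', 'Mold_65_0'], fake_compute))
```

Catching only `ValueError` keeps unrelated failures visible instead of silently turning them into null values.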