angelolab / ark-analysis

Integrated pipeline for multiplexed image analysis
https://ark-analysis.readthedocs.io/en/latest/
MIT License

Lists cell counts cluster ids and pixel channel cluster ids ordered differently #986

Closed. zsfrbkv closed this issue 1 year ago

zsfrbkv commented 1 year ago

Describe the bug

When running the Pixie cell clustering notebook, the weighted cell channel expression file cannot be generated.

Expected behavior

The code runs without a problem.

To Reproduce

# depending on which pixel_cluster_col is selected, choose the pixel channel average table accordingly
if pixel_cluster_col == 'pixel_som_cluster':
    pc_chan_avg_name = pc_chan_avg_som_cluster_name
elif pixel_cluster_col == 'pixel_meta_cluster_rename': # this one is set
    pc_chan_avg_name = pc_chan_avg_meta_cluster_name

# generate the weighted cell channel expression data
pixel_channel_avg = pd.read_csv(os.path.join(base_dir, pc_chan_avg_name))
weighted_cell_channel = weighted_channel_comp.compute_p2c_weighted_channel_avg(
    pixel_channel_avg,
    channels,
    cluster_counts,
    fovs=fovs,
    pixel_cluster_col=pixel_cluster_col
)

# write the data to weighted_cell_channel_name
feather.write_dataframe(
    weighted_cell_channel,
    os.path.join(base_dir, weighted_cell_channel_name),
    compression='uncompressed'
)

This results in:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[25], line 9
      7 # generate the weighted cell channel expression data
      8 pixel_channel_avg = pd.read_csv(os.path.join(base_dir, pc_chan_avg_name))
----> 9 weighted_cell_channel = weighted_channel_comp.compute_p2c_weighted_channel_avg(
     10     pixel_channel_avg,
     11     channels,
     12     cluster_counts,
     13     fovs=fovs,
     14     pixel_cluster_col=pixel_cluster_col
     15 )
     17 # write the data to weighted_cell_channel_name
     18 feather.write_dataframe(
     19     weighted_cell_channel,
     20     os.path.join(base_dir, weighted_cell_channel_name),
     21     compression='uncompressed'
     22 )

File /omics/groups/OE0606/internal/zaira/miniconda3/envs/ark/lib/python3.8/site-packages/ark/phenotyping/weighted_channel_comp.py:101, in compute_p2c_weighted_channel_avg(pixel_channel_avg, channels, cell_counts, fovs, pixel_cluster_col)
     97 pixel_channel_cluster_ids = pixel_channel_avg_sorted[pixel_cluster_col].values
     99 # extra sanity checking, the matrix multiplication will fail otherwise
    100 # this should never fail, just as an added protection
--> 101 misc_utils.verify_same_elements(
    102     enforce_order=True,
    103     cell_counts_cluster_ids=cell_counts_cluster_ids,
    104     pixel_channel_cluster_ids=pixel_channel_cluster_ids
    105 )
    107 # assert that the channel subset provided is valid
    108 # this should never fail, just as an added protection
    109 misc_utils.verify_in_list(
    110     provided_channels=channels,
    111     pixel_channel_avg_cols=pixel_channel_avg_sorted.columns.values
    112 )

File /omics/groups/OE0606/internal/zaira/miniconda3/envs/ark/lib/python3.8/site-packages/alpineer/misc_utils.py:222, in verify_same_elements(enforce_order, warn, **kwargs)
    211     warnings.warn(
    212         err_str
    213         % (
   (...)
    219         )
    220     )
    221 else:
--> 222     raise ValueError(
    223         err_str
    224         % (
    225             list_one_name,
    226             list_two_name,
    227             list_one_cast[first_bad_index],
    228             list_two_cast[first_bad_index],
    229             first_bad_index,
    230         )
    231     )

ValueError: Lists cell counts cluster ids and pixel channel cluster ids ordered differently: values CD3+ and CD20+ do not match at index 1

cliu72 commented 1 year ago

Hello! Someone in our lab had run into this same error before and it was because there were 2 pixel clusters with the same name. Can you check if all of your pixel cluster names are unique values?
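
One way to run the uniqueness check suggested above is sketched here with pandas. This is a minimal illustration, not part of ark-analysis: the helper `find_duplicate_cluster_names` is hypothetical, the toy table only stands in for the pixel channel average CSV, and the `pixel_meta_cluster` id column and the values in it are assumptions made for the example.

```python
import pandas as pd


def find_duplicate_cluster_names(pixel_channel_avg: pd.DataFrame,
                                 pixel_cluster_col: str) -> list:
    """Return the sorted cluster names that appear on more than one row.

    Hypothetical helper for illustration; not an ark-analysis function.
    """
    names = pixel_channel_avg[pixel_cluster_col]
    # keep=False flags every occurrence of a repeated name, not just the later ones
    repeated = names[names.duplicated(keep=False)]
    return sorted(repeated.unique().tolist())


# Toy table standing in for the pixel channel average CSV (made-up values).
# Two different meta clusters were given the same name, "CD3+".
example = pd.DataFrame({
    "pixel_meta_cluster": [1, 2, 3],  # assumed id column
    "pixel_meta_cluster_rename": ["CD3+", "CD20+", "CD3+"],
    "CD3": [0.9, 0.1, 0.8],
})

duplicates = find_duplicate_cluster_names(example, "pixel_meta_cluster_rename")
if duplicates:
    print(f"Duplicate pixel cluster names: {duplicates}")
```

In the notebook, the same helper could be called on `pixel_channel_avg` with `pixel_cluster_col` before computing the weighted channel average. An empty list would mean every pixel cluster name is unique.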

zsfrbkv commented 1 year ago

Hi! Sorry for the late reply, but you are right: I had 2 pixel clusters with the same name. Could you add a check to the code so that it produces a more helpful error message in this case?

cliu72 commented 1 year ago

Yes, it is already an open issue: https://github.com/angelolab/ark-analysis/issues/972