angelolab / ark-analysis

Integrated pipeline for multiplexed image analysis
https://ark-analysis.readthedocs.io/en/latest/
MIT License
72 stars, 26 forks

3_Pixie_Cluster_Cells notebook error. Length mismatch: Expected 5101 rows, received array of length 9931 #1136

Closed by fotisn7 5 months ago

fotisn7 commented 5 months ago

Describe the bug: When generating input data for the cell SOM, the following error occurs:

ValueError                                Traceback (most recent call last)
/var/folders/tw/g5nc7jps36b5pnmnc7rrbxrh0000gn/T/ipykernel_34504/3981003560.py in ?()
      3     cluster_counts = feather.read_dataframe(os.path.join(base_dir, cluster_counts_name))
      4     cluster_counts_size_norm = feather.read_dataframe(os.path.join(base_dir, cluster_counts_size_norm_name))
      5 else:
      6     # generate the preprocessed data
----> 7     cluster_counts, cluster_counts_size_norm = cell_cluster_utils.create_c2pc_data(
      8         fovs, os.path.join(base_dir, pixel_data_dir), cell_table_path, pixel_cluster_col
      9     )
     10

~/Documents/ark-analysis/src/ark/phenotyping/cell_cluster_utils.py in ?(fovs, pixel_data_path, cell_table_path, pixel_cluster_col)
    157         ].index.values
    158     )
    159
    160     # combine the data of num_cluster_per_seg_label into cell_table_indices
--> 161     num_cluster_per_seg_label = num_cluster_per_seg_label.set_index(cell_table_indices)
    162     cell_table = cell_table.combine_first(num_cluster_per_seg_label)
    163
    164     # NaN means the cluster wasn't found in the specified fov-cell pair

/opt/anaconda3/envs/ark_env/lib/python3.11/site-packages/pandas/core/frame.py in ?(self, keys, drop, append, inplace, verify_integrity)
   6169
   6170         if len(arrays[-1]) != len(self):
   6171             # check newest element against length of calling frame, since
   6172             # ensure_index_from_sequences would not raise for append=False.
-> 6173             raise ValueError(
   6174                 f"Length mismatch: Expected {len(self)} rows, "
   6175                 f"received array of length {len(arrays[-1])}"
   6176             )

ValueError: Length mismatch: Expected 5101 rows, received array of length 9931

Expected behavior: The input data for the cell SOM is generated without error.

To reproduce: Using the example data, run the 3_Pixie_Cluster_Cells notebook.

alex-l-kong commented 5 months ago

@fotisn7 we've seen this issue pop up before when users have duplicate cell labels assigned within a FOV in the cell table. Can you check and make sure this is not the case for yours?
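One way to run the check suggested above is to look for repeated (FOV, cell label) pairs in the cell table with pandas. This is only a minimal sketch: the helper name `find_duplicate_cell_labels` is hypothetical, and the column names `fov` and `label` are assumptions about the cell table layout, so adjust them to match your file.

```python
import pandas as pd


def find_duplicate_cell_labels(cell_table, fov_col="fov", label_col="label"):
    """Return every row whose (fov, label) pair occurs more than once.

    The column names are assumptions; pass your own if they differ.
    """
    # keep=False marks all members of a duplicated group, not just the repeats
    dup_mask = cell_table.duplicated(subset=[fov_col, label_col], keep=False)
    return cell_table[dup_mask].sort_values([fov_col, label_col])


# Toy cell table (made-up values): label 1 in fov0 appears twice
toy = pd.DataFrame({
    "fov": ["fov0", "fov0", "fov0", "fov1"],
    "label": [1, 1, 2, 1],
    "cell_size": [120, 45, 98, 110],
})

dups = find_duplicate_cell_labels(toy)
print(dups)
```

If the returned frame is empty, duplicate labels are not the cause. For a real table, load it first, for example with `pd.read_csv(cell_table_path)`.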

fotisn7 commented 5 months ago

@alex-l-kong you're right. Cell table contained both whole_cell and nuclei information with the same cell labels for each FOV. After removing the nuclei information, it worked. Thank you!
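The toy example below illustrates why a cell table holding both whole-cell and nuclear rows triggers the length mismatch, and why dropping the nuclear rows resolves it. It is a sketch under stated assumptions, not the ark-analysis implementation: the `compartment` column and the helper `try_align` are hypothetical, and only the `set_index` call mirrors the line shown in the traceback.

```python
import pandas as pd

# Toy cell table with a whole-cell row and a nuclear row for each cell.
# "compartment" is a hypothetical column name used only for illustration.
cell_table = pd.DataFrame({
    "fov": ["fov0"] * 4,
    "label": [1, 1, 2, 2],
    "compartment": ["whole_cell", "nuclear", "whole_cell", "nuclear"],
})

# Pixel cluster counts: one row per unique (fov, label) pair
counts = pd.DataFrame({
    "fov": ["fov0", "fov0"],
    "label": [1, 2],
    "pixel_cluster_1": [10, 4],
})


def try_align(table, counts):
    """Reindex counts with the cell table's row indices (hypothetical helper).

    Returns the reindexed frame, or the error message if the lengths differ.
    """
    indices = table.index.values
    try:
        return counts.set_index(indices)
    except ValueError as err:
        return str(err)


# Four cell table rows against two count rows: set_index raises ValueError
failed = try_align(cell_table, counts)
print(failed)

# Keeping only whole-cell rows leaves one row per (fov, label) pair
whole_cell_only = cell_table[cell_table["compartment"] == "whole_cell"]
aligned = try_align(whole_cell_only, counts)
print(aligned)
```

In the reported case, 9931 matching cell table rows were being assigned to 5101 count rows, which is consistent with most cells appearing twice in the table.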