SydneyBioX / BIDCell

Biologically-informed deep learning for cell segmentation of subcellular spatial transcriptomics data

Segmenting DAPI using Cellpose 'nuclei' setting #10

Closed: Ceyhun-Alar closed this issue 5 months ago

Ceyhun-Alar commented 9 months ago

Hi, is there a reason why you use model_type 'cyto' for segmenting nuclei in DAPI images?
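For reference, a minimal sketch of how the two presets are selected through the Cellpose API, assuming a single-channel DAPI image (the file name and diameter are placeholders, not BIDCell's actual call):

```python
from cellpose import models
import tifffile

# Placeholder input: a single-channel DAPI image
dapi = tifffile.imread("dapi.tif")

# BIDCell uses model_type="cyto"; "nuclei" is the nucleus-specific preset
model = models.Cellpose(model_type="nuclei")

# channels=[0, 0] treats the image as grayscale;
# diameter=None lets Cellpose estimate the object diameter
masks, flows, styles, diams = model.eval(dapi, diameter=None, channels=[0, 0])
print(f"Segmented {masks.max()} nuclei")
```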

Also, when changing the model_type to 'nuclei', it errors at model.train() with:

```
2024-02-19 10:21:22,440 INFO Initialising model
Number of genes: 117
2024-02-19 10:21:23,135 INFO Preparing data
Loaded nuclei (5797, 3642)
4367 patches available
Cell types: ["CT1", "CT2", "CT3", "CT4", "CT5"]
2024-02-19 10:21:23,521 INFO Total number of training examples: 4367
2024-02-19 10:21:23,524 INFO Begin training

Epoch = 1  lr = 1e-05
2024-02-19 10:21:23,740 INFO Model saved: /mnt/data/tool/vangnet/R1/model_outputs/2024_02_19_10_21_22/models/epoch_1_step_0.pth

ValueError                                Traceback (most recent call last)
Cell In[9], line 1
----> 1 model.train()

File /mnt/data/miniconda3/envs/bidcell/lib/python3.10/site-packages/bidcell/BIDCellModel.py:114, in BIDCellModel.train(self)
    111 def train(self) -> None:
    112     """Train the model.
    113     """
--> 114     train(self.config)

File /mnt/data/miniconda3/envs/bidcell/lib/python3.10/site-packages/bidcell/model/train.py:170, in train(config)
    167 cur_lr = optimizer.param_groups[0]["lr"]
    168 print("\nEpoch =", (epoch + 1), " lr =", cur_lr)
--> 170 for step_epoch, (
    171     batch_x313,
    172     batch_n,
    173     batch_sa,
    174     batch_pos,
    175     batch_neg,
    176     coords_h1,
    177     coords_w1,
    178     nucl_aug,
    179     expr_aug_sum,
    180 ) in enumerate(train_loader):
    181     # Permute channels axis to batch axis
    182     # torch.Size([1, patch_size, patch_size, 313, n_cells]) to [n_cells, 313, patch_size, patch_size]
    183     batch_x313 = batch_x313[0, :, :, :, :].permute(3, 2, 0, 1)
    184     batch_sa = batch_sa.permute(3, 0, 1, 2)

File /mnt/data/miniconda3/envs/bidcell/lib/python3.10/site-packages/torch/utils/data/dataloader.py:630, in _BaseDataLoaderIter.__next__(self)
    627 if self._sampler_iter is None:
    628     # TODO(https://github.com/pytorch/pytorch/issues/76750)
    629     self._reset()  # type: ignore[call-arg]
--> 630 data = self._next_data()
    631 self._num_yielded += 1
    632 if self._dataset_kind == _DatasetKind.Iterable and \
    633         self._IterableDataset_len_called is not None and \
    634         self._num_yielded > self._IterableDataset_len_called:

File /mnt/data/miniconda3/envs/bidcell/lib/python3.10/site-packages/torch/utils/data/dataloader.py:674, in _SingleProcessDataLoaderIter._next_data(self)
    672 def _next_data(self):
    673     index = self._next_index()  # may raise StopIteration
--> 674     data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
    675     if self._pin_memory:
    676         data = _utils.pin_memory.pin_memory(data, self._pin_memory_device)

File /mnt/data/miniconda3/envs/bidcell/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py:51, in _MapDatasetFetcher.fetch(self, possibly_batched_index)
     49     data = self.dataset.__getitems__(possibly_batched_index)
     50 else:
---> 51     data = [self.dataset[idx] for idx in possibly_batched_index]
     52 else:
     53     data = self.dataset[possibly_batched_index]

File /mnt/data/miniconda3/envs/bidcell/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py:51, in <listcomp>(.0)
     49     data = self.dataset.__getitems__(possibly_batched_index)
     50 else:
---> 51     data = [self.dataset[idx] for idx in possibly_batched_index]
     52 else:
     53     data = self.dataset[possibly_batched_index]

File /mnt/data/miniconda3/envs/bidcell/lib/python3.10/site-packages/bidcell/model/dataio/dataset_input.py:297, in DataProcessing.__getitem__(self, index)
    292 except Exception:
    293     search_areas[:, :, i_cell] = cv2.dilate(
    294         nucl_split[:, :, i_cell], kernel, iterations=1
    295     )
--> 297 ct_nucleus = int(self.nuclei_types_idx[self.nuclei_types_ids.index(c_id)])
    298 ct_nucleus_name = self.type_names[ct_nucleus]
    300 # Markers with dilation
    301 # ct_pos = np.expand_dims(np.expand_dims(self.pos_markers[ct_nucleus,:], 0),0)*expr_aug

ValueError: 10396.0 is not in list
```
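The failing call is a plain list.index lookup, so the exception means the nucleus ID encountered while loading a training patch has no matching entry in the list of IDs with cell-type annotations. A minimal illustration with made-up values:

```python
# Made-up values to illustrate the failure mode in dataset_input.py
nuclei_types_ids = [101.0, 102.0, 103.0]  # IDs that have cell-type annotations
c_id = 10396.0                            # ID read from a training patch

ct_nucleus = nuclei_types_ids.index(c_id)  # raises ValueError: 10396.0 is not in list
```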

xhelenfu commented 8 months ago

Hi, we used the cyto model because it gave better nuclei segmentations on the data we used.

In terms of the error, it seems like it is trying to look up cell type 10396, which is a suspiciously high number for a cell type ID.
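If you want to check, here is a quick sketch for finding which nucleus IDs are missing from the cell-type annotations; the file and column names are placeholders for your own run's outputs:

```python
import numpy as np
import pandas as pd
import tifffile

# Hypothetical file and column names -- substitute the paths from your own run
nuclei = tifffile.imread("nuclei.tif")        # labelled nuclei image used for training
types = pd.read_csv("nuclei_cell_types.csv")  # per-nucleus cell-type assignments

ids_in_image = set(np.unique(nuclei).astype(int)) - {0}  # 0 is background
ids_in_table = set(types["cell_id"].astype(int))         # "cell_id" is an assumed column name

missing = sorted(ids_in_image - ids_in_table)
print(f"{len(missing)} nucleus IDs lack a cell-type entry; first few: {missing[:5]}")
```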