Cannot reproduce results as per article

Mgryn commented 1 year ago

Hi @dakomura, I would like to reproduce the results presented in the article, but I am having trouble doing so. For example, my current best val_dice_epoch value for plasma cells is 0.27, which is much lower than the test-stage result reported in the article. I may be doing something wrong, so I would appreciate it if you could answer some of my questions:

1. I am working with the data downloaded from Zenodo. Do I understand correctly that when running 6.train_segmentation_models I should just use the HE images and masks as they are, or should I preprocess them?
2. Could you share more information about the parameters of the best-performing models for each cell type: plasma, myeloid, leukocytes, lymphocytes and endothelial? The supplementary information only mentions model architectures and decoders.
3. Could you share the code used for the testing stage?
4. Could you tell me which versions of PyTorch Lightning, kornia and torchmetrics you have in your environment? After cloning the repository, I was not able to run it with PL 1.7.4 and kornia 0.6.7 because stochastic_weight_avg has been removed from the Trainer constructor, PyTorchLightningPruningCallback caused problems, and kornia.augmentation has no GaussianBlur. I fixed the errors, but I noticed that in my version of torchmetrics (0.9.3) the ignore_index parameter is not used.
5. Isn't the ignore_index parameter unnecessary when passed to merged_Intersection and merged_Union? Neither _intersection_from_confmat nor _union_from_confmat takes it as an argument, and based on the union calculation, confmat[0, 1] is actually necessary.
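
To make the last question concrete, here is a minimal sketch of how intersection and union fall out of a binary confusion matrix. It is not the repository's code: the function name is hypothetical, and it assumes the common convention that confmat[i][j] counts pixels of target class i predicted as class j.

```python
def intersection_and_union(confmat):
    """Hypothetical helper, not the repository's _intersection_from_confmat.

    Assumes confmat[i][j] = number of pixels whose target class is i
    and whose predicted class is j (0 = background, 1 = foreground).
    """
    tp = confmat[1][1]  # foreground predicted as foreground
    fp = confmat[0][1]  # background predicted as foreground
    fn = confmat[1][0]  # foreground predicted as background
    return tp, tp + fp + fn


# Made-up counts, for illustration only.
confmat = [[90, 4],
           [2, 4]]

inter, union = intersection_and_union(confmat)
iou = inter / union                   # TP / (TP + FP + FN)
dice = 2 * inter / (inter + union)    # 2TP / (2TP + FP + FN)
print(inter, union, iou, dice)
```

Under that convention, confmat[0][1] holds the false positives, so dropping it would shrink the union and inflate both scores.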

Thank you in advance for help in obtaining better results.

dakomura commented 1 year ago

Hi @Mgryn

Regarding the low Dice score for plasma cells, we are experiencing similar values on our end. The Dice coefficient in Figure 6 of the paper is calculated at the object level, not the pixel level. This means that even a single pixel overlap of contiguous regions is considered correct, hence the higher score. We chose this approach because it would be too cumbersome for pathologists to select nuclei at the pixel level. For pixel-level Dice scores, see Fig. 5, where the score is close to 0.27 (for test data).
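
The gap between the two kinds of score can be illustrated with a small sketch. This is not the evaluation code from the paper: the function names, the 4-connectivity, and the exact object-level formula (matched objects divided by all objects) are assumptions made for illustration. It only shows why a single-pixel overlap can give a high object-level score and a low pixel-level one.

```python
from collections import deque


def pixel_dice(pred, target):
    """Pixel-level Dice over binary masks given as lists of 0/1 rows."""
    inter = sum(p * t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, target))
    return 2 * inter / total if total else 1.0


def label_objects(mask):
    """Return contiguous (4-connected) regions as sets of (row, col)."""
    seen, objects = set(), []
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if not value or (r, c) in seen:
                continue
            queue, region = deque([(r, c)]), set()
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                region.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < len(mask) and 0 <= nx < len(mask[0])
                    if inside and mask[ny][nx] and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            objects.append(region)
    return objects


def object_dice(pred, target):
    """Assumed object-level Dice: an object counts as matched if it
    shares at least one pixel with any object in the other mask."""
    pred_objs, tgt_objs = label_objects(pred), label_objects(target)
    pred_pixels = set().union(*pred_objs)
    tgt_pixels = set().union(*tgt_objs)
    matched = sum(1 for o in pred_objs if o & tgt_pixels)
    matched += sum(1 for o in tgt_objs if o & pred_pixels)
    total = len(pred_objs) + len(tgt_objs)
    return matched / total if total else 1.0


# One target nucleus and one predicted nucleus sharing a single pixel.
target = [[1, 1, 0, 0],
          [1, 1, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
pred = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]

print(pixel_dice(pred, target))   # 0.25
print(object_dice(pred, target))  # 1.0
```

In this toy case the prediction is shifted by one pixel in each direction, so only one of four pixels overlaps, yet the object is still counted as found.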

Here are the answers to your questions.

1. No preprocessing is required.
2. We have attached a file (rank1_params.xlsx) containing the hyperparameters used to train the rank 1 model in Table S5.
3. We plan to release the test code in the near future.
4. The versions we used are PyTorch Lightning (PL) 1.4.2, Kornia 0.5.1, and TorchMetrics 0.5.0.
5. Upon verification, you are correct. The ignore_index was unnecessary.
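
For anyone comparing their environment against the versions listed above, a minimal sketch using only the standard library is shown below. The helper name check_versions is hypothetical, and the distribution names pytorch-lightning, kornia, and torchmetrics are assumed to be the names the packages were installed under.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported in this thread; distribution names are assumptions.
REPORTED = {
    "pytorch-lightning": "1.4.2",
    "kornia": "0.5.1",
    "torchmetrics": "0.5.0",
}


def check_versions(expected, get_version=version):
    """Map each package to (expected, installed, matches).

    installed is None when the package is not found.
    """
    report = {}
    for name, want in expected.items():
        try:
            have = get_version(name)
        except PackageNotFoundError:
            have = None
        report[name] = (want, have, have == want)
    return report


for name, (want, have, ok) in check_versions(REPORTED).items():
    status = "OK" if ok else "MISMATCH"
    print(f"{name}: expected {want}, installed {have} [{status}]")
```

A mismatch does not necessarily mean the code will fail, but the API changes mentioned earlier in the thread (for example the removed stochastic_weight_avg argument) suggest that the older versions are the safer starting point.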

Mgryn commented 1 year ago

@dakomura Thank you very much for your answers and for providing the hyperparameters.

dakomura / SegPath_code, issue #1