Inference on Cytodark Dataset: Cell Mismatch Issue with Default Parameters

digitalpathologybern / hover_next_inference

Inference code for HoVer-NeXt

GNU General Public License v3.0


Inference on Cytodark Dataset: Cell Mismatch Issue with Default Parameters #14

Open shubhankar-git opened 1 week ago

shubhankar-git commented 1 week ago

Original image: [image: ID10_Aud_Cortex_Tursiops NISSL]

Predicted: [image: ID10_Aud_Cortex_Tursiops Predicted]

Ground truth: [image: ID10_Aud_Cortex_Tursiops Ground Truth]

I ran inference on the Cytodark dataset using the default parameters, but I noticed that a significant number of cells were mismatched between the predicted and ground truth results. I have attached the original Nissl image, the predicted output, and the ground truth image for reference.

Could you recommend hyperparameters or adjustments that can be applied to improve the cell segmentation results on this dataset? Any advice on tuning the model for better accuracy would be greatly appreciated.

eliasbaumann commented 1 week ago

Hey @shubhankar-git, thanks for opening this issue and reaching out! HoVer-NeXt was not trained on cytology data, so it's not validated for this kind of data. However, you could do a number of things:

- I could not tell from your question whether you also trained the model on this dataset. If not, I would highly recommend doing that. Then you should optimize the hyperparameters using the hyperparameter search script. Both training and hp-search are here: https://github.com/digitalpathologybern/hover_next_train
- We use size-based object removal in the post-processing. You can adjust those thresholds by changing the constants in https://github.com/digitalpathologybern/hover_next_inference/blob/main/src/constants.py.

If I misunderstood what you are trying to do, please let me know as well!

shubhankar-git commented 1 week ago

Hey @eliasbaumann, thanks for the response and the suggestions!

To clarify, I did not train the model on the Cytodark dataset; I used the pretrained lizard_convnextv2_large model for inference. While I am not planning to train the model myself, I will definitely try the hyperparameter tuning and post-processing adjustments you mentioned.

I'll look into the changes in the constants file and also experiment with the hyperparameter search script for optimization. I’ll keep you posted on how it goes and get back to you if I run into any issues.

Thanks again for your help!

eliasbaumann commented 1 week ago

You could start by setting MAX_THRESHS_LIZARD to very large numbers, e.g. 100000. Even then, the model will always try to find nuclei rather than cells, whereas in the dataset you are using the entire cell is annotated.
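
To illustrate why a very large upper size threshold matters, here is a minimal sketch of size-based instance filtering. It is not the repository's post-processing code: the function names, the toy label map, and every threshold value other than 100000 are assumptions made for this example, and plain Python lists are used only to keep it dependency-free.

```python
from collections import Counter


def filter_instances_by_area(inst_map, min_area, max_area):
    """Set to 0 (background) every instance whose pixel count is outside
    [min_area, max_area]. `inst_map` is a 2D list of integer labels, 0 = background.
    Hypothetical helper, not taken from hover_next_inference."""
    # Count the pixels belonging to each non-background label.
    areas = Counter(label for row in inst_map for label in row if label != 0)
    keep = {label for label, area in areas.items() if min_area <= area <= max_area}
    return [[label if label in keep else 0 for label in row] for row in inst_map]


def make_toy_map():
    """Build an 8x8 label map with a 4 px, a 25 px, and a 1 px instance."""
    grid = [[0] * 8 for _ in range(8)]
    for r in range(0, 2):
        for c in range(0, 2):
            grid[r][c] = 1  # small object, 4 px
    for r in range(3, 8):
        for c in range(3, 8):
            grid[r][c] = 2  # large object, 25 px (stands in for a whole cell)
    grid[0][7] = 3  # single-pixel speck
    return grid


def remaining_labels(inst_map):
    """Return the sorted non-background labels still present in the map."""
    return sorted({label for row in inst_map for label in row if label != 0})


toy = make_toy_map()
# A tight upper bound (assumed value) discards the large object.
strict = filter_instances_by_area(toy, min_area=2, max_area=10)
# A very large upper bound, as suggested above, keeps it.
relaxed = filter_instances_by_area(toy, min_area=2, max_area=100000)

print(remaining_labels(strict))   # [1]
print(remaining_labels(relaxed))  # [1, 2]
```

Raising the upper bound only stops large predictions from being discarded; it does not change what the network segments, so nucleus-sized predictions would still not match whole-cell annotations.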