simongraham / hovernet_inference

HoVer-Net inference code for simultaneous nuclear segmentation and classification

Interpret the hover-net result .npz file #22

Closed az7jh2 closed 4 years ago

az7jh2 commented 4 years ago

@simongraham Hi. I tried HoVer-Net inference on some patient WSIs. The code runs successfully and produces a .npz file as the result, but I don't know how to interpret it or how to overlay the predictions on the original WSI images.

For example, I tested one WSI whose level 0 dimensions are 43774 × 129274 (height × width). I ran HoVer-Net with the suggested command:

python run.py --gpu='0' --mode='wsi' --model='hovernet.npz' --input_dir='wsi_dir' --output_dir='output' --return_masks

The resulting .npz file contains 3 arrays. The mask array has shape (118445,), and each element of it is a matrix with shape (11, 7). There are also 118,445 centroids and predicted types. I'm confused by these arrays and don't know how to map them back to the original level 0 dimensions.

I found few instructions about this. Maybe it's an easy job for you, but please forgive me, as I'm clinical staff, not a computer scientist.

simongraham commented 4 years ago

Hi @az7jh2 ,

As you have already mentioned, the wsi mode returns a 'npz' file which contains the cropped masks, centroid coordinates and the predicted nuclei types. Note, you mentioned that the masks have shape (11,7), but in fact they should all be of different shapes, because we save the nuclei masks cropped at the bounding box. We saved the results in this way because the results are then easy to use for downstream analysis. I can add functionality so that a segmentation overlay is generated per tile. Please let me know if this is something that will be useful.
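For readers who want to inspect such a result file first, here is a minimal sketch in Python with NumPy. It does not assume particular array names: it lists whatever names the file contains and reports the number of entries and the shape of the first entry. The function name `summarise_npz` is hypothetical and is not part of the repository.

```python
import numpy as np

def summarise_npz(path):
    """Return {array_name: (number_of_entries, shape_of_first_entry)}."""
    summary = {}
    # allow_pickle=True is required when an array stores variable-shaped
    # masks as Python objects (dtype=object), as cropped masks would be.
    with np.load(path, allow_pickle=True) as data:
        for name in data.files:
            arr = data[name]
            if arr.ndim == 0 or len(arr) == 0:
                summary[name] = (0, None)
                continue
            summary[name] = (len(arr), np.asarray(arr[0]).shape)
    return summary
```

Calling `summarise_npz("output/result.npz")` (the path is a placeholder) would show, for each array, how many nuclei it holds and what the first entry looks like. Checking several masks rather than only the first should confirm that their shapes differ.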

az7jh2 commented 4 years ago

@simongraham Yes, you are right. I checked the mask shapes and noticed that these masks have different shapes.

But I still can't figure out how to match the coordinates of one nucleus with the original WSI image. I guess I can get the row and column indices of the nucleus in the WSI image from the centroid (x, y) and the mask. I don't know what the centroid means, and it's a float (so it can't be an index).

I noticed some py files in the folder, such as viz_utils.py. I guess they may help with overlaying the segmentation on the WSI image, but I don't know how to call the functions.

I find there are few instructions about the possible usage of the centroid and mask. I think it would be helpful for clinical users if you could enrich the tutorial for the results. If you could provide some visualization functions together with their instructions, that would be even better!

simongraham commented 4 years ago

@az7jh2 thanks for the feedback, this is very useful.

I will add an option for displaying the overlay for each tile in wsi mode, and I will also create a small jupyter notebook on how to use the npz file results in downstream analysis. Note, the reason that these centroids are floats is that they have been calculated by taking the average of the (x,y) boundary coordinates for each nucleus. These can simply be rounded, but I will change this in the code so that they are integers, to avoid confusion. I will make these changes within the next few days.
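As a small illustration of the rounding mentioned above, the sketch below converts float centroids to integer pixel coordinates with NumPy. It assumes the centroid order is (x, y), as described in this thread; the helper name `centroids_to_pixel_indices` is hypothetical.

```python
import numpy as np

def centroids_to_pixel_indices(centroids):
    """Round float (x, y) centroids to the nearest integer pixel coordinates."""
    return np.rint(np.asarray(centroids, dtype=float)).astype(np.int64)

# The centroid value below is the one quoted later in this thread.
xy = centroids_to_pixel_indices([[16825.65957447, 10160.89361702]])
x, y = xy[0]
# Assuming (x, y) order: x is the column and y is the row, so a NumPy
# image array would be indexed as image[y, x].
print(x, y)  # 16826 10161
```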

Out of interest, what institution are you from? 😊

az7jh2 commented 4 years ago

@simongraham thank you very much for your kind help.

But I still have some questions about the float centroids. For example, the first segmented nucleus has centroid [16825.65957447, 10160.89361702] and a mask with shape (11, 7). So I guess the x coordinate of the top-left point of the bounding box is 16825.65957447 - 11/2 = 16820.1595, which is not very close to 16820. I'm not sure whether the offset 0.1595 is meaningful in image processing. I just wonder: if you calculate the centroid by taking the average of the (x,y) boundary coordinates, then when I calculate the (x,y) boundary coordinates in reverse, the result should be very close to integers.

I am now at the School of Public Health, Yale University, studying statistics, and I'm not trained in computer science, so you can see why I can't fully understand the computer science terminology. One of my collaborators is a pathologist at a hospital, and he wants to evaluate the accuracy of nuclei segmentation and classification on their own WSIs. So he asked me to help him set up HoVer-Net on the server and then show him the results. But I can't figure out how to generate the overlay to show him. I really appreciate that you can add the display function. It's very helpful! :joy:

simongraham commented 4 years ago

@az7jh2, the centroid is not the centre of the bounding box. It is calculated from the moments of the boundary coordinates: that is, the mean of all of the x-coordinates and the mean of all of the y-coordinates.
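To make the difference concrete, here is a hedged NumPy sketch that compares the mean of a mask's boundary pixels with the centre of its bounding box. It is a stand-in for illustration only: the boundary is taken as mask pixels touching the background in the 4-neighbourhood, which may not match the contour extraction used in the repository, and all function names are hypothetical. Note also that a NumPy shape is (rows, columns), i.e. (height, width), so for a mask of shape (11, 7) the extent along x is 7.

```python
import numpy as np

def boundary_pixels(mask):
    """Return (row, col) coordinates of mask pixels that touch the background."""
    m = np.asarray(mask, dtype=bool)
    p = np.pad(m, 1, constant_values=False)
    # A pixel is interior when all four direct neighbours are inside the mask.
    interior = p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    return np.argwhere(m & ~interior)

def boundary_centroid_xy(mask):
    """Mean of the boundary coordinates, returned as (x, y) within the crop."""
    row_mean, col_mean = boundary_pixels(mask).mean(axis=0)
    return float(col_mean), float(row_mean)

def bbox_centre_xy(mask):
    """Centre of the cropped bounding box, returned as (x, y) within the crop."""
    height, width = np.asarray(mask).shape
    return (width - 1) / 2, (height - 1) / 2

# An asymmetric toy nucleus: the two measures disagree.
toy = np.array([[1, 0, 0],
                [1, 1, 1]])
print(boundary_centroid_xy(toy))  # (0.75, 0.75)
print(bbox_centre_xy(toy))        # (1.0, 0.5)
```

For a symmetric shape the two values coincide, but for irregular nuclei they generally do not, which is one plausible reason why subtracting half the mask size from the centroid does not land on an integer.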

"I am now in school of public health, Yale university to learning statistics"

Okay good to know ☺️. I will add some functionality this weekend and will notify on here when it is complete.

az7jh2 commented 4 years ago

@simongraham I look forward to your new functionality.

And I guess I also need to learn some terms, such as the bounding box, in order to fully understand the segmentation...

simongraham commented 4 years ago

Hi @az7jh2 ,

I have added a jupyter notebook that should help you with the usage of the code. The overlay is restricted to roi mode for now, but there is a class at the end of the notebook that you may use to pre-extract tiles. You can then process each tile in roi mode and get the overlay.
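As a rough idea of what pre-extracting tiles involves, the sketch below only computes a grid of tile positions covering a slide of a given size. It is not the class from the notebook: the function name `tile_grid` and the tile size are hypothetical, and reading the actual pixels would need a slide-reading library, which is left out here.

```python
def tile_grid(height, width, tile_size):
    """Return (x, y, tile_width, tile_height) tuples covering the whole slide.

    Tiles on the right and bottom edges are clipped to the slide boundary.
    """
    tiles = []
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            tiles.append((x, y,
                          min(tile_size, width - x),
                          min(tile_size, height - y)))
    return tiles

# Level 0 dimensions quoted earlier in this thread; 10000 is an arbitrary
# tile size chosen for illustration.
tiles = tile_grid(height=43774, width=129274, tile_size=10000)
print(len(tiles))  # 65
```

Each tuple gives the top-left corner of a tile in level 0 coordinates, so a result from processing one tile can be shifted back into slide coordinates by adding that tile's x and y.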