About proc_kumar_ann.py

vqdang / hover_net

Simultaneous Nuclear Instance Segmentation and Classification in H&E Histology Images.

MIT License

About proc_kumar_ann.py #14

Closed GekFreeman closed 4 years ago

GekFreeman commented 4 years ago

Hi. Thank you for the awesome code. I am having trouble understanding the code on line 63 of proc_kumar_ann.py:

for idx, inst_map in enumerate(insts_list):
    ann[inst_map > 0] = idx + 1

Why are the values of the annotation set to the index into insts_list? Doesn't this lose the segmentation information?

vqdang commented 4 years ago

Hi, you have to read the whole portion carefully to understand that, so please check back:

https://github.com/vqdang/hover_net/blob/d743e633ed59e588af6113cae185d4db589b4368/src/misc/proc_kumar_ann.py#L39
https://github.com/vqdang/hover_net/blob/d743e633ed59e588af6113cae185d4db589b4368/src/misc/proc_kumar_ann.py#L51

Essentially, by line 52 insts_list is a list of N images of size HxW, each of which contains a single nucleus with a non-zero ID. Then

https://github.com/vqdang/hover_net/blob/d743e633ed59e588af6113cae185d4db589b4368/src/misc/proc_kumar_ann.py#L62-L65

simply combines the NxHxW stack into an HxW ann, i.e. it flattens insts_list down. Each inst_map is, again, an image of size HxW that contains a single nucleus with a non-zero ID. So, when making ann, we assign the nucleus in each inst_map a new ID, given by idx + 1.

So, given the explanation above, I don't think the loss you describe ("Doesn't this lose the segmentation information?") would happen.
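The flattening described above can be illustrated with a minimal sketch. The toy masks below are made up for illustration (in proc_kumar_ann.py they are built from the XML annotations), and the sketch assumes the nuclei do not overlap. It shows that every nucleus can be recovered from ann by its ID:

```python
import numpy as np

# Hypothetical toy data: three 4x4 masks, one nucleus per mask.
# Row/column ranges are invented for this example and do not overlap.
h, w = 4, 4
boxes = [((0, 2), (0, 2)), ((0, 2), (2, 4)), ((2, 4), (1, 3))]
insts_list = []
for (r0, r1), (c0, c1) in boxes:
    inst_map = np.zeros((h, w), dtype=np.uint8)
    inst_map[r0:r1, c0:c1] = 1  # non-zero pixels mark the single nucleus
    insts_list.append(inst_map)

# Flatten the N x H x W stack into one H x W map; 0 stays background.
ann = np.zeros((h, w), dtype=np.int32)
for idx, inst_map in enumerate(insts_list):
    ann[inst_map > 0] = idx + 1

# Each nucleus is recoverable as the set of pixels carrying its ID.
recovered = [ann == inst_id for inst_id in range(1, len(insts_list) + 1)]
print(ann)
```

Note that if two masks did overlap, the later one in the list would overwrite the earlier one at the shared pixels, so exact recovery relies on the non-overlap assumption stated above.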

simongraham commented 4 years ago

Also, if your intention with this question was to convert the XML Kumar labels to a 2D array, you can also use the labels that we have organised in the Google Drive folder below:

https://drive.google.com/drive/folders/1l55cv3DuY-f7-JotDN7N5nbNnjbLWchK

Here, for CPM and Kumar, the labels are stored in an array of size HxW where each nucleus is given a unique value, i.e. the nuclei are labelled from 1 to N, where N is the number of nuclei in the image and 0 is the background. For CoNSeP, the labels are stored in an array of size HxWx2. The first channel is the same as described above and the second channel indicates the type of nucleus. For a more in-depth description, look at the README file when downloading the CoNSeP dataset from this page.
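As a rough sketch of how such a label array might be read, the snippet below builds a tiny HxWx2 array by hand and splits it into its two channels. The values, the type codes, and the assumption that every pixel of a nucleus carries the same type are all hypothetical; the actual files and their on-disk format are described in the dataset README, not here.

```python
import numpy as np

# Hypothetical 3x3 example in the HxWx2 layout described above.
inst = np.array([[1, 1, 0],
                 [0, 2, 2],
                 [0, 0, 2]])   # channel 0: instance IDs, 0 = background
types = np.array([[3, 3, 0],
                  [0, 1, 1],
                  [0, 0, 1]])  # channel 1: made-up type codes
label = np.stack([inst, types], axis=-1)  # shape (H, W, 2)

inst_map = label[..., 0]
type_map = label[..., 1]

# Non-zero unique values are the nucleus IDs.
inst_ids = np.unique(inst_map)
inst_ids = inst_ids[inst_ids > 0]
n_nuclei = len(inst_ids)

# Assumes one type per nucleus, so the first pixel's type is representative.
inst_types = {int(i): int(type_map[inst_map == i][0]) for i in inst_ids}
print(n_nuclei, inst_types)
```

For the HxW arrays (CPM and Kumar), only the inst_map part applies, since there is no type channel.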

GekFreeman commented 4 years ago

Thanks for your timely explanation! I have figured out why this preserves the segmentation information!

GekFreeman commented 4 years ago

This link is very useful, thanks so much!

simongraham commented 4 years ago

We are now closing this issue. Thanks for using our code and let us know if you have any further questions 👍