Kohulan / DECIMER-Image-Segmentation

Chemical structure detection and segmentation tool for Journal articles.
https://decimer.ai
MIT License

IndexError: index 0 is out of bounds for axis 0 with size 0 #30

Closed rookiexiao123 closed 2 years ago

rookiexiao123 commented 3 years ago
The following line raises "IndexError: index 0 is out of bounds for axis 0 with size 0":

    x_center = np.where(mask_array[y_center] == True)[0][0]

It is in the find_mask_center() function in complete_structure.py:

    if mask_array[y_center, x_center]:
        return x_center, y_center
    else:
        # If the global mask center is not placed in the mask, take the center on the x-axis and the first-best y-coordinate that lies in the mask
        x_center = np.where(mask_array[y_center] == True)[0][0]
        return x_center, y_center

The check mask_array[y_center, x_center] == True fails, and then np.where(mask_array[y_center] == True)[0][0] fails as well, because every value in mask_array[y_center] is False. What should I do about this?

OBrink commented 3 years ago

@rookiexiao123 Do you have a specific example where this error occurs? Are you using DECIMER Segmentation from the command line or are you importing the module? Can you inspect the mask returned by the model as described in this notebook? What does the mask that you are trying to expand look like? I recently had a similar issue when I ran DECIMER Segmentation in an environment with Tensorflow 2.5, which resulted in nonsensical output from the model. Can you confirm that you are running it with Tensorflow 2.3?

rookiexiao123 commented 3 years ago

Thank you. I had the issue when I ran the mask expansion. I use another segmentation algorithm to get the masks and then apply your mask expansion algorithm. I now use PyTorch. The masks look normal; it just can't find the center point. It's strange that this happens for only a very few images, while the others are fine.

rookiexiao123 commented 3 years ago

[image]
With a mask like this, the mask expansion fails. I suspect that a single mask contains several separate pieces, which causes this error. Do you have any suggestions?

OBrink commented 3 years ago

Ok. I think I see the problem here. If I interpret the colour coding right, your model sometimes returns multiple "blobs" as one instance. For example, the green and blue mask visualisations in your image each cover multiple objects. If you want to expand the masks from there, the question is where to start.

Our code simply defines a point between highest and lowest x- and y-values (==True) as the mask center. For the example of the blue mask in your image, that point might not be covered by the original mask at all (and it does not cover the structure). The final version of our model reliably returns one connected object of True values in the mask instead of multiple. Hence, we did not observe this problem in our evaluation. You say that this only happens sometimes - I would strongly suspect that it happens whenever one instance is returned as a mask that consists of multiple separated "blobs".
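To illustrate this, here is a minimal sketch with a made-up 5x9 mask that holds two separate blobs. The function bounding_box_center is a hypothetical stand-in for the center determination described above (it assumes the midpoint of the extreme True coordinates) and is not copied from complete_structure.py.

```python
import numpy as np


def bounding_box_center(mask_array: np.ndarray) -> tuple:
    """Return (x, y) midway between the extreme True coordinates.

    Hypothetical stand-in for the center determination described above.
    """
    ys, xs = np.where(mask_array)
    x_center = (xs.min() + xs.max()) // 2
    y_center = (ys.min() + ys.max()) // 2
    return int(x_center), int(y_center)


# One mask that contains two separate blobs (top-left and bottom-right)
mask = np.zeros((5, 9), dtype=bool)
mask[0:2, 0:2] = True
mask[3:5, 7:9] = True

x_center, y_center = bounding_box_center(mask)
# The center lies between the blobs, so it is not covered by the mask
center_in_mask = bool(mask[y_center, x_center])
# The whole row at y_center is False, so np.where(...)[0] is empty
row_has_true = bool(mask[y_center].any())
```

With this mask, the center is (4, 2), which is not covered by the mask, and the row at y_center contains no True value. Indexing np.where(mask[y_center] == True)[0] with [0] then raises the IndexError from the issue title.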

Now, the question is what to do in this case. There is an alternative method for the expansion which we tested before we realised that the final version was more efficient and reliable. Here, polygons which describe the contours of the mask are determined. If there are multiple polygons (which is the case here), only the biggest one (simply defined as the biggest difference between y_min and y_max) is used. Seeds for the expansion are simply defined as non-white pixels on the mask contours and the pixel-wise expansion is started from there. This was supposed to automatically happen when no seed pixels could be found. I think it did not happen in our evaluation because the model only ever returned a single blob and the center point determination was not a problem. Hence, I did not notice that there was a problem here.
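The "biggest polygon" criterion can be sketched as follows. This is only an illustration under assumptions: the function name biggest_polygon_by_height is hypothetical, and each polygon is assumed to be an array of (x, y) vertices, which may differ from what the repository code uses.

```python
import numpy as np


def biggest_polygon_by_height(polygons):
    """Return the polygon with the largest difference y_max - y_min.

    Assumes each polygon is an array-like of (x, y) vertices; the name and
    the vertex order are assumptions made for this sketch.
    """
    def height(polygon):
        y_values = np.asarray(polygon)[:, 1]
        return y_values.max() - y_values.min()

    return max(polygons, key=height)


# Two made-up contours: a flat one (height 2) and a tall one (height 8)
small = np.array([[0, 0], [3, 0], [3, 2], [0, 2]])
tall = np.array([[10, 1], [12, 1], [12, 9], [10, 9]])
biggest = biggest_polygon_by_height([small, tall])
```

Note that this criterion only looks at the vertical extent, so a tall but thin blob wins over a wide but flat one.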

I have pushed a fixed version into the "experimental" branch of the DECIMER Segmentation repository. My problem is that I cannot test this edge case because I do not have images where the model returns multiple "blobs" as one instance. It would be very helpful if you could pull the code from the branch and report whether it fixes the problem.

Kind regards, Otto

rookiexiao123 commented 3 years ago

thank you @Otto. I'll test it now.

rookiexiao123 commented 3 years ago

[image]
It no longer reports errors, but I don't think it handled the other redundant polygon boxes correctly.

rookiexiao123 commented 3 years ago

[image]
I modified it this way, and it doesn't report errors, but the resulting mask image is the same as with the version you asked me to test just now.

OBrink commented 3 years ago

Mhm. Sorry, I was wrong. The polygon contours are just used for the seed pixel determination. This was originally done so that nothing which might be relevant is deleted. Hence, I am even more confused that some masks are deleted in the image above (when comparing it to the first image).

You could write a more complex function for the mask center determination (replace find_mask_center()). The problem here is that you are going to let that function make some decisions that the segmentation model was originally supposed to make. If there are multiple blobs, they can all be included or you just take one of them. We did not really have this problem as our model returns single blobs as instances.
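One possible shape for such a replacement is sketched below. It is not the repository's implementation: the name find_mask_center_nearest is hypothetical, and the fallback rule (take the True pixel closest to the bounding-box center) is an assumption chosen for illustration. It is exactly the kind of decision mentioned above, since it silently picks one blob.

```python
import numpy as np


def find_mask_center_nearest(mask_array: np.ndarray) -> tuple:
    """Return an (x, y) point that is guaranteed to lie inside the mask.

    Hypothetical alternative to find_mask_center(): use the bounding-box
    center if the mask covers it, otherwise the nearest True pixel.
    """
    ys, xs = np.where(mask_array)
    if ys.size == 0:
        raise ValueError("mask contains no True pixels")
    x_center = (xs.min() + xs.max()) // 2
    y_center = (ys.min() + ys.max()) // 2
    if mask_array[y_center, x_center]:
        return int(x_center), int(y_center)
    # Squared distance from the bounding-box center to every True pixel
    distances = (xs - x_center) ** 2 + (ys - y_center) ** 2
    nearest = int(np.argmin(distances))
    return int(xs[nearest]), int(ys[nearest])


# Made-up mask with two separate blobs; the bounding-box center is uncovered
mask = np.zeros((5, 9), dtype=bool)
mask[0:2, 0:2] = True
mask[3:5, 7:9] = True
center = find_mask_center_nearest(mask)
```

For this mask the two blobs are equally far from the bounding-box center, and np.argmin returns the first match, so the point ends up in the top-left blob. Whether that is the blob you want depends on your segmentation model.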

rookiexiao123 commented 3 years ago

OK, thank you very much. I'll find out where the problem is and work through it step by step.

rookiexiao123 commented 3 years ago

I simulated some data and tried it. This should solve the problem of a mask having multiple parts: use the part of your code that selects the largest polygon, convert that polygon back into a mask, and then run the initial seed search and expansion on it. It worked.

I made a mask like picture 1:
[image]

The original code reports an error. After the modification it does not report an error, but it does not expand; this is picture 2:
[image]

With my change, the result looks like picture 3:
[image]

rookiexiao123 commented 3 years ago

I modified one function and added another:

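import numpy as np
import PIL.Image
import PIL.ImageDraw

def polygons_to_mask(img_shape, polygons):
    # Rasterise a single polygon (sequence of (x, y) vertices) into a boolean mask of shape img_shape
    mask = np.zeros(img_shape, dtype=np.uint8)
    mask = PIL.Image.fromarray(mask)
    xy = list(map(tuple, polygons))
    PIL.ImageDraw.Draw(mask).polygon(xy=xy, outline=1, fill=1)
    mask = np.asarray(mask).astype(bool)
    return mask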

def expansion_coordination(mask_array: np.array, image_array: np.array) -> np.array:
    """This function takes a single mask and an image (np.array) and coordinates
    the mask expansion. It returns the expanded mask.
    The purpose of this function is wrapping up the expansion procedure in a map function."""
    # seed_pixels = find_seeds(image_array, mask_array)
    # if seed_pixels != []:
    #   mask_array = expand_masks(image_array, seed_pixels, mask_array)
    # else:
    #   # If the seed detection inside of the mask has failed for some reason, look for seeds on the contours of the mask and expand from there on.
    #   # Turn masks into list of polygon bounding boxes
    #   polygon = mask_2_polygons(mask_array)
    #   # Delete unnecessary mask blobs
    #   polygon = define_relevant_polygons(polygon)
    #   seed_pixels = find_seeds_contours(image_array=image_array, bounding_box=polygon[0])
    #   mask_array = expand_masks(image_array, seed_pixels, mask_array, contour_expansion=True)
    # return mask_array
    try:
        seed_pixels = find_seeds(image_array, mask_array)
        if seed_pixels != []:
            mask_array = expand_masks(image_array, seed_pixels, mask_array)
        else:
            # If the seed detection inside of the mask has failed for some reason, look for seeds on the contours of the mask and expand from there on.
            # Turn masks into list of polygon bounding boxes
            polygon = mask_2_polygons(mask_array)
            # Delete unnecessary mask blobs
            polygon = define_relevant_polygons(polygon)
            seed_pixels = find_seeds_contours(image_array=image_array, bounding_box=polygon[0])
            mask_array = expand_masks(image_array, seed_pixels, mask_array, contour_expansion=True)
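    except IndexError:
        # The mask center could not be determined (mask made of several blobs): keep only the largest polygon and retry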
        polygon = mask_2_polygons(mask_array)
        # Delete unnecessary mask blobs
        polygon = define_relevant_polygons(polygon)
        mask_array = polygons_to_mask(image_array.shape, polygon[0])
        seed_pixels = find_seeds(image_array, mask_array)

        if seed_pixels != []:
            mask_array = expand_masks(image_array, seed_pixels, mask_array)
        else:
            # If the seed detection inside of the mask has failed for some reason, look for seeds on the contours of the mask and expand from there on.
            # Turn masks into list of polygon bounding boxes
            polygon = mask_2_polygons(mask_array)
            # Delete unnecessary mask blobs
            polygon = define_relevant_polygons(polygon)
            seed_pixels = find_seeds_contours(image_array=image_array, bounding_box=polygon[0])
            mask_array = expand_masks(image_array, seed_pixels, mask_array, contour_expansion=True)

    return mask_array

Is there anything I need to pay attention to?