Change the Masking method for inpainting

We want to change how we mask the original text so that it can be removed from the image. Our current approach is as follows:

1. Crop the section of the image that contains the merged text section.
2. Take its dimensions and create an array of size (height - 2 * border_size) by (width - 2 * border_size), with every cell set to 255 (white).
3. Add a border of width border_size around the array, with every cell set to 0 (black), so that the mask matches the size of the crop.
4. Inpaint the cropped section with this mask using cv2.inpaint(), with an inpaint radius of 7 and the cv2.INPAINT_NS flag.
5. Replace the cropped section in the original image with the inpainted result.
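For reference, the steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the (x, y, w, h) box format, and the default border_size of 2 are assumptions made for the example. Only the radius of 7 and cv2.INPAINT_NS come from the description above.

```python
import numpy as np


def build_border_mask(height, width, border_size):
    """Return a mask with a white (255) interior and a black (0) border.

    Non-zero pixels are the ones cv2.inpaint() will fill in.
    """
    inner_h = height - 2 * border_size
    inner_w = width - 2 * border_size
    if inner_h <= 0 or inner_w <= 0:
        raise ValueError("border_size is too large for this crop")
    inner = np.full((inner_h, inner_w), 255, dtype=np.uint8)
    # Pad every side with border_size black pixels so the mask
    # ends up with the same height and width as the crop.
    return np.pad(inner, border_size, mode="constant", constant_values=0)


def inpaint_section(image, box, border_size=2, radius=7):
    """Inpaint one rectangular section of an 8-bit image.

    `box` is assumed to be (x, y, w, h) in image coordinates.
    """
    import cv2  # imported here so the mask helper works without OpenCV

    x, y, w, h = box
    crop = np.ascontiguousarray(image[y:y + h, x:x + w])
    mask = build_border_mask(h, w, border_size)
    inpainted = cv2.inpaint(crop, mask, radius, cv2.INPAINT_NS)
    result = image.copy()
    result[y:y + h, x:x + w] = inpainted
    return result
```

Keeping the mask construction separate from the inpainting call makes it easier to swap in a different masking strategy later without touching the rest of the pipeline.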

There are a few issues with this method:

1. What is a good inpainting radius? Is 7 really the best value for our cases?
2. What is a good inpainting algorithm? cv2.INPAINT_NS is the Navier-Stokes (fluid dynamics) based inpainting algorithm from a 2001 paper. Is it really the best algorithm for us, or is there another one that would work better?
3. A merged text section may contain several smaller text sections, with empty space between them that contains no text. In our current approach these empty spaces are also replaced by the inpainting algorithm, because the mask is a single rectangle covering the whole merged text section. We want to mask only the rectangular areas that actually contain text, which means looping over the merged text section's children and masking each one individually.
4. This inpainting method works mostly on text bubbles. It does not work properly when the text is not inside a text bubble but sits in front of some other background: after the section has been inpainted, the background in that area is completely ruined. We want to be able to restore the background so that the image does not lose its meaning. The idea is similar to Photoshop's generative AI features.
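The per-child masking described in the third point could look something like the sketch below. It is only an illustration under assumptions: the function name, the `padding` parameter, and the (x, y, w, h) box format in full-image coordinates are hypothetical and may not match how text sections are represented in the codebase.

```python
import numpy as np


def build_children_mask(parent_box, child_boxes, padding=0):
    """Build a mask for the parent crop that is white only over its children.

    All boxes are assumed to be (x, y, w, h) in full-image coordinates.
    The returned mask has the parent's height and width; pixels inside
    any child box (optionally grown by `padding`) are 255, the rest 0.
    """
    px, py, pw, ph = parent_box
    mask = np.zeros((ph, pw), dtype=np.uint8)
    for cx, cy, cw, ch in child_boxes:
        # Convert to parent-local coordinates and clamp to the crop.
        x0 = max(cx - px - padding, 0)
        y0 = max(cy - py - padding, 0)
        x1 = min(cx - px + cw + padding, pw)
        y1 = min(cy - py + ch + padding, ph)
        if x1 > x0 and y1 > y0:
            mask[y0:y1, x0:x1] = 255
    return mask
```

The resulting mask can be passed to cv2.inpaint() in place of the single-rectangle mask. Because the radius and the flag are just arguments to that call, the same mask could also be used to compare cv2.INPAINT_NS against cv2.INPAINT_TELEA, or to try different radii, on a set of sample pages. This does not address the fourth point: restoring a complex background would likely need a different kind of inpainting model.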

maeriil / Aoriil