shepnerd / inpainting_gmcnn

Image Inpainting via Generative Multi-column Convolutional Neural Networks, NeurIPS 2018
MIT License

Issues regarding input images with already present holes #22

Closed · djongpo closed this 5 years ago

djongpo commented 5 years ago

I have tested your TensorFlow implementation on the sample data you provide in the imgs/places2_256x256 folder. Now I want to test with my own image, which already has some holes present. It is given in this link

Accordingly, I tried modifying your code by removing the following portion from your test.py file:

```python
if h >= config.img_shapes[0] and w >= config.img_shapes[1]:
    h_start = (h - config.img_shapes[0]) // 2
    w_start = (w - config.img_shapes[1]) // 2
    image = image[h_start: h_start + config.img_shapes[0], w_start: w_start + config.img_shapes[1], :]
else:
    t = min(h, w)
    image = image[(h - t) // 2:(h - t) // 2 + t, (w - t) // 2:(w - t) // 2 + t, :]
    image = cv2.resize(image, (config.img_shapes[1], config.img_shapes[0]))
image = image * (1 - mask) + 255 * mask
```

I only want to fill the holes in my images using your code. I already have images in which holes are present. Could you tell me how to do this?

I can see that I need to apply a mask to the portion to be refilled and then run the session, but how do I do that? Here is the portion I guess I need to change, but I am not sure how:

```python
for i in range(test_num):
    if config.mask_type == 'rect':
        mask = generate_mask_rect(config.img_shapes, config.mask_shapes, config.random_mask)
    else:
        mask = generate_mask_stroke(im_size=(config.img_shapes[0], config.img_shapes[1]),
                                    parts=8, maxBrushWidth=24, maxLength=100, maxVertex=20)
```

Thanks in advance. Waiting for your reply.

shepnerd commented 5 years ago

To test your own masks, you need to create the mask in binary format in addition to the degraded input. To create your own masks, you could refer to the GUI [code](https://github.com/shepnerd/inpainting_gmcnn/blob/master/tensorflow/painter_gmcnn.py) (functions fill() and paint()) for creating strokes, or refer to test.py for creating rectangular masks.

To be clear, this code is designed for inpainting images with given masks (the user's marks or interactions). So you need to give it images with holes, plus an indicator of where to inpaint (the binary mask).
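A minimal sketch of that setup (not code from the repo; the file path and the assumption that holes are marked in pure white are hypothetical):

```python
import cv2
import numpy as np

# Load an image whose holes are marked in pure white (assumption), derive a
# binary mask from it, and whiten the holes the same way test.py does.
image = cv2.imread('my_image_with_holes.png')            # hypothetical path
mask = np.all(image == 255, axis=2).astype(np.float32)   # 1 inside holes, 0 elsewhere
mask = np.expand_dims(mask, axis=2)                      # [h, w] -> [h, w, 1] so it broadcasts
image = image * (1 - mask) + 255 * mask                  # same convention as test.py
```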

djongpo commented 5 years ago

Hi, I made a slight modification to your test.py file to create my own mask:

[screenshot of the modified code]

I had to expand the dimensions of the mask because it otherwise throws this error:

```
Traceback (most recent call last):
  File "test.py", line 72, in <module>
    image = image * (1-mask) + 255 * mask
ValueError: operands could not be broadcast together with shapes (256,256,3) (256,256)
```

What is the specific purpose of the line `image = image * (1-mask) + 255 * mask`?

For reference, here is my input image: [input image attached]

But the output is not inpainted where the holes are present. What am I missing?

I think I am using a binary mask and you are not. Is that so?

Waiting for a reply.

shepnerd commented 5 years ago

I suppose it is caused by the value range you used for the variable mask. As you mentioned, the mask is a binary one, so its pixel values should be 1 or 0. A quick fix would be something like `mask[k][l].fill(1)` or `mask = mask / 255.0`.
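For instance, a sketch of that fix, assuming the mask is loaded from a hypothetical image file:

```python
import cv2
import numpy as np

# Masks loaded with cv2.imread typically hold values 0/255, while the
# network expects 0/1, so rescale before use.
mask = cv2.imread('my_mask.png', cv2.IMREAD_GRAYSCALE)   # hypothetical path
mask = (mask / 255.0).astype(np.float32)                 # 255 -> 1.0, 0 -> 0.0
```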

djongpo commented 5 years ago

Sorry, it still doesn't work. The same problem persists: the image is not inpainted where the holes are. You can quickly check it since I have already provided the image. I think the problem is in the following few lines, and that expanding the dimensions of the mask, as I did, was not a good idea.

[screenshot of the relevant code]

Waiting for your reply. Thanks in advance.

djongpo commented 5 years ago

Can you please tell me the exact purpose of the line `image = image * (1-mask) + 255 * mask`? It throws `ValueError: operands could not be broadcast together with shapes (256,256,3) (256,256)` if I don't expand the dimensions of the mask.

shepnerd commented 5 years ago

That line marks the regions to be inpainted in white, so we can see exactly what the input looks like. Your error is caused by the dimensionality of the mask: image has shape [h, w, 3] while mask has shape [h, w]. That's why we add an extra dimension to the newly created mask.
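A short sketch reproducing the shape mismatch and the fix (illustrative only, with dummy arrays):

```python
import numpy as np

image = np.zeros((256, 256, 3), dtype=np.float32)  # image: [h, w, 3]
mask = np.ones((256, 256), dtype=np.float32)       # mask:  [h, w] -- does not broadcast
mask = np.expand_dims(mask, axis=2)                # now [h, w, 1] -- broadcasts over channels
out = image * (1 - mask) + 255 * mask              # masked regions become white
```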

As for why the current code didn't perform the requested inpainting, it would be better to visualize the mask you created and check it. Using `pixel.value == (255,255,255)` is probably not a good idea, because pixel values near the mask border are not exactly (255,255,255). A small gap would remain between the existing pixels and the ones to be predicted, and this may also cause the processing failure.
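One possible remedy, sketched under assumptions (the threshold value, kernel size, and file path below are hypothetical and would need tuning):

```python
import cv2
import numpy as np

# Detect near-white pixels with a tolerance instead of exact equality, then
# dilate the mask so the damaged ring at the hole border is covered too.
image = cv2.imread('my_image_with_holes.png')               # hypothetical path
near_white = np.all(image >= 250, axis=2).astype(np.uint8)  # tolerant hole test
kernel = np.ones((3, 3), np.uint8)
mask = cv2.dilate(near_white, kernel, iterations=2)         # grow past the gap
mask = np.expand_dims(mask.astype(np.float32), axis=2)      # [h, w, 1], values 0/1
```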


djongpo commented 5 years ago

Hey, sorry to bother you again. I want to train on my own dataset, which is in the tensorflow/images/ directory. What should the full training command be, given the provided sample:

```
python train.py --dataset [DATASET_NAME] --data_file [DATASET_TRAININGFILE] --gpu_ids [NUM] --pretrain_network 1 --batch_size 16
```

What is the DATASET_TRAININGFILE in this case? And what is the purpose of the dataset name?

Also, after testing a few days ago, I found that inpainting with the rectangular mask type is not good. What is the reason?

shepnerd commented 5 years ago

The DATASET_TRAININGFILE is a file containing the paths of the images used for training, e.g., a file named train.txt with the following contents:

```
/data/download/face/1.png
/data/download/face/2.png
...
```
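A small sketch of generating such a file for the tensorflow/images/ directory mentioned above (the directory and extension pattern are assumptions):

```python
import glob

# Collect image paths and write them one per line into the training file.
paths = sorted(glob.glob('tensorflow/images/*.png'))
with open('train.txt', 'w') as f:
    f.write('\n'.join(paths))
```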

The given dataset name is used for creating checkpoints.

About inpainting rectangular masks with our method: it performs well and stably on datasets with a few categories (like faces and street views), but still has difficulty with large-scale datasets containing thousands of diverse object and scene categories, such as Places2. This is also noted in the limitations section of our paper.

The inpainting performance problem you mention is mainly caused by:

1. Insufficient and imbalanced training data: each category has around 0.4k~3k images, which is obviously not enough to fit a complex category like campus. This may be lessened by other regularities, like the popular non-local / self-attention or normalization modules.
2. Limited model capacity: compared with image synthesis tasks, the capacity of our model is much smaller.