vinthony / deep-blind-watermark-removal

[AAAI 2021] Split then Refine: Stacked Attention-guided ResUNets for Blind Single Image Visible Watermark Removal
https://arxiv.org/abs/2012.07007
223 stars · 55 forks

how to train the model with my own data? #2

Closed · Asuna88 closed 3 years ago

Asuna88 commented 3 years ago

How can I train the model with my own data? Do you have a training script?

Thx

vinthony commented 3 years ago

Hi, it is easy to train our model on your own dataset. Suppose you have a dataset with the following structure:

dataset
    - train
       - images # for the watermarked images
       - mask # for the binary mask
       - wm # for watermark images
       - natural # ground truth natural images
    - val
       - images # for the watermarked images
       - masks # for the binary mask
       - wm # for watermark images
       - natural # ground truth natural images
1. Define `base_dir = /the/path/of/dataset` in: https://github.com/vinthony/deep-blind-watermark-removal/blob/063c9662a57553d02f07ad335b716942572c80dd/examples/evaluate.sh#L24
2. Then, modify the file locations in: https://github.com/vinthony/deep-blind-watermark-removal/blob/063c9662a57553d02f07ad335b716942572c80dd/scripts/datasets/COCO.py#L61-L65 (a path-layout sketch follows this list).
3. Run the first script via `bash examples/evaluate.sh` to train the network.
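As a sketch of step 2, the dataset class just needs to enumerate the four parallel folders. A minimal illustration of the assumed layout (the actual variable names in `COCO.py` differ; `base_dir` and `split` mirror the settings above):

```python
import os
from glob import glob

base_dir = '/the/path/of/dataset'  # the value set in examples/evaluate.sh
split = 'train'                    # or 'val'

# Four parallel folders as in the structure above; file names are assumed
# to correspond one-to-one across the folders (e.g. 0001.png in each).
watermarked = sorted(glob(os.path.join(base_dir, split, 'images', '*')))
masks       = sorted(glob(os.path.join(base_dir, split, 'mask', '*')))
watermarks  = sorted(glob(os.path.join(base_dir, split, 'wm', '*')))
naturals    = sorted(glob(os.path.join(base_dir, split, 'natural', '*')))

assert len(watermarked) == len(masks) == len(watermarks) == len(naturals)
```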

If there are any bugs, please feel free to contact me.

Asuna88 commented 3 years ago

> Hi, it is easy to train our model on your own dataset. Suppose you have a dataset with the following structure: […dataset layout and steps 1–3 as above…]

Thanks for your awesome project! By the way, you mentioned that "Besides training our methods, here we also give an example of how to train s2am under our framework." So, if I want to train your model, do I need to clone the s2am project and train s2am? I don't understand the relationship between this project and s2am.

(You mentioned: 1. create my own data according to your data format; 2. replace it with my data and modify some code.) Is that all?

So, concretely, what should I do to train with my own data?

Thanks a lot !

vinthony commented 3 years ago

Hi, you do NOT need to clone the s2am project.

This project is based on s2am, and here we provide an alternative way to train it. Just remove these lines if you want to train this project alone:

https://github.com/vinthony/deep-blind-watermark-removal/blob/12e1dc0ef511e85923db4fbf4f33d1afcca79039/examples/evaluate.sh#L36-L50

Asuna88 commented 3 years ago

Thank you so much! One more question: when I train with my data using the `evaluate.sh` script, I get an error like this:

```
  File "/data/tools/anaconda3/lib/python3.8/site-packages/pytorch_ssim/__init__.py", line 57, in forward
    return _ssim(img1, img2, window, self.window_size, channel, self.size_average)
  File "/data/tools/anaconda3/lib/python3.8/site-packages/pytorch_ssim/__init__.py", line 18, in _ssim
    mu1 = F.conv2d(img1, window, padding = window_size/2, groups = channel)
TypeError: conv2d(): argument 'padding' must be tuple of ints, not float
```

I solved it by adding `int()`, i.e. `mu1 = F.conv2d(img1, window, padding = int(window_size/2), groups = channel)`, but after that it shows another error:

" File "/data/tools/anaconda3/lib/python3.8/site-packages/pytorch_ssim/init.py", line 57, in forward return _ssim(img1, img2, window, self.window_size, channel, self.size_average) File "/data/tools/anaconda3/lib/python3.8/site-packages/pytorch_ssim/init.py", line 18, in _ssim mu1 = F.conv2d(img1, window, padding = int(window_size/2), groups = channel) RuntimeError: Expected object of device type cuda but got device type cpu for argument #2 'weight' in call to _thnn_conv_depthwise2d_forward

"

I have no idea what to do about it. What should I do now? Could you please help me? I would really appreciate it.
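For context on the first error: in Python 3, `window_size/2` is a float, while `F.conv2d` requires an int (or tuple of ints) for `padding`. A runnable toy check:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)   # toy input batch
w = torch.randn(3, 1, 5, 5)   # depthwise 5x5 window, groups=3
window_size = 5

# window_size / 2 == 2.5 (a float), which raises the TypeError above.
# Integer division (or int()) keeps padding an int:
y = F.conv2d(x, w, padding=window_size // 2, groups=3)
print(y.shape)  # torch.Size([1, 3, 8, 8])
```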

vinthony commented 3 years ago

I think it might be because of errors in the pytorch_ssim package.

One option is to use the skimage package, as here:

https://github.com/vinthony/deep-blind-watermark-removal/blob/12e1dc0ef511e85923db4fbf4f33d1afcca79039/scripts/machines/VX.py#L280-L286
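A minimal sketch of that skimage fallback (assumptions: tensors are NCHW in [0, 1]; the `channel_axis` keyword needs skimage >= 0.19, older releases use `multichannel=True`):

```python
import numpy as np
from skimage.metrics import structural_similarity as compare_ssim

def ssim_np(pred, target):
    """pred, target: (N, C, H, W) torch tensors with values in [0, 1]."""
    a = pred.detach().cpu().numpy()
    b = target.detach().cpu().numpy()
    scores = [compare_ssim(p.transpose(1, 2, 0), t.transpose(1, 2, 0),
                           channel_axis=-1, data_range=1.0)
              for p, t in zip(a, b)]
    return float(np.mean(scores))
```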

From the errors, another option is to run the pytorch_ssim package on the CPU by passing CPU tensors to pytorch_ssim.ssim() in:

https://github.com/vinthony/deep-blind-watermark-removal/blob/12e1dc0ef511e85923db4fbf4f33d1afcca79039/scripts/machines/VX.py#L220
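In code, that option would look roughly like this (a toy sketch; `imfinal` and `target` stand in for the model output and ground truth used in VX.py):

```python
import torch
import pytorch_ssim

imfinal = torch.rand(1, 3, 64, 64).cuda()  # e.g. model output on the GPU
target  = torch.rand(1, 3, 64, 64).cuda()  # ground-truth image

# Pass CPU tensors so the inputs and pytorch_ssim's Gaussian window
# end up on the same device.
score = pytorch_ssim.ssim(imfinal.detach().cpu(), target.detach().cpu())
```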

hope it helps!

Asuna88 commented 3 years ago

> I think it might be because of errors in the pytorch_ssim package. One option is to use the skimage package; another option is to run pytorch_ssim on the CPU by passing CPU tensors to pytorch_ssim.ssim().

Yes, you are right. I also found that the error occurs in the VX.py script and traced it to the pytorch_ssim package.

My VX.py is already the same as what you mentioned above, but it doesn't work. Actually, I don't know what I should modify in VX.py, or where; the code you pointed to is exactly what my VX.py already contains. Do you know how to fix it?

Thanks a lot!

Asuna88 commented 3 years ago

This is the full error info:

"

File "/deep-blind-watermark-removal/main.py", line 72, in main(args) File "/deep-blind-watermark-removal/main.py", line 42, in main Machine.train(epoch) File "//deep-blind-watermark-removal/scripts/machines/VX.py", line 128, in train l2_loss,att_loss,wm_loss,style_loss,ssim_loss = self.loss(outputs[0],self.norm(target),outputs[1],mask,outputs[2],self.norm(wm)) File "/data/tools/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl result = self.forward(*input, *kwargs) File "//deep-blind-watermark-removal/scripts/machines/VX.py", line 80, in forward ssim_loss = sum([ 1 - self.ssimloss(im,target) for im in recov_imgs]) File "//deep-blind-watermark-removal/scripts/machines/VX.py", line 80, in ssim_loss = sum([ 1 - self.ssimloss(im,target) for im in recov_imgs]) File "/data/tools/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl result = self.forward(input, **kwargs) File "/data/tools/anaconda3/lib/python3.8/site-packages/pytorch_ssim/init.py", line 57, in forward return _ssim(img1, img2, window, self.window_size, channel, self.size_average) File "/data/tools/anaconda3/lib/python3.8/site-packages/pytorch_ssim/init.py", line 18, in _ssim mu1 = F.conv2d(img1, window, padding = int(window_size/2), groups = channel) RuntimeError: Expected object of device type cuda but got device type cpu for argument #2 'weight' in call to _thnn_conv_depthwise2d_forward

"

vinthony commented 3 years ago

Hi, I think it might be because of the usage of ssim_loss.

For a quick debug, you can just set the weight of the SSIM loss to zero to avoid it (a sketch follows the link below).

Maybe you also do not need to put the ssim_loss on the cuda device in:

https://github.com/vinthony/deep-blind-watermark-removal/blob/12e1dc0ef511e85923db4fbf4f33d1afcca79039/scripts/machines/VX.py#L43
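As a concrete illustration of the quick-debug idea (hypothetical; the actual loss combination in VX.py may differ), zeroing the SSIM term looks like:

```python
import torch

# Dummy stand-ins for the loss terms named in the traceback above.
l2_loss = att_loss = wm_loss = style_loss = ssim_loss = torch.tensor(1.0)

# Quick debug: a zero weight drops the problematic SSIM term from the
# objective while the other losses keep training the network.
ssim_weight = 0.0
total_loss = l2_loss + att_loss + wm_loss + style_loss + ssim_weight * ssim_loss
```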

I am still curious about why this happens, but looking into it may take some time because I am working on other projects.

Asuna88 commented 3 years ago

> Hi, I think it might be because of the usage of ssim_loss. For a quick debug, you can just set the weight of the SSIM loss to zero to avoid it.

Thank you so much, I will try it again. I will leave a message if something goes wrong, and I hope you can give me suggestions in your spare time.

Anyway, thanks a lot!

Asuna88 commented 3 years ago

> Hi, I think it might be because of the usage of ssim_loss. For a quick debug, you can just set the weight of the SSIM loss to zero. Maybe you do not need to put the ssim_loss on the cuda device (VX.py, line 43 in 12e1dc0: `self.ssimloss = pytorch_ssim.SSIM().to(device)`).

Thanks a lot, I solved this problem. The fixes are as follows:

1. In `scripts/machines/VX.py`, line 244: `ssim = pytorch_ssim.ssim(imfinal.cpu(), target.cpu())`; adding `.cpu()` puts these tensors on the CPU.

2. In `/data/.../python3.8/site-packages/pytorch_ssim/__init__.py`, starting at line 17:

```python
def _ssim(img1, img2, window, window_size, channel, size_average = True):
    mu1 = F.conv2d(img1, window, padding = int(window_size/2), groups = channel)
    mu2 = F.conv2d(img2, window, padding = int(window_size/2), groups = channel)

    mu1_sq = mu1.pow(2)
    mu2_sq = mu2.pow(2)
    mu1_mu2 = mu1*mu2

    sigma1_sq = F.conv2d(img1*img1, window, padding = int(window_size/2), groups = channel) - mu1_sq
    sigma2_sq = F.conv2d(img2*img2, window, padding = int(window_size/2), groups = channel) - mu2_sq
    sigma12 = F.conv2d(img1*img2, window, padding = int(window_size/2), groups = channel) - mu1_mu2
    ...
```

   Wrapping in `int()` makes the padding an integer, e.g. `padding = int(window_size/2)`.

It succeeded! Thanks a lot!

One more question: where did you use the pretrained model "27kpng_model_best.pth.tar"? I don't see it used anywhere yet. How should I use this pretrained model?

Thanks a lot!

Asuna88 commented 3 years ago

Second question:

How do I prepare my own datasets? Do I need to label them using an annotation tool like labelme?

I know you mentioned the `dataset` directory structure above.

I just wonder if I need to label the images and generate JSON files, or just prepare: 1) original (watermarked) images, 2) binary masks, 3) watermark images, 4) ground-truth images? Is that all?

Thx!

vinthony commented 3 years ago

> One more question: where did you use the pretrained model "27kpng_model_best.pth.tar"? How should I use it?

Hi, the pre-trained model was trained on our synthesized dataset.

If your dataset is small, you can try loading the pre-trained model and finetuning it.
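A rough sketch of loading the checkpoint for finetuning (the `'state_dict'` key is an assumption based on common `torch.save` conventions, not necessarily the repo's exact format):

```python
import torch

def load_pretrained(model, path='27kpng_model_best.pth.tar'):
    """Load the released checkpoint into an already-constructed network."""
    ckpt = torch.load(path, map_location='cpu')
    # Many torch checkpoints keep the weights under 'state_dict';
    # fall back to the raw object otherwise.
    state = ckpt.get('state_dict', ckpt) if isinstance(ckpt, dict) else ckpt
    model.load_state_dict(state)
    return model
```

After loading, training typically continues on the small dataset with a reduced learning rate.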

vinthony commented 3 years ago

> How do I prepare my own datasets? Do I need to label them with a tool like labelme and generate JSON files, or just prepare the watermarked images, binary masks, watermark images, and ground-truth natural images?

Our task is an image-to-image translation task, so we do not need labels; just prepare the images in the specific locations mentioned above.
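For example, if you synthesize the watermarked images yourself, the binary mask can be derived by thresholding the difference between the watermarked and natural image instead of annotating by hand (a hypothetical recipe; the file names and the threshold of 10 are placeholders to adapt):

```python
import numpy as np
from PIL import Image

wm  = np.asarray(Image.open('train/images/0001.png').convert('RGB'), dtype=np.int16)
nat = np.asarray(Image.open('train/natural/0001.png').convert('RGB'), dtype=np.int16)

# Pixels that differ noticeably between the two images belong to the watermark.
mask = (np.abs(wm - nat).max(axis=-1) > 10).astype(np.uint8) * 255
Image.fromarray(mask).save('train/mask/0001.png')
```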

Asuna88 commented 3 years ago

Got it. I succeeded in running your code.

Thanks so much!

Asuna88 commented 3 years ago

Excuse me, may I ask how to evaluate the results of watermark removal?

Do you have a metric like mAP or IoU? I just wonder how to evaluate the results of removing watermarks.

Is there a metric or something similar I can use?

Thx!