ProGamerGov / Neural-Tools

Tools made for usage alongside artistic style transfer projects

Have you tried 'Spatial Control'? #1

Open SilvesterHsu opened 7 years ago

SilvesterHsu commented 7 years ago

The paper "Controlling Perceptual Factors in Neural Style Transfer" (https://arxiv.org/abs/1611.07865) describes spatial control. Have you tried it?

ProGamerGov commented 7 years ago

@SSeanHsu I haven't delved too much into spatial control yet.

ProGamerGov commented 7 years ago

@SSeanHsu This project, https://github.com/martinbenson/deep-photo-styletransfer, has multi-region spatial control built on an older version of Neural-Style, but I am having trouble converting the code to the updated Neural-Style code.

I separated the relevant mask code from the project, here: https://gist.github.com/ProGamerGov/18b90683816e8e82795597afa1d6c96c

SilvesterHsu commented 7 years ago

@ProGamerGov OK, I got it. Recently I have been working with receptive fields. I found a good blog whose illustrations are awesome! I am still new to this area, but I'll keep learning. I read your code, but I don't know what torch.gt and torch.lt mean. Could you link me to a page that explains them? I couldn't find an explanation in the Torch documentation. Thanks a lot :D

ProGamerGov commented 7 years ago

@SSeanHsu This page has information on torch.gt and torch.lt: https://github.com/torch/torch7/blob/master/doc/maths.md

They are element-wise comparison operators (greater than and less than). The creator admitted that using them was a lazy way to deal with colored masks: https://github.com/luanfujun/deep-photo-styletransfer/issues/37
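To make the idea concrete, here is a minimal plain-Python sketch of how greater-than and less-than comparisons can pick one color out of a segmentation image. It is not the code from either repository: the helper names (`gt`, `lt`, `multiply`, `extract_mask`), the flat-list pixel layout, and the 0.1 tolerance are all assumptions made for illustration.

```python
# Illustrative stand-ins for element-wise comparisons on a flat list of
# channel values in [0, 1]. These are hypothetical helpers, not Torch calls.
def gt(values, threshold):
    """1 where value > threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


def lt(values, threshold):
    """1 where value < threshold, else 0."""
    return [1 if v < threshold else 0 for v in values]


def multiply(*masks):
    """Element-wise product of 0/1 masks, which acts as a logical AND."""
    result = [1] * len(masks[0])
    for mask in masks:
        result = [a * b for a, b in zip(result, mask)]
    return result


def extract_mask(red, green, blue, color, tol=0.1):
    """Return a 0/1 mask of pixels that are (nearly) the named pure color.

    tol is an assumed tolerance: a channel counts as "off" below tol
    and as "on" above 1 - tol.
    """
    if color == "blue":
        return multiply(lt(red, tol), lt(green, tol), gt(blue, 1 - tol))
    if color == "green":
        return multiply(lt(red, tol), gt(green, 1 - tol), lt(blue, tol))
    if color == "red":
        return multiply(gt(red, 1 - tol), lt(green, tol), lt(blue, tol))
    raise ValueError("unsupported color: " + color)


# Three pixels: pure blue, pure green, slightly noisy blue.
red = [0.0, 0.0, 0.05]
green = [0.0, 1.0, 0.02]
blue = [1.0, 0.0, 0.95]

blue_mask = extract_mask(red, green, blue, "blue")
green_mask = extract_mask(red, green, blue, "green")
```

Each color in the segmentation image then yields its own binary mask, which is what allows more than two regions without needing separate black and white mask files.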

SilvesterHsu commented 7 years ago

@ProGamerGov It is really an interesting way to add masks. It is fun to think about. Have you succeeded in loading multiple masks? I add my masks manually, and I only have two masks for now. Also, I rechecked the references of Deep Photo Style Transfer. It seems they don't include all of the work from Controlling Perceptual Factors in Neural Style Transfer. Maybe that's why some of their results are not good enough. I installed Matlab on my Ubuntu yesterday. I will try to run the code.

ProGamerGov commented 7 years ago

@SSeanHsu

> It is really an interesting way to add masks. It is fun to think about.

Yeah, it's definitely an upgrade over the black-and-white-only masks that Gatys' code uses, but sadly it lacks the other things that make his code work better.

> Maybe that's why some of their results are not good enough.

The outputs are not very artistic because the project was aiming for photorealistic outputs. They also don't use any eroded guidance channels/dilated loss functions (though I may attempt to add them).
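For anyone unfamiliar with the term, an eroded guidance channel is a region mask that has been shrunk inward so that pixels near the region boundary are dropped. A minimal sketch of binary erosion in plain Python follows. The function name `erode`, the square window, and the `radius` parameter are assumptions for illustration, not anything taken from Gatys' code or from this repository.

```python
def erode(mask, radius=1):
    """Binary erosion of a 2D 0/1 mask with a (2*radius+1) square window.

    A pixel stays 1 only if every pixel in its window is 1.
    Pixels outside the image are treated as 0, so the image border erodes too.
    """
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            keep = True
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    inside = 0 <= yy < height and 0 <= xx < width
                    if not inside or mask[yy][xx] == 0:
                        keep = False
            out[y][x] = 1 if keep else 0
    return out


# A 5x5 region that fills the whole image shrinks to its 3x3 interior.
full = [[1] * 5 for _ in range(5)]
eroded = erode(full, radius=1)
```

A larger `radius` removes a wider band along the boundary, which is the knob one would tune when deciding how far from the edge a feature has to be before it counts as belonging to the region.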

The photorealism features can be made optional, as I did in this modified neural_style.lua (the Laplacian creator can be found here).

But the mask implementation is one of the most advanced that I have seen so far for Neural-Style. You can pick and choose parts of the style image to transfer to the content image (you can even omit parts), and/or transfer specific style image regions to specific content image regions.

> I installed Matlab on my Ubuntu yesterday.

This fork removes the Matlab dependency: https://github.com/martinbenson/deep-photo-styletransfer

This issue details the progress in trying to get this segmentation working in the current version of Neural-Style: https://github.com/martinbenson/deep-photo-styletransfer/issues/22

ProGamerGov commented 7 years ago

@SSeanHsu How is the progress coming with the receptive field spatial control? Is it the same thing as the "eroded guidance channels" discussed in Gatys' "Controlling Perceptual Factors in Neural Style Transfer" research paper?

SilvesterHsu commented 7 years ago

@ProGamerGov I do have some results for receptive field spatial control with eroded guidance channels, which I processed in Photoshop. But I don't think eroded guidance channels are a trustworthy way to deal with the neurons near the boundaries. Instead, I believe we should calculate the exact receptive fields. Also, the fourth page of the paper says "We define spatial guidance channels only on the neurons whose receptive field is entirely inside the guidance region and add another global guidance channel that is constant over the entire image". I couldn't find any detailed information about global guidance, so I have left it alone for now. In my results I use my own images. I will use the images from the paper to check the contribution of eroded guidance channels, and I will upload my result images later. It is really nice to see your reply :D
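As a rough sketch of what calculating the exact receptive fields could look like, the snippet below tracks receptive field size through a stack of layers and then keeps only the neurons whose field lies entirely inside a region. It is one-dimensional and assumes unpadded layers to keep the indexing simple. The layer stack, the function names `receptive_field` and `guidance_1d`, and the example region are all hypothetical, not taken from the paper or from any of the linked code.

```python
def receptive_field(layers):
    """Return (size, jump) for a stack of (kernel, stride) layers.

    size: how many input pixels one output neuron sees.
    jump: distance in input pixels between neighbouring output neurons.
    """
    size, jump = 1, 1
    for kernel, stride in layers:
        size += (kernel - 1) * jump
        jump *= stride
    return size, jump


def guidance_1d(region, layers):
    """0/1 guidance value per output neuron for a 1D 0/1 input region.

    Assumes no padding, so output neuron i sees input pixels
    i*jump .. i*jump + size - 1. A neuron gets 1 only if every
    pixel it sees is inside the region.
    """
    size, jump = receptive_field(layers)
    count = (len(region) - size) // jump + 1
    return [
        1 if all(region[i * jump : i * jump + size]) else 0
        for i in range(count)
    ]


# Hypothetical VGG-style stack: two 3-wide convs, a 2-wide stride-2 pool, one more conv.
layers = [(3, 1), (3, 1), (2, 2), (3, 1)]
size, jump = receptive_field(layers)

# Input of 20 pixels where the region covers pixels 4..19.
region = [0] * 4 + [1] * 16
guidance = guidance_1d(region, layers)
```

With real networks the padding shifts where each field starts, so the offsets would need adjusting, but the size and jump recurrence stays the same. It also shows why a fixed erosion in Photoshop is only an approximation: the correct margin grows with layer depth.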