pomelyu / paper-reading-notes


2023 [SIGGRAPH] (DragGAN) Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold #8

Open pomelyu opened 1 year ago

pomelyu commented 1 year ago

Introduction

This paper presents an optimization-based approach to interactively edit images generated by a pretrained unconditional GAN. Specifically, the user selects control points on the image, and the approach moves these points to target locations. The user can also specify regions to keep untouched with a mask.


Method


1. Motion Supervision: move the control points toward the target positions


2. Point Tracking: figure out the current control point location


3. Repeat the above two steps until the control points are close to the target positions.

Highlight

Limitation

Comments

[^1]: UserControllableLT: User-Controllable Latent Transformer for StyleGAN Image Layout Editing
[^2]: RAFT: Recurrent All-Pairs Field Transforms for Optical Flow
[^3]: PIPS: Particle Video Revisited: Tracking Through Occlusions Using Point Trajectories

alexxzibit commented 1 year ago

Introduction

This paper presents an optimization-based approach to interactively edit images generated by a pretrained unconditional GAN. Specifically, the user selects control points on the image, and the approach moves these points to target locations. The user can also specify regions to keep untouched with a mask.

Method

1. Motion Supervision: move the control points toward the target positions

  • pi: control points; ti: target points; qi: points in a small region around pi; di: the normalized (unit) direction vector from pi to ti
  • F(q): the feature vector at point q, taken from the 6th layer of the StyleGAN feature maps
  • (?) Optimizing this loss moves each pi a small step toward pi + di
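The loss above can be sketched in a few lines; this is a minimal NumPy illustration (the function names are mine, not from the paper), computing the L1 term over a square patch around pi. In the actual method this loss is backpropagated into the latent code w, and the mask-based reconstruction term for unedited regions is omitted here.

```python
import numpy as np

def bilinear_sample(F, y, x):
    """Sample feature map F of shape (C, H, W) at fractional (y, x)."""
    C, H, W = F.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, H - 1), min(x0 + 1, W - 1)
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * F[:, y0, x0] + (1 - wy) * wx * F[:, y0, x1]
            + wy * (1 - wx) * F[:, y1, x0] + wy * wx * F[:, y1, x1])

def motion_supervision_loss(F, p, t, r1=3):
    """L1 difference between the patch around p and the same patch shifted
    one unit step toward t. In the paper, F(q) is detached so gradients
    (w.r.t. the latent w) flow only through the shifted samples F(q + d)."""
    d = (t - p) / (np.linalg.norm(t - p) + 1e-8)  # normalized step direction di
    loss = 0.0
    for dy in range(-r1, r1 + 1):          # q ranges over a (2*r1+1)^2 patch
        for dx in range(-r1, r1 + 1):
            q = p + np.array([dy, dx], dtype=float)
            loss += np.abs(bilinear_sample(F, *q) - bilinear_sample(F, *(q + d))).sum()
    return loss
```

Minimizing this loss over w makes the features at q + d reproduce those currently at q, which effectively drags the content at pi toward ti.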

2. Point Tracking: figure out the current control point location

  • Find the current control points by feature similarity: a nearest-neighbor search on the feature map
  • Generic optical-flow / point-tracking networks (RAFT1, PIPS2) could be used instead, but they lead to worse results
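The nearest-neighbor search can be sketched as follows (a self-contained illustration; names are mine): the tracked point is the position in a small neighborhood of pi whose current feature vector is closest to the feature of the original control point.

```python
import numpy as np

def track_point(F, f_ref, p, r2=6):
    """Re-locate control point p by nearest-neighbor search in feature space.

    F: current feature map (C, H, W); f_ref: feature vector of the original
    control point; p: integer (y, x). Returns the position in the r2
    neighborhood of p whose feature is closest to f_ref in L1 distance."""
    C, H, W = F.shape
    y, x = p
    best, best_q = np.inf, tuple(p)
    for qy in range(max(0, y - r2), min(H, y + r2 + 1)):
        for qx in range(max(0, x - r2), min(W, x + r2 + 1)):
            dist = np.abs(F[:, qy, qx] - f_ref).sum()
            if dist < best:
                best, best_q = dist, (qy, qx)
    return best_q
```

Because the StyleGAN features are discriminative, this simple search tracks the control point reliably without a separate tracking network.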

3. Repeat the above two steps until the control points are close to the target positions.
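The overall control flow can be written down schematically. In this toy skeleton both steps are stubbed with a direct unit move, purely to show the loop structure; in the real method the motion step is one gradient update of the latent w and tracking re-locates p on the regenerated feature map.

```python
import numpy as np

def drag_edit(p, t, step=1.0, max_iter=200, tol=1.0):
    """Toy skeleton of the DragGAN loop: alternate a motion step and a
    tracking step until the control point is within tol pixels of the
    target. Everything here is a stand-in, not the paper's implementation."""
    p, t = np.asarray(p, float), np.asarray(t, float)
    for _ in range(max_iter):
        if np.linalg.norm(t - p) < tol:
            break                              # close enough: stop editing
        d = (t - p) / np.linalg.norm(t - p)    # stand-in for motion supervision
        p = p + step * d
        # (point tracking would re-estimate p from the new features here)
    return p
```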

Highlight

  • a general method that works with any GAN
  • the movement is more accurate compared with the similar work UserControllableLT3
  • can be applied to real images through GAN inversion

Limitation

  • slow, due to the optimization-based approach
  • no explicit constraint to preserve appearance, although a mask can be used to keep regions unchanged

Comments

Footnotes

  1. RAFT - Recurrent All Pairs Field Transforms for Optical Flow
  2. PIPS - Particle Video Revisited: Tracking Through Occlusions Using Point Trajectories
  3. UserControllableLT: User-Controllable Latent Transformer for StyleGAN Image Layout Editing
