ActiveVisionLab / gaussctrl

[ECCV 2024] GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing
https://gaussctrl.active.vision/
BSD 3-Clause "New" or "Revised" License

inversion with edited prompt #13

Open yeobinhong opened 3 days ago

yeobinhong commented 3 days ago

The code here (line 138 in gc_pipeline.py) seems to invert the original image to the latents z_T using the edited prompt (p hat in the paper), since self.positive_prompt is an edited prompt, e.g. 'a photo of a polar bear in the forest, best quality, extremely detailed'.

    latent, _ = self.pipe(
        prompt=self.positive_prompt,  # placeholder here, since cfg=0
        num_inference_steps=self.num_inference_steps,
        latents=init_latent,
        image=disparity,
        return_dict=False,
        guidance_scale=0,
        output_type='latent',
    )

jingwu2121 commented 3 days ago

Hi there, sorry for the confusion. As the code comment suggests, the prompt is more of a placeholder here, since classifier-free guidance is set to 0. Empirically, the prompt doesn't have much effect on the inversion in this case. Inverting with the original prompt is also possible. I will fix this later. :)
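For reference, a minimal sketch of the textbook classifier-free guidance combination, eps = eps_uncond + s * (eps_cond - eps_uncond), using made-up numbers. Under this formula, s = 0 leaves only the unconditional prediction, so the prompt drops out entirely. This is an illustration only, not the code in gc_pipeline.py: `cfg_combine` and all values are hypothetical. Whether a given pipeline treats guidance_scale=0 this way is an assumption worth checking, since some implementations skip guidance for small scales and use the prompt-conditioned prediction instead.

```python
def cfg_combine(eps_uncond, eps_cond, guidance_scale):
    """Textbook classifier-free guidance: eps_uncond + s * (eps_cond - eps_uncond)."""
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Hypothetical noise predictions (made-up values, for illustration only).
eps_uncond = [0.10, -0.20, 0.30]          # no text conditioning
eps_cond_original = [0.40, 0.10, -0.50]   # conditioned on the original prompt
eps_cond_edited = [-0.30, 0.60, 0.20]     # conditioned on the edited prompt

# With s = 0 the conditional term is multiplied by zero, so both prompts
# give the same result: the unconditional prediction.
at_zero_original = cfg_combine(eps_uncond, eps_cond_original, 0)
at_zero_edited = cfg_combine(eps_uncond, eps_cond_edited, 0)
print(at_zero_original == at_zero_edited)

# With a typical non-zero scale, the choice of prompt changes the result.
at_scale_original = cfg_combine(eps_uncond, eps_cond_original, 7.5)
at_scale_edited = cfg_combine(eps_uncond, eps_cond_edited, 7.5)
print(at_scale_original == at_scale_edited)
```

If the pipeline in use follows this formula at guidance_scale=0, inverting with the original or the edited prompt would give the same latents. If it falls back to the conditional prediction instead, the two prompts can give different latents, which is one reason to prefer the original prompt for inversion.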