Closed subin-kim-cv closed 1 year ago
Hi! For this experiment, we had to render full images in order to diffuse them properly. However, since the SDS loss requires the rendered image to carry its gradients so that backpropagation can reach the NeRF, rendering high-resolution images becomes very costly. Thus, similar to DreamFusion, we take the train cameras and rescale them to 64x64 resolution, render the NeRF at this low resolution, and apply the SDS loss there. This low render resolution is the primary reason the results look so poor.
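As a rough illustration of the camera-rescaling step (the helper name and camera format here are my own assumptions, not the repo's actual API), rescaling a pinhole camera to a lower render resolution amounts to scaling the intrinsics by the resolution ratio:

```python
def rescale_camera(fx, fy, cx, cy, width, height, target=64):
    """Scale pinhole intrinsics (focal lengths fx, fy and principal
    point cx, cy) so the camera renders at target x target pixels
    instead of width x height. Hypothetical helper for illustration."""
    sx = target / width
    sy = target / height
    return fx * sx, fy * sy, cx * sx, cy * sy, target, target

# e.g. a 512x512 training camera rescaled to 64x64:
fx, fy, cx, cy, w, h = rescale_camera(800.0, 800.0, 256.0, 256.0, 512, 512)
# fx = 100.0, cx = 32.0, and the render now costs 64*64 rays per image
```

Rendering at 64x64 cuts the per-image ray count by 64x relative to 512x512, which is what makes backpropagating the SDS loss through the renderer tractable.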
Hopefully that helps!
Thanks for your awesome work!
I have one question about the SDS + InstructPix2Pix experiments in your paper. What patch size (in a single batch) did you use in your implementation of SDS + InstructPix2Pix?
Thanks :)