In the stylization (editing) process, the rays are not sampled randomly without any semantics; they are selected as a coherent patch. Please see the 'get_select_inds' function in run_nerf_clip.py.
You should also set a large value for 'sample_scale' to ensure a clear patch (a small value leads to a sparse, low-resolution patch, while a larger one may cause OOM; it depends on your GPU).
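To illustrate the idea, here is a minimal sketch of patch-based ray selection. This is not the repo's actual 'get_select_inds'; the helper name `select_patch_inds`, the square window, and the image sizes are all assumptions made for the example:

```python
import torch

def select_patch_inds(H, W, sample_scale):
    """Sketch (not the repo's get_select_inds): pick a random
    sample_scale x sample_scale window of pixels so the rendered rays
    form a coherent image patch that CLIP can interpret semantically."""
    # Random top-left corner that keeps the patch inside the image.
    top = torch.randint(0, H - sample_scale + 1, (1,)).item()
    left = torch.randint(0, W - sample_scale + 1, (1,)).item()
    rows = torch.arange(top, top + sample_scale)
    cols = torch.arange(left, left + sample_scale)
    # Flattened pixel indices of the patch, shape (sample_scale**2,).
    inds = (rows[:, None] * W + cols[None, :]).reshape(-1)
    return inds

# Example: a 64x64 patch from a 378x504 image. A larger sample_scale
# gives a clearer patch but uses more GPU memory.
inds = select_patch_inds(378, 504, 64)
```

The trade-off mentioned above shows up directly in `sample_scale`: the patch rendered from these rays is what gets fed to the CLIP loss, so its resolution has to be large enough for CLIP to see meaningful content.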
Thank you for your reply. I will try that.
Hi, according to your code, in the original NeRF training process one randomly chooses a batch of rays (say 1024) and compares the rendered values pixel-wise against the ground-truth image sampled by the same set of rays. So the image sent to the CLIP loss is just a batch of random pixels, without any semantic information (see the sketch after this message). Is my understanding correct? And if so, why would it be possible to compare this 'image' to the input prompt?
This is the image sent to the CLIP loss during your training process.
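For contrast with the patch-based sampling above, here is a sketch of the random ray batching the question describes, in the style of vanilla NeRF. The image size and `N_rand` value are illustrative assumptions, not values taken from the repo:

```python
import torch

# Vanilla NeRF batching: N_rand pixel coordinates are drawn uniformly
# at random over the whole image. The resulting batch is a scattered
# set of pixels with no spatial structure, which is fine for a
# per-pixel MSE loss but not meaningful as an input image for CLIP.
H, W, N_rand = 378, 504, 1024
coords = torch.stack(
    torch.meshgrid(torch.arange(H), torch.arange(W), indexing='ij'),
    dim=-1,
).reshape(-1, 2)
select = torch.randperm(coords.shape[0])[:N_rand]
select_coords = coords[select]  # (N_rand, 2) scattered pixel locations
```

This is why the answer above points to patch-based selection for the stylization stage: CLIP needs a spatially coherent patch, not a bag of unrelated pixels.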