Open mohiiieldin opened 6 months ago
If you just duplicate the image, there is no parallax (motion between frames) for obstruction removal, so the method will not work. You will need to sample from a video with some hand or scene motion in it.
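A quick way to catch the duplicated-image case before training is to check whether any input frames are exact copies of each other. The sketch below is not part of the repository; `find_duplicate_frames` is a hypothetical helper that assumes the frames are stored as individual image files. It only detects byte-for-byte copies, so frames that are merely very similar will not be flagged.

```python
import hashlib
from pathlib import Path


def find_duplicate_frames(frame_paths):
    """Group frame files that are byte-for-byte identical.

    Returns a list of groups (lists of paths), one per set of identical files.
    Hypothetical helper: it compares file contents, not decoded pixels.
    """
    groups = {}
    for path in frame_paths:
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        groups.setdefault(digest, []).append(Path(path))
    # Keep only the digests shared by more than one file.
    return [paths for paths in groups.values() if len(paths) > 1]
```

If this returns a non-empty list, those frames carry no motion relative to each other and add no parallax to the capture.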
@Ilya-Muromets Thanks for your reply.
If I sample frames with OpenCV from a normal video, not one taken with your Android app, will it work?
Also, when sampling, how many frames do you recommend, and what interval between sampled frames?
Yeah that should work; in general the approach is tuned towards handheld burst photography settings, where you have ~1cm of camera motion. The preview videos on our website (https://light.princeton.edu/publication/nsf/) should give you a feel for how much motion there is in our captures.
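For reference, sampling frames from an ordinary video with OpenCV could look roughly like the sketch below. The function names, the output file naming, and the default values are assumptions made for illustration; the thread does not prescribe a frame count or interval, and how the resulting images are fed to the training scripts is not shown here.

```python
from pathlib import Path


def select_frame_indices(count, step=1, start=0):
    """Return `count` frame indices spaced `step` apart, beginning at `start`."""
    if count <= 0 or step <= 0 or start < 0:
        raise ValueError("count and step must be positive, start non-negative")
    return list(range(start, start + count * step, step))


def sample_frames(video_path, out_dir, count=30, step=1, start=0):
    """Write the selected frames of a video to `out_dir` as PNG files.

    Returns the list of written paths, which is shorter than `count`
    if the video ends before all requested frames are reached.
    """
    import cv2  # imported here so the index helper works without OpenCV

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    cap = cv2.VideoCapture(str(video_path))
    if not cap.isOpened():
        raise IOError(f"could not open {video_path}")

    wanted = set(select_frame_indices(count, step, start))
    written = []
    index = 0
    try:
        # Decode sequentially; stop once every wanted frame has been saved.
        while wanted:
            ok, frame = cap.read()
            if not ok:
                break  # end of video or decode failure
            if index in wanted:
                path = out_dir / f"frame_{index:04d}.png"
                cv2.imwrite(str(path), frame)
                written.append(path)
                wanted.discard(index)
            index += 1
    finally:
        cap.release()
    return written
```

For example, `sample_frames("vid1.mp4", "frames", count=30)` would save the first 30 consecutive frames, while `step=6` would take every sixth frame. Consecutive frames keep the frame-to-frame motion small, which is closer to the handheld burst setting described above.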
Okay got it, will try and see.
Thanks for the reply
Did you open-source the dataset that you used for training?
You can download scenes from the links in the README. There's no "training data" per se, since the network is trained from scratch for every scene (similar to a NeRF).
Thanks 🙏
I tested using the following video that I shot using my mobile camera:
and these are the frames sampled from the video using OpenCV:
Now I am starting to see results, but they are not as good as in the demo:
Do you have any thoughts about what can be improved?
I also tried to fit the model on 5 images using:
but got a white image, the same as with normal images. Isn't the burst feature in the mobile camera the one you meant?
Nice, vid1.mp4 looks closer to the intended data. The motion there is still very large, however. You can try just recording natural hand motion (i.e. try to keep your hand still while recording the video).
Tried with another video with less motion in it (just natural hand motion):
but the output was pretty much the same as the input:
(the output is the image on the right)
Interesting, try passing in the first 30 frames of that? I think this kind of occluder should work pretty well, but there's definitely room for improvement
I was passing 5 frames with systematic sampling. I will try 30 frames now, but should I use occlusion.sh or occlusion-wild.sh?
Can try both! Try first 30 frames, maybe will also need to set the size of the occlusion alpha to "large". You can find that in the config
I tried both with 30 frames and also tried setting the size of the occlusion alpha to "large". The final result is better, but the fence is still visible.
Should I play with the focus, for example focusing behind the fence rather than on it, or is this not relevant?
Focus should be fine, the z-motion (camera moving towards the occluder) might be a bit difficult, but shouldn't be unreasonable to estimate.
I'm working towards a SIGGRAPH Asia submission so unfortunately can't really help debug too much right now, but am happy to chat about the method more at a later time. In general I encourage playing with the settings (e.g., small/med/large encodings for stuff, how many camera control points there are) and maybe try passing it even less motion (e.g., just the last 10 frames)
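When experimenting with how much motion to pass in (the first 30 frames, only the last 10, and so on), it can help to pick subsets from frames that have already been extracted. The sketch below is a hypothetical helper, not something from the repository. It assumes the frames are image files in one directory with zero-padded names, so that sorting by name matches capture order.

```python
from pathlib import Path


def pick_frames(frame_dir, count, from_end=False, pattern="*.png"):
    """Return `count` frame paths from the start (or end) of a sorted listing.

    Assumes zero-padded file names (e.g. frame_0007.png) so that
    lexicographic order equals capture order.
    """
    if count <= 0:
        raise ValueError("count must be positive")
    frames = sorted(Path(frame_dir).glob(pattern))
    return frames[-count:] if from_end else frames[:count]
```

With this, `pick_frames("frames", 30)` gives the first 30 frames and `pick_frames("frames", 10, from_end=True)` gives the last 10.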
@Ilya-Muromets Thanks for your help.
When you are available for more debugging just ping me.
I tried running section 3 on this image:
and duplicated it 5 times to match the number of examples in the tutorial. I trained for 50 epochs and this is the final output:
All the pixels equal 1 in transmission, reference, and obstruction: