TianyuanYang opened this issue 1 year ago (status: Open)
You can set the bbox by changing it in the config file you're using, but the best values depend on the data. If you have an unbounded scene where you are using scene contraction (e.g. our Phototourism configs), then the bbox should be -2 to 2 in each dimension. If you have a bounded scene or forward-facing scene then there is some manual tuning to choose a bounding box; making it tighter will increase effective resolution, but too tight and you'll cut something off. One way you can do the tuning is to train for a bit and then visualize the planes, and then you can tell if you are using the dimensions effectively and make any adjustments. Note also that even for a scene with contraction, you'll get the best performance if you make sure the scene is centered and scaled so that it uses most of the planes; you can tune these with "global_scale" and "global_translation" parameters in the config.
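As a minimal sketch of the config entries mentioned above: the key names `scene_bbox`, `global_scale`, and `global_translation` come from this thread, while the nested `[min_xyz, max_xyz]` layout, the placeholder values, and the helper `bbox_extent` are assumptions for illustration, so check them against the config file you are actually using.

```python
# Hypothetical config fragment. Key names are from the discussion; the
# [min_xyz, max_xyz] layout and all numeric values are assumed placeholders.
config = {
    # Contracted (unbounded) scene: -2 to 2 in each dimension.
    "scene_bbox": [[-2.0, -2.0, -2.0], [2.0, 2.0, 2.0]],
    # Per-axis values used to center the scene and make it fill the planes.
    "global_scale": [1.0, 1.0, 1.0],
    "global_translation": [0.0, 0.0, 0.0],
}


def bbox_extent(scene_bbox):
    """Return the side length of the box along x, y, z (hypothetical helper)."""
    lo, hi = scene_bbox
    if any(h <= l for l, h in zip(lo, hi)):
        raise ValueError("each max coordinate must exceed its min coordinate")
    return [h - l for l, h in zip(lo, hi)]


print(bbox_extent(config["scene_bbox"]))
```

A quick check like this catches a swapped min/max before training starts; it says nothing about whether the box fits the scene, which still needs the visual tuning described above.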
Hi, how did you visualize the planes? I'm thinking of how TensoRF visualized its planes. What approach did you use?
We did plane visualization by averaging over the feature dimension, and plotting as a grayscale image. This is sufficient for figuring out the right dimensions for the bbox (and how to allocate resolution among the planes).
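The averaging step described above can be sketched roughly as follows. This assumes a single plane is available as an array with layout (features, height, width); the function name `plane_to_grayscale` and that layout are assumptions for illustration, not the repository's own code.

```python
import numpy as np


def plane_to_grayscale(plane):
    """Average a (features, height, width) plane over features, rescale to [0, 1]."""
    plane = np.asarray(plane, dtype=np.float64)
    if plane.ndim != 3:
        raise ValueError("expected an array of shape (features, height, width)")
    img = plane.mean(axis=0)  # collapse the feature dimension
    lo, hi = img.min(), img.max()
    if hi == lo:
        # Constant plane: nothing to rescale, return a black image.
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)


# Toy example: 2 features on a 2x2 grid.
toy = np.array([[[0.0, 2.0], [4.0, 6.0]],
                [[2.0, 4.0], [6.0, 8.0]]])
img = plane_to_grayscale(toy)
print(img.shape)
```

The resulting 2D array can then be shown as a grayscale image, for example with matplotlib's `imshow(img, cmap="gray")`.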
Thanks for the answers! But I'm quite confused by what you mean by "how to allocate resolution among the planes". Can you explain what you meant?
Normally you should see recognizable parts of your scene on the planes. If a large part of the plotted planes (say the left half) appears to be noise instead of parts of the scene, then you can reduce the bbox in the relevant dimension. So you can start with a large bbox, plot the planes, and note which parts are just noise; that tells you how much to reduce the bbox size.
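To make the shrinking step concrete, here is a small sketch that turns "the left half is noise" into new bbox limits for one axis. It assumes the plotted axis maps linearly onto the bbox range (so no scene contraction) and that the left edge of the plot corresponds to the minimum coordinate; the function `shrink_range` and both assumptions are illustrative only.

```python
def shrink_range(lo, hi, keep_start, keep_end):
    """Shrink [lo, hi] to the part of the plotted axis that shows the scene.

    keep_start and keep_end are fractions in [0, 1] along the plotted axis,
    read off the visualization by eye.
    """
    if not 0.0 <= keep_start < keep_end <= 1.0:
        raise ValueError("need 0 <= keep_start < keep_end <= 1")
    size = hi - lo
    return lo + keep_start * size, lo + keep_end * size


# Left half of the plane is noise, so keep only the right half of the range.
new_lo, new_hi = shrink_range(-2.0, 2.0, 0.5, 1.0)
print(new_lo, new_hi)
```

In practice it is probably safer to leave a small margin around the visible content, since a box that is too tight will cut part of the scene off.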
Right, so you can adjust the bbox based on which parts of the planes are used, and you can separately adjust the resolution in each dimension to match the desired aspect ratio. For example, the Trevi fountain scene is wider than it is tall, so you can adjust the x resolution to be higher than the y resolution.
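One simple way to pick per-axis resolutions that follow the aspect ratio is to give the longest axis a chosen maximum resolution and scale the others proportionally. This is only a sketch of that idea; `allocate_resolution`, the extents, and the budget of 512 are made-up values for illustration, not settings from the provided configs.

```python
def allocate_resolution(extent, max_res):
    """Give the longest axis max_res cells and scale the other axes proportionally."""
    longest = max(extent)
    if longest <= 0:
        raise ValueError("extents must be positive")
    return [max(1, round(max_res * e / longest)) for e in extent]


# A scene that is twice as wide (x) as it is tall (y), and shallow in z.
res = allocate_resolution([4.0, 2.0, 1.0], 512)
print(res)
```

Here `extent` means the size of the visible scene content along each axis, which for a contracted scene is not the same as the bbox size, since the bbox stays at -2 to 2 in every dimension.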
I am facing a similar problem with my own data, but I am not sure whether the scene_bbox is the problem. Could scaling problems also result from x,y,z coordinates that are not centered or not properly normalized? Is there an easy way to check for this?
Yes, it does help if the scene is centered around the origin in the world coordinates (corresponding to the center of the planes). For example, for the Phototourism dataset we manually found a global translation and scaling in x, y, z for each scene (e.g. https://github.com/sarafridov/K-Planes/blob/main/plenoxels/configs/final/Phototourism/trevi_explicit.py#L11) based on visualizing the planes and adjusting until the scene is centered and roughly fills the planes.
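As a rough numerical complement to the visual check, and not the procedure described above (which relied on visualizing the planes), one could look at where a set of 3D points from the scene sits relative to the origin. The sketch assumes you already have such points, for example camera positions or sparse reconstruction points; the function name `center_and_half_extent` is hypothetical.

```python
def center_and_half_extent(points):
    """Return the midpoint and half-size of the axis-aligned box around the points."""
    if not points:
        raise ValueError("need at least one point")
    axes = list(zip(*points))  # transpose to per-axis coordinate tuples
    lo = [min(a) for a in axes]
    hi = [max(a) for a in axes]
    center = [(l + h) / 2 for l, h in zip(lo, hi)]
    half_extent = [(h - l) / 2 for l, h in zip(lo, hi)]
    return center, half_extent


center, half_extent = center_and_half_extent([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])
print(center, half_extent)
```

A center far from the origin, or half-extents that differ a lot between axes, suggests the scene is off-center or unevenly scaled. How exactly `global_translation` and `global_scale` are applied should be checked in the code before converting these numbers into config values.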
Thank you for your excellent work! I have encountered a problem when applying the code to custom datasets. How can I set the value of "scene_bbox" on a custom dataset? And how are the values of "scene_bbox" in the provided configuration file determined? Looking forward to your answer.