apple / ml-hypersim

Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding

Evermotion Scenes Compatibility & Camera Placement. #41

Closed arjunsinghrathore closed 2 years ago

arjunsinghrathore commented 2 years ago

Hi Team, hope that you are doing well! I have two queries, which I'll list below:

  1. Are .fbx and .obj files compatible with the V-Ray scene format and the Blender Python API? Can we make them work somehow?

  2. How are the bounds for camera placement in a scene decided, so that the camera does not wander outside the indoor scene boundary? Is there an efficient and clean way of finding the proper boundaries of the Evermotion indoor scenes, so that maximum diversity can be achieved when generating the dataset on my own?

Any kind of help on these questions would be really appreciated!

mikeroberts3000 commented 2 years ago

Hi Arjun, great questions!

  1. I'm not sure I totally understand your question. This one is phrased slightly differently from the one you sent me over email. I'm assuming you're asking whether the Hypersim scenes can somehow be exported in a way that is compatible with Blender. I haven't had much luck exporting the Hypersim scenes in a way that can be rendered correctly by other renderers. I'm not saying it's impossible, but I tried several different approaches for a few weeks in the early stages of Hypersim development and didn't have much luck.

     It is easy to export each scene as an OBJ file, but this only represents the scene geometry, and doesn't contain sufficient material and lighting information to perform photorealistic rendering. It is also easy to export the scene as a VRSCENE file, which is basically a giant text file (in a format designed by V-Ray) that contains all the information needed for rendering. V-Ray provides a convenient Python API for parsing VRSCENE files, so in principle you could write your own converter to extract all the scene information you need and convert it into the desired Blender format (see the first sketch after this list). But there are a gazillion configuration options, material types, light types, etc. in a typical VRSCENE file, so it seems like it would take forever to write such a converter.

  2. We have an imperfect way of preventing our random-walk camera trajectories from wandering outside the intended viewing region of each scene. We basically tuned some heuristics by hand that encourage diverse salient views while discouraging our cameras from wandering off (a toy illustration of this kind of bounds constraint is in the second sketch after this list). Our heuristics work well most of the time, and when they fail badly, we manually remove those camera trajectories from our dataset. We discuss this topic in our paper and in the supplementary material.

     When designing our heuristics, we were limited in what we could do because we wanted to support rendering and semantically labeling the scenes in parallel. In other words, we wanted to generate camera trajectories without relying on semantic segmentation information. But now that the scenes have been labeled, you could of course select better views if you somehow used the semantic information. See the related work section of our paper for a discussion of alternative view selection algorithms that leverage semantic information in different ways.
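Regarding point 1, here is a minimal sketch of what the first step of such a converter might look like, assuming the V-Ray AppSDK Python bindings (`import vray`). The specific plugin classes queried (`Node`, `MtlSingleBRDF`, `LightRectangle`) and the `scene.vrscene` path are illustrative assumptions, and the exact API surface may vary across AppSDK versions:

```python
# Minimal sketch, not a working converter: enumerate plugin instances in an
# exported VRSCENE file to get a sense of how many plugin types a
# VRSCENE-to-Blender converter would need to handle. The file path and the
# specific plugin classes below are illustrative assumptions.
import vray

renderer = vray.VRayRenderer()
renderer.load("scene.vrscene")  # hypothetical path to an exported scene

# Every piece of geometry, every material, and every light in the file is a
# plugin instance that a converter would have to translate individually.
geometry_nodes = renderer.classes.Node.getInstances()
materials      = renderer.classes.MtlSingleBRDF.getInstances()
rect_lights    = renderer.classes.LightRectangle.getInstances()

print("geometry nodes:", len(geometry_nodes))
print("single-BRDF materials:", len(materials))
print("rectangle lights:", len(rect_lights))

renderer.close()
```

A full converter would then have to walk each of these plugin types (plus textures, BRDFs, render settings, and so on) and map them onto Blender equivalents, which is exactly the combinatorial explosion described above.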
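For point 2, the sketch below illustrates one crude bounds constraint: a random-walk camera confined to a slightly shrunken axis-aligned bounding box computed from exported OBJ geometry. This is not the heuristic described in the Hypersim paper; `scene.obj`, the margin, and the step size are placeholder values:

```python
# Toy sketch: keep a random-walk camera inside a shrunken axis-aligned
# bounding box of the scene geometry. This is NOT the Hypersim heuristic
# (see the paper and supplementary material); it only illustrates one simple
# way to keep a camera from leaving the scene.
import numpy as np
import trimesh

mesh = trimesh.load("scene.obj", force="mesh")  # hypothetical exported OBJ
bounds_min, bounds_max = mesh.bounds            # axis-aligned bounding box

margin   = 0.1 * (bounds_max - bounds_min)      # shrink the box so the camera
walk_min = bounds_min + margin                  # stays away from the walls
walk_max = bounds_max - margin

rng = np.random.default_rng(0)
position = 0.5 * (walk_min + walk_max)          # start at the box center
trajectory = [position.copy()]

for _ in range(100):
    step = rng.normal(scale=0.05 * (walk_max - walk_min))
    proposal = position + step
    # Reject (rather than clamp) steps that would leave the viewing region,
    # so the walk does not pile up against the boundary.
    if np.all(proposal >= walk_min) and np.all(proposal <= walk_max):
        position = proposal
    trajectory.append(position.copy())

trajectory = np.array(trajectory)
print("trajectory shape:", trajectory.shape)
```

A better check could use the scenes' semantic labels or occupancy information to reject views that look at empty space or clip through furniture, along the lines of the view selection approaches mentioned above.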