dicarlolab / ThreeDWorld

Generator of interactive Unity-based 3D environments with physics

Actually generate scenes and dataset #47

Open honeybunches-of-oates opened 7 years ago

yamins81 commented 7 years ago

@pbashivan could you provide a description of where we're at on this issue? I think a big problem here is that the size control has not really been dealt with yet (see issue #56).

pbashivan commented 7 years ago

I wrote a script running a loop to (a rough sketch follows the list):

  1. Generate a random scene.
  2. Put the avatar at random locations with random viewing angles and record the image, normals, segmentation, and sceneInfo into an HDF5 file.
  3. Check the size of each object in the scene, and remove the segmentation for any object that is too small.
  4. Discard the sample if no object is present.
  5. Create a new random scene every N steps.
yamins81 commented 7 years ago

Great, where is this script?

pbashivan commented 7 years ago

On my mac. Do you want me to push it to the repo? Another note: object labels are read from MongoDB (['synthetic_generative']['3d_models']). Basically, for each scene I create a table of all object IDs and labels, and throughout the script I look up labels from that table.
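A minimal sketch of that lookup, assuming the standard pymongo client; the document field names (`id`, `label`) are assumptions about how the 3d_models collection is structured.

```python
from pymongo import MongoClient

def build_label_table(scene_object_ids, host="localhost", port=27017):
    """Map object IDs appearing in the current scene to human-readable labels."""
    coll = MongoClient(host, port)["synthetic_generative"]["3d_models"]
    return {doc["id"]: doc["label"]
            for doc in coll.find({"id": {"$in": list(scene_object_ids)}})}

# later in the generation loop:
# labels = build_label_table(ids_in_scene)
# label = labels.get(object_id, "unknown")
```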

yamins81 commented 7 years ago

Yes, you should push your code at the end of every day, even if it is to a non-master branch.

Sounds fine about reading from MongoDB.

pbashivan commented 7 years ago

Where should I put the image_generation code? Ideally it should be part of Client_tools, but I don't think I can put it there. Or can I?

yamins81 commented 7 years ago

This will involve:

(1) figuring out how often to reset the scene
(2) which objects are used in each scene, and how to figure that out in a programmatic way
(3) knowing which of the 5300 or so objects actually work, and testing to make sure they look good
(4) solving any issues with size
(5) solving any issues with stacking and placement, and seeing that the scenes don't look like a heap of garbage
(6) deciding how many images we want and seeing how fast the actual generation process is
(7) then actually generating, possibly in parallel
(8) then assembling the HDF5

yamins81 commented 7 years ago

The goal in steps 2-3 above is to make sure that the scenes are different from each other -- specifically, that the collection of objects and the spatial interrelationships between them are random from scene to scene.

yamins81 commented 7 years ago

Sorry, there's also:

(6a) We need to figure out how we want the camera to move around in each scene. Should it move randomly? Should the avatar "follow" certain objects and get "good shots" of them?

pbashivan commented 7 years ago

Some of the objects, like guns, occupy very little area of the screen (counting the number of pixels) even when they are very close to the agent, so when the images are screened for object presence they are rejected most of the time. In addition, I'm assigning scales according to the category of objects, for instance higher scales for airplanes and trains. However, doing this results in most of the approved images containing only these objects.

We could assign higher scale values to thinner objects (like guitars and chairs) and lower ones to flatter objects (e.g. trains and buses).
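A sketch of what such a per-category scale table might look like; the category names and numbers are illustrative placeholders, not values actually used in the script.

```python
CATEGORY_SCALES = {
    "guitar":   2.0,  # thin objects boosted so they survive the size screen
    "chair":    2.0,
    "gun":      2.5,
    "airplane": 1.0,  # large, flat objects toned down
    "train":    1.0,
    "bus":      1.0,
}

def scale_for(category, default=1.5):
    return CATEGORY_SCALES.get(category, default)
```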

qbilius commented 7 years ago

We decided to compute bounding boxes of the objects and use the bounding box area to decide if the object is large enough.

Additionally, we're going to restrict camera motion to be mostly parallel to the ground, without tilting too much, so that the generated images give a good view of objects rather than the ground or walls (though the latter might be too much of a restriction).
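A minimal sketch of both checks, assuming the segmentation is an integer id mask; the area threshold and pitch limit are assumed values, not settings from the repo.

```python
import numpy as np

MIN_BBOX_AREA = 32 * 32   # assumed minimum bounding-box area, in pixels
MAX_PITCH_DEG = 15.0      # keep the camera roughly parallel to the ground

def bbox_area(mask):
    """Area of the tight bounding box around a boolean object mask."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return 0
    return (xs.max() - xs.min() + 1) * (ys.max() - ys.min() + 1)

def object_large_enough(seg, obj_id):
    return bbox_area(seg == obj_id) >= MIN_BBOX_AREA

def sample_camera_angles():
    """Yaw is unconstrained; pitch is clamped so the view is mostly of objects
    rather than floor or ceiling."""
    yaw = np.random.uniform(0.0, 360.0)
    pitch = np.random.uniform(-MAX_PITCH_DEG, MAX_PITCH_DEG)
    return yaw, pitch
```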

yamins81 commented 7 years ago

1) @pbashivan is worried about sizes -- let's specify sizes via synsets, which can sit at whatever desired level of the WordNet hierarchy, so that sizes are specified per category.

2) Maybe for now we should set sizes randomly per object (not per category) on each scene to get variability (a rough sketch of 1-2 follows this list).

3) Eventually we do need to address the issue of getting canonical sizes (and maybe angles) for each object via MTurk, but let's ignore that for now.

4) Maybe @pbashivan wants the avatar to move toward objects (see the curiosity code) to get them better into view, and the whole trajectory of the agent during that process can be part of the dataset.
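A rough sketch of points 1) and 2), assuming NLTK's WordNet interface; the synsets, size ranges, and fallback values are illustrative assumptions, not settings from the repo.

```python
import random
from nltk.corpus import wordnet as wn

# Size ranges specified at a coarse level of the hierarchy (illustrative values).
SYNSET_SIZE_RANGES = {
    wn.synset("vehicle.n.01"):         (2.0, 4.0),
    wn.synset("furniture.n.01"):       (1.0, 2.0),
    wn.synset("instrumentality.n.03"): (0.5, 1.5),
}

def sample_size(obj_synset, default_range=(0.5, 2.0)):
    """Walk up the hypernym chain until a synset with a specified range is
    found, then sample a size uniformly from that range (per-object jitter
    gives scene-to-scene variability)."""
    candidates = [obj_synset] + list(obj_synset.closure(lambda s: s.hypernyms()))
    for s in candidates:
        if s in SYNSET_SIZE_RANGES:
            lo, hi = SYNSET_SIZE_RANGES[s]
            return random.uniform(lo, hi)
    return random.uniform(*default_range)

# e.g. sample_size(wn.synset("chair.n.01")) picks a fresh size for each chair
# placed in each scene.
```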