DLR-RM / BlenderProc

A procedural Blender pipeline for photorealistic training image generation

How to set output image size and camera intrinsic by myself? #67

Closed: greatwallet closed this issue 3 years ago

greatwallet commented 4 years ago

Hi, I am new to BlenderProc and find it hard to get it "under control". I want to use blenderproc as a synthetic object-pose dataset, but it's hard for me to find the exact arguments for image size and intrinsics. Is there a way to render a pose with simply:

  1. camera intrinsic
  2. object pose
  3. mesh model, e.g. *.obj or *.ply
  4. image size

An example is bop_renderer, where only these 4 arguments are needed and everything is easily "programmable".

themasterlink commented 4 years ago

Thanks for your question; I will first clarify some things and then answer to the best of my knowledge.

I want to use blenderproc as a synthetic object-pose dataset

Be aware that BlenderProc is not a dataset, but can help you create your own dataset.

Is there a way to render a pose with simply:

Yes, this is possible.

  1. camera intrinsic

Check the documentation for the CameraInterface, there you can specify the K-Matrix and other things: https://github.com/DLR-RM/BlenderProc/blob/787f1873d5dd786305728c9973849c6c5f50e083/src/camera/CameraInterface.py#L58
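
For orientation, the K-Matrix there is the usual 3x3 pinhole intrinsics matrix, passed flattened in row-major order; a minimal numpy sketch using the values that come up later in this thread:

```python
import numpy as np

# Pinhole intrinsics (focal lengths fx, fy in px, principal point cx, cy)
fx, fy, cx, cy = 2100., 2100., 540., 960.
K = np.array([
    [fx, 0., cx],
    [0., fy, cy],
    [0., 0., 1.]
])
# Flattened row-major, as the "cam_K" config entry in this thread expects
cam_K = K.flatten().tolist()  # [2100.0, 0.0, 540.0, 0.0, 2100.0, 960.0, 0.0, 0.0, 1.0]
```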

  2. object pose

There are several ways of setting the object pose, you can either set it via an EntityManipulator see: https://github.com/DLR-RM/BlenderProc/blob/787f1873d5dd786305728c9973849c6c5f50e083/examples/entity_manipulation/config.yaml#L60-L74

If you replace the provider for the key location there with a fixed list, e.g. [1, 2, 3], then this position will be used.

  3. mesh model for .obj and/or .ply

The ObjectLoader can load both of these model formats.

https://github.com/DLR-RM/BlenderProc/blob/787f1873d5dd786305728c9973849c6c5f50e083/examples/basic/config.yaml#L20-L23

  4. Image Size

The resolution can be set via the global config values:

https://github.com/DLR-RM/BlenderProc/blob/787f1873d5dd786305728c9973849c6c5f50e083/examples/stereo_matching/config.yaml#L17-L24

An example is bop_renderer, where only these 4 arguments are needed and everything is easily "programmable".

You can decide for yourself which arguments you want to expose on the command line, whenever you use a placeholder like the one at this line:

https://github.com/DLR-RM/BlenderProc/blob/787f1873d5dd786305728c9973849c6c5f50e083/examples/stereo_matching/config.yaml#L29

The argument will be substituted at that position in the config file; the number specifies its position among the command-line arguments of the program call.

Does this answer all of your questions?

PS: I would advise starting with our basic example and then working your way up to more complex examples, so that you get a full understanding of our module structure and how the modules affect the rendering of a scene.

greatwallet commented 4 years ago

Thanks for your patient explanation!

greatwallet commented 4 years ago

Hi, I went through the tutorial and got down to business. However, I still ran into some trouble.

What I want to do

Render the object with the following inputs, in the style of the bop_renderer sample:

```python
import numpy as np

# Path to a 3D object model (in PLY format).
model_path = './obj_000001.ply'
obj_id = 1

# Path to output RGB and depth images.
out_rgb_path = 'out_rgb.png'
out_depth_path = 'out_depth.png'

# Object pose and camera parameters.
Rt = np.loadtxt("frame0001.txt")
R = Rt[:3, :3].flatten().tolist()
t = Rt[:3, -1].flatten().tolist()
fx, fy, cx, cy = 2100., 2100., 540., 960.
im_size = (1080, 1920)
```

where `"frame0001.txt"`'s content is:
```txt
0.99953197 -0.030522982 0.0020441833 463.24894
-0.02697507 -0.8478881 0.52948853 -86.584825
-0.01442833 -0.52929586 -0.84831463 2630.9245
```

This should be the homogeneous transformation from object to camera coordinates.

How I encountered the problem

The rendering result is shown in the attached out_rgb image.

Expected suggestions

  1. How should I change the config to get the expected output, given the homogeneous transformation from object to camera coordinates?
  2. Would you mind explaining what mistake I have made in the process above?

MartinSmeyer commented 4 years ago

Hi @greatwallet,

several things:

1.) There could be an issue because you provide the location in mm and not in m. Is the scale of your model in mm as well?

2.) The BOPLoader can load single objects of a BOP dataset and takes care of the mm2m conversion (see this example), so you might want to use that.

3.) You are right; internally we just added the option to load homogeneous matrices in the CameraLoader. We will release it very soon. We will also add an option for setting 4x4 matrices for objects directly in the config. Anyway, if you want to debug, you can check your conversion to euler/location by printing:

```python
CameraLoader._cam2world_matrix_from_cam_extrinsics(config)
```

4.) The EntityManipulator should be able to set the 4x4 pose of the object/camera directly. The attribute that you want to change is called matrix_world:

```yaml
{
  "module": "manipulators.EntityManipulator",
  "config": {
    "selector": {
      "provider": "getter.Entity",
      "conditions": {
        "name": "obj_000001",
        "type": "MESH"  # this guarantees that the object is a mesh, and not for example a camera
      }
    },
    "matrix_world": [[0.99953197, -0.030522982, 0.0020441833, 463.24894],
                     [-0.02697507, -0.8478881, 0.52948853, -86.584825],
                     [-0.01442833, -0.52929586, -0.84831463, 2630.9245],
                     [0., 0., 0., 1.]],
    "cp_physics": True
  }
},
```

Possibly you need to use the inverse and/or convert to m. You can also use the getter.Entity to fetch the camera and do the same.
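
If it helps, a minimal numpy sketch of those two candidate fixes (file name and values taken from the snippets above; which fix applies depends on your units and conventions):

```python
import numpy as np

# 3x4 object-to-camera matrix from frame0001.txt, padded to 4x4
Rt = np.loadtxt("frame0001.txt")
T_obj2cam = np.vstack([Rt, [0., 0., 0., 1.]])

# Candidate fix 1: convert the translation from millimeters to meters
T_obj2cam[:3, 3] *= 0.001

# Candidate fix 2: invert the pose, in case it is needed the other way around
T_cam2obj = np.linalg.inv(T_obj2cam)
print(T_cam2obj)
```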

DavidLSmyth commented 4 years ago

If you haven't already explored it: for debugging, it's useful not to run Blender in background mode but rather to use the UI.

To run the ShapeNet example in background mode:

```
blender -b --python ./src/run.py -- ./examples/shapenet/config.yaml
```

To run the ShapeNet example using the UI:

```
blender --python ./src/run.py -- ./examples/shapenet/config.yaml
```

Then you can visually inspect the scene setup using the UI.

themasterlink commented 4 years ago

@DavidLSmyth you could also use the src/debug.py script for such cases, which avoids restarting Blender for each run.

The only side effect is that some of the variables might not get removed without a restart.

greatwallet commented 4 years ago

Thank you for your advice @DavidLSmyth @MartinSmeyer; it is a good way to debug. However, I still have not managed to render the given object pose and model correctly.

  1. Regarding the object model's unit of measure: I checked the *.ply in a text editor and confirmed that the scale is in millimeters, consistent with the scale of the translation vector in the 4x4 homogeneous transformation matrix.
  2. When I used src/debug.py inside Blender, I found that the default scale is meters, and millimeters are hard to visualize (the object becomes huge, and the scrollable area of the world seems to be limited in Blender). So I rescaled obj_000001.ply to obj_000001_m.ply (the latter is 1000 times smaller than the former, i.e. converted to meters; see the sketch after this list). In the discussion below, the scale is meters by default.
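
For reference, one way to do such a rescaling (a sketch assuming the trimesh library; any tool that can scale and re-export a PLY works just as well):

```python
import trimesh

# Load the mm-scale model, scale all vertices by 0.001 (mm -> m), re-export
mesh = trimesh.load('obj_000001.ply')
mesh.apply_scale(0.001)
mesh.export('obj_000001_m.ply')
```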

When I used the following config:

```yaml
# for the camera
{
  "module": "camera.CameraLoader",
  "config": {
    "path": "/home/cxt/workspace/scripts/BlenderProc/kwai/cfg/camera_origin.txt",
    # note: the content of camera_origin.txt is
    # 0 0 0 0 0 0
    "file_format": "location rotation/value",
    "default_cam_param": {
      "cam_K": [2100., 0, 540., 0, 2100., 960., 0, 0, 1],
      "resolution_x": 1080,
      "resolution_y": 1920
    }
  }
},
# for the object
{
  "module": "manipulators.EntityManipulator",
  "config": {
    "selector": {
      "provider": "getter.Entity",
      "conditions": {
        "name": "obj_000001_m",
        "type": "MESH"  # this guarantees that the object is a mesh, and not for example a camera
      }
    },
    "location": [0.4632489400, 0.0865848250, 2.6309245000],
    "rotation_euler": [-2.5833025, 0.0401909, 0.0086565],
    "cp_physics": True
  }
},
```

but the scene looks like the attached modify_obj screenshot.

It seems that the camera defined in Blender is flipped relative to the common one: the z axis points downwards, contrary to the definition of a normal pinhole camera.

Then I simply made the object's z-value negative and tried to render in that setup (see the attached modify_obj_downwards screenshot). Nonetheless, the object is not in the camera's FOV, so still nothing gets rendered.

So I am curious how the camera model is defined in Blender. The same parameters give a well-rendered image under bop_renderer, which treats the camera as a simple pinhole camera. How come Blender fails to render this?
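
One way to inspect this from Blender's Python console (a minimal sketch using Blender's bpy/mathutils API; Blender cameras look along their local -Z axis):

```python
import bpy
from mathutils import Vector

# The scene's active camera
cam = bpy.context.scene.camera
# Blender cameras look along their local -Z axis, with +Y as image-up
view_dir = cam.matrix_world.to_quaternion() @ Vector((0.0, 0.0, -1.0))
print("camera view direction in world coordinates:", view_dir)
```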

P.S. To rule out trivial bugs in the values of the homogeneous transformation, I have tried the inverse of the matrix (and also 1000x and 0.001x scalings of the translation), but a similar problem occurred.
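
One more quick sanity check for such value bugs (a minimal numpy sketch): the rotation block of a valid pose must be orthonormal with determinant +1:

```python
import numpy as np

Rt = np.loadtxt("frame0001.txt")
R = Rt[:3, :3]
# A proper rotation matrix satisfies R @ R.T = I and det(R) = +1
print(np.allclose(R @ R.T, np.eye(3), atol=1e-6))    # expect True
print(np.isclose(np.linalg.det(R), 1.0, atol=1e-6))  # expect True
```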

greatwallet commented 4 years ago

Here is the model file: obj_000001_m.ply.txt. Removing the .txt suffix will make it work.

MartinSmeyer commented 4 years ago

Thanks, we will soon provide an example that covers a similar use case, since there seems to be some confusion.

greatwallet commented 4 years ago

Thanks, we will soon provide an example that covers a similar use case, since there seems to be some confusion.

Thank you for your reply. I think I have some clues now and am approaching the answer. The location and rotation values in Blender are not the homogeneous transformation from camera to world; they are "object" local coordinates. There is a conversion, following rfabbri's answer at "https://blender.stackexchange.com/questions/38009/3x4-camera-matrix-from-blender-camera"

So there is a conversion (note: `@` means matrix multiplication in Python 3):

```python
import numpy as np
from scipy.spatial.transform import Rotation

def mat2euler(mat):
    # convert a 3x3 rotation matrix to XYZ Euler angles for Blender
    return Rotation.from_matrix(mat).as_euler('xyz')

# Pre-defined rotation matrix FROM the Blender-defined camera TO the
# commonly-defined camera, whose coordinates are identical to the world's.
# Note: the y-axis and z-axis are flipped.
R_cam2BlenderCam = np.asarray([
    [1., 0., 0.],
    [0., -1., 0.],
    [0., 0., -1.]
])

# R_world2cam / t_world2cam are the given extrinsics;
# cam.location / cam.rotation are the values fed into the CameraLoader config
rot_mat = (R_cam2BlenderCam @ R_world2cam).T
cam.rotation = mat2euler(rot_mat)
cam.location = -rot_mat @ R_cam2BlenderCam @ t_world2cam
```

After doing this and resetting the object to the origin, I found the rendering result more reasonable, but still far from the ground truth. (The object's rotation seems to make sense, but there is still a large offset in translation.) I think perhaps there is some special convention behind `object.rotation` and `object.location` in BlenderProc too, just like with `cam.location` and `cam.rotation`?

MartinSmeyer commented 4 years ago

I think you are right. With the cameras we do this transformation: https://github.com/DLR-RM/BlenderProc/blob/81e1dd53d63b20c5cbef8bb6946ff359a2788911/src/camera/CameraInterface.py#L196
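
Conceptually, that line applies the same y/z flip as the R_cam2BlenderCam matrix above; a hedged numpy sketch of the idea (not the exact source code):

```python
import numpy as np

# 180-degree rotation around the camera x-axis: flips y and z, mapping the
# common (pinhole/OpenCV-style) camera frame onto Blender's camera frame
R_flip = np.diag([1., -1., -1.])

def cam2world_to_blender(cam2world):
    # illustrative helper (not a BlenderProc API): flip the rotation part
    # of a 4x4 cam2world matrix into Blender's camera convention
    out = np.array(cam2world, dtype=float)
    out[:3, :3] = out[:3, :3] @ R_flip
    return out
```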

I was always using object.matrix to set object poses. With that, we have exactly replicated the real scenes of the BOP datasets (see this example). But there I set the camera and object poses in the BOPLoader. We will get back to you.

greatwallet commented 4 years ago

@MartinSmeyer Hi, did you get any clue yet?

Today I modified the bop_scene_replication example to adapt it to my scenario. However, the results are still NOT correct (see the attached screenshot from 2020-11-06).

Is it because the camera model defined in Blender is too complicated? bop_renderer treats the camera as a simple pinhole camera with perspective projection.

BTW, does it influence the result if the width is less than the height, as in my case (1080x1920)?

It's somewhat painful for me to organize my work and forward it to you, but if you're interested, I will find time to clean up the code and send it to you.

MartinSmeyer commented 4 years ago

Hi @greatwallet, we were just drowning in other deadlines. We have an internal PR on that topic going; it should be ready soon.

The camera model defined in Blender is indeed quite complicated, and if the height is larger than the width, it does have an influence. Are your results with horizontal images okay? I can guarantee you that the scene_replication example is correct for all BOP datasets, but they all use horizontal images.
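
For intuition: the field of view Blender derives from these intrinsics is tied to the image dimensions, so a portrait resolution changes the frustum relative to landscape. Standard pinhole math, as a sketch with the values from this thread:

```python
import math

fx, fy = 2100., 2100.
width, height = 1080, 1920

# Horizontal and vertical fields of view of the pinhole model
fov_x = 2 * math.atan(width / (2 * fx))   # ~28.9 degrees
fov_y = 2 * math.atan(height / (2 * fy))  # ~49.2 degrees
print(math.degrees(fov_x), math.degrees(fov_y))
```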

BTW, I know what bop_renderer does; I helped develop it.

greatwallet commented 4 years ago

Hi @greatwallet, we were just drowning in other deadlines. We have an internal PR on that topic going; it should be ready soon.

The camera model defined in Blender is indeed quite complicated, and if the height is larger than the width, it does have an influence. Are your results with horizontal images okay? I can guarantee you that the scene_replication example is correct for all BOP datasets, but they all use horizontal images.

BTW, I know what bop_renderer does; I helped develop it.

Yes, I tried with a horizontal image (640x480) and the result is correct.

themasterlink commented 3 years ago

This should be fixed by now; if not, please open a new issue.

MartinSmeyer commented 3 years ago

@greatwallet yes, sorry for the late reaction.

You should be able to render your vertical image now. We changed the structure of the config to make it easier to load objects, set their poses in homogeneous coordinates, and set camera intrinsics.

See the example here: basic_object_pose

kwea123 commented 3 years ago

@MartinSmeyer Is there an instruction on how to generate the right image (https://github.com/DLR-RM/BlenderProc/blob/master/examples/basic_object_pose/hb_val_3_0.png) in basic_object_pose?

themasterlink commented 3 years ago

@MartinSmeyer Is there an instruction on how to generate the right image (https://github.com/DLR-RM/BlenderProc/blob/master/examples/basic_object_pose/hb_val_3_0.png) in basic_object_pose?

Would you be so kind as to open a new issue for this, since it is a separate question? That will make it easier for others to find.