Hi, I am new to blenderproc and find it hard to make it "under control". I want to use blenderproc as a synthetic object-pose dataset, but it's hard for me to find the exact arguments for image size and intrinsics. Is there a way to render a pose with simply:
- camera intrinsic
- object pose
- mesh model for *.obj or *.ply
- Image Size
An example is bop_renderer, where only 4 arguments are needed and really easily "programmable".
Thanks for your question; I will first clarify some things and then answer to the best of my knowledge.
> I want to use blenderproc as a synthetic object-pose dataset
Be aware that BlenderProc is not a dataset, but can help you create your own dataset.
> Is there a way to render a pose with simply:
Yes, this is possible.
> - camera intrinsic
Check the documentation for the CameraInterface; there you can specify the K matrix and other things:
https://github.com/DLR-RM/BlenderProc/blob/787f1873d5dd786305728c9973849c6c5f50e083/src/camera/CameraInterface.py#L58
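For illustration, a minimal sketch of what such a K matrix looks like (the values here are just examples):

```python
import numpy as np

fx, fy = 2100., 2100.   # focal lengths in pixels
cx, cy = 540., 960.     # principal point in pixels
K = np.array([[fx, 0., cx],
              [0., fy, cy],
              [0., 0., 1.]])
# In the config, cam_K is this matrix flattened row-wise:
# "cam_K": [2100., 0, 540., 0, 2100., 960., 0, 0, 1]
```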
> - object pose
There are several ways of setting the object pose. You can set it via an EntityManipulator, see:
https://github.com/DLR-RM/BlenderProc/blob/787f1873d5dd786305728c9973849c6c5f50e083/examples/entity_manipulation/config.yaml#L60-L74
If you replace the provider for the key location there with a fixed list, e.g. [1, 2, 3], then that position will be used.
> - mesh model for .obj and/or .ply
The ObjectLoader can load all of these kinds of models.
> - Image Size
The resolution can be set via the global config values resolution_x and resolution_y.
> An example is bop_renderer, where only 4 arguments are needed and really easily "programmable".
You can decide for yourself which arguments you want to use: whenever you use a placeholder like <args:0> in the config, the argument will be placed at that position in the config file, and the number specifies its position in the program call.
Does this answer all of your questions?
PS: I would advise starting with our basic example and then working your way up to more complex examples, so that you get a full understanding of our module structure and how the modules affect the rendering of a scene.
Thanks for your patient explanation!
Hi, I went through the tutorial and got down to business. However, I still encountered some trouble here.
My goal is to render a synthetic object-pose dataset from mesh models (*.ply or *.obj), and the dataset should come with pose annotations (*.txt or BOP-like annotated json files) providing identity, rotation and translation information for each object in the frame.
Data: YCB-V dataset, obj_000001.ply (texture-less).
Actually, I used to use bop_renderer (https://github.com/thodan/bop_renderer) for synthetic dataset generation. I modified the script bop_renderer/samples/renderer_minimal.py as follows:
```python
# L11-L28
# PARAMETERS.
################################################################################
import numpy as np  # imported near the top of the original script

# Path to bop_renderer.
bop_renderer_path = '/home/cxt/workspace/scripts/bop_renderer/build'

model_path = './obj_000001.ply'
obj_id = 1

out_rgb_path = 'out_rgb.png'
out_depth_path = 'out_depth.png'

# Ground-truth pose: 3x4 [R|t] from object to camera coordinates.
Rt = np.loadtxt("frame0001.txt")
R = Rt[:3, :3].flatten().tolist()
t = Rt[:3, -1].flatten().tolist()
fx, fy, cx, cy = 2100., 2100., 540., 960.
im_size = (1080, 1920)
################################################################################
```
where the `"frame0001.txt"`'s content is:
```txt
0.99953197 -0.030522982 0.0020441833 463.24894
-0.02697507 -0.8478881 0.52948853 -86.584825
-0.01442833 -0.52929586 -0.84831463 2630.9245
```
which should be the homogeneous transformation from object to camera coordinates; the rendering result is shown in the image below:
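(As a quick sanity check of these values, a sketch that is not part of the original script: projecting the object origin with the pinhole model should land inside the 1080x1920 image.)

```python
import numpy as np

Rt = np.loadtxt("frame0001.txt")     # 3x4 [R|t], object -> camera, translation in mm
K = np.array([[2100., 0., 540.],
              [0., 2100., 960.],
              [0., 0., 1.]])

p_cam = Rt[:3, -1]                   # camera coordinates of the object origin
uv = (K @ p_cam)[:2] / p_cam[2]      # pinhole projection
print(uv)                            # should lie within [0, 1080) x [0, 1920)
```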
However, when I used blenderproc and changed the config file of the basic example to
...
{
"module": "camera.CameraLoader",
"config": {
"path": "<args:0>",
"file_format": "location rotation/value",
"default_cam_param": {
# "rotation/format": "look_at",
"fov": 1,
"interocular_distance": 0.05,
"stereo_convergence_mode": "PARALLEL",
"convergence_distance": 0.00001,
"cam_K": [2100., 0, 540., 0, 2100., 960., 0, 0 ,1],
"resolution_x": 1080,
"resolution_y": 1920,
}
}
},
...
where the "camera_positions
contains:
463.2489400 -86.5848250 2630.9245000 -2.5833025 0.0401909 0.0086565
which is transformed via scipy.spatial.transform.Rotation
but I got NOTHING in the output. (all blank filled with 0)
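For reference, a conversion along these lines (a sketch; the exact rotation convention used for the file is an assumption) reproduces such a location/rotation line from the 3x4 matrix above:

```python
import numpy as np
from scipy.spatial.transform import Rotation

Rt = np.loadtxt("frame0001.txt")                       # 3x4 [R|t] ground-truth pose
location = Rt[:3, -1]                                  # 463.2489400 -86.5848250 2630.9245000
rotation = Rotation.from_matrix(Rt[:3, :3]).as_rotvec()
print(*location, *rotation)                            # approximately the line shown above
```

Note that this feeds the camera-frame values into blender directly, without accounting for the blender camera convention discussed further down in this thread.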
Also, I have tried manipulators.EntityManipulator and placed the inverse of the homogeneous transformation inside, but still got a zero-valued image as output.
...
{
"module": "manipulators.EntityManipulator",
"config": {
"selector": {
"provider": "getter.Entity",
"conditions": {
"name": 'obj_000001',
"type": "MESH" # this guarantees that the object is a mesh, and not for example a camera
}
},
"location": [-427.4079140,1333.2629477,2276.7504598],
"rotation_euler": [2.5833025,-0.0401909, -0.0086565],
"cp_physics": True
}
},
...
I have also tried the "look_at" value in the CameraLoader, but still in vain.
Hi @greatwallet,
several things:

1.) There could be an issue because you provide the location in mm and not in m. Is the scale of your model in mm as well?
2.) The BOPLoader can load single objects of a BOP dataset and takes care of the mm2m conversion. See this example. So you might want to use that.
3.) You are right, internally we just added the option to load homogeneous matrices in the CameraLoader. We will release it very soon. We will also add an option for setting 4x4 matrices for objects directly in the config as well. Anyways, if you want to debug, you can check your conversion to euler/location by printing
CameraLoader._cam2world_matrix_from_cam_extrinsics(config)
4.) The EntityManipulator should be able to set the 4x4 pose of the object/camera directly. The attribute that you want to change is called matrix_world:
{
"module": "manipulators.EntityManipulator",
"config": {
"selector": {
"provider": "getter.Entity",
"conditions": {
"name": 'obj_000001',
"type": "MESH" # this guarantees that the object is a mesh, and not for example a camera
}
},
"matrix_world": [[0.99953197,-0.030522982,0.0020441833,463.24894],[-0.02697507,-0.8478881,0.52948853,-86.584825],[-0.01442833,-0.52929586,-0.84831463,2630.9245],[0.,0.,0.,1.]],
"cp_physics": True
}
},
Possibly you need to use the inverse and/or convert to m. You can also use the getter.Entity to fetch the camera and do the same.
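A minimal sketch of both of those operations on the matrix above (just for experimenting; whether the inverse is actually needed depends on the direction of your transformation):

```python
import numpy as np

T = np.array([[ 0.99953197, -0.030522982,  0.0020441833,  463.24894],
              [-0.02697507, -0.8478881,    0.52948853,    -86.584825],
              [-0.01442833, -0.52929586,  -0.84831463,   2630.9245 ],
              [ 0.,          0.,           0.,              1.     ]])

# Convert the translation from mm to m.
T_m = T.copy()
T_m[:3, 3] /= 1000.0

# Inverse of a rigid transform: R^T and -R^T t.
T_inv = np.eye(4)
T_inv[:3, :3] = T_m[:3, :3].T
T_inv[:3, 3] = -T_m[:3, :3].T @ T_m[:3, 3]
```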
If you haven't already explored it, for debugging it's useful not to run blender in background mode but rather to use the UI:
(to run the shape net example in background mode):
blender -b --python ./src/run.py -- ./examples/shapenet/config.yaml
(to run the shape net example using the UI):
blender --python ./src/run.py -- ./examples/shapenet/config.yaml
Then you can visually inspect the scene setup using the UI.
@DavidLSmyth you could also use the src/debug.py script for such instances; it avoids restarting blender for each run.
The only side effect is that some of the variables might not get removed without a restart.
Thank you for your advice @DavidLSmyth @MartinSmeyer, it's a good way to debug. However, I've still not managed to achieve the goal of rendering the given object pose and model.
I opened the *.ply file in a text editor and assured myself that the scale is in millimeters, which is consistent with the scale of the translation vector in the 4x4 homogeneous transformation matrix.
Running src/debug.py inside blender, I found that the default scale is meters, and using millimeters is hard for visualization (because the object would be huge and the scrollable area of the world seems to be limited in blender). So I rescaled obj_000001.ply to obj_000001_m.ply (the latter is 1000 times smaller than the former, hence converted to meters).
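One possible way to do such a rescaling offline, assuming the trimesh package is available (it is not part of BlenderProc), is a sketch like this:

```python
import trimesh

# Scale the vertices from millimeters to meters and write a new file.
mesh = trimesh.load("obj_000001.ply")
mesh.apply_scale(0.001)
mesh.export("obj_000001_m.ply")
```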
In the following discussion, the scale is meters by default. When I used the following config:
# for camera
{
"module": "camera.CameraLoader",
"config": {
"path": "/home/cxt/workspace/scripts/BlenderProc/kwai/cfg/camera_origin.txt",
# note: the content in camera_origin.txt is
# 0 0 0 0 0 0
"file_format": "location rotation/value",
"default_cam_param": {
"cam_K": [2100., 0, 540., 0, 2100., 960., 0, 0 ,1],
"resolution_x": 1080,
"resolution_y": 1920,
}
}
},
# for object
{
"module": "manipulators.EntityManipulator",
"config": {
"selector": {
"provider": "getter.Entity",
"conditions": {
"name": 'obj_000001_m',
"type": "MESH" # this guarantees that the object is a mesh, and not for example a camera
}
},
"location": [0.4632489400,0.0865848250,2.6309245000],
"rotation_euler": [-2.5833025,0.0401909,0.0086565],
"cp_physics": True
}
},
but the scene looks like the image below:
It seems that the camera defined in blender is centro-symmetric with respect to the common one. The z axis points downwards, which seems contrary to the definition of a normal pin-hole camera.
Then I simply turned the z-value of the object negative, trying to render in that setup. Nonetheless, I found that the object is not in the FOV of the camera, so still nothing gets rendered.
So I am curious how the camera model is defined in blender. The same parameters give a well-rendered image under bop_renderer, which treats the camera as a simple pin-hole camera. How come blender fails to render this?
P.S. In order to rule out trivial bugs in the homogeneous transformation values, I have tried the inverse of the matrix (and also 1000x and 0.001x of the translation), but a similar problem occurred.
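A cheap additional check (a sketch, independent of any camera convention) is to verify that the 3x3 part really is a proper rotation before worrying about inversion or scaling:

```python
import numpy as np

Rt = np.loadtxt("frame0001.txt")
R = Rt[:3, :3]

# A proper rotation matrix is orthonormal with determinant +1
# (tolerance is loose because the stored values are rounded).
print(np.allclose(R @ R.T, np.eye(3), atol=1e-5))
print(np.isclose(np.linalg.det(R), 1.0, atol=1e-5))
```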
obj_000001_m.ply.txt
Here is the model file. Removing the suffix .txt would work.
Thanks, we will soon provide an example that covers a similar use case, since there seems to be some confusion.
Thank you for your reply, I think I got some clues here and am approaching the answer.
The location and rotation values in blender are not the homogeneous transformation from camera to world, but are "object" local coordinates. There is a transformation, according to rfabbri's answer in "https://blender.stackexchange.com/questions/38009/3x4-camera-matrix-from-blender-camera".
So there is a transformation function:
- Given: the 3x3 rotation R_world2cam and the 3x1 translation t_world2cam, defined as the homogeneous transformation FROM world TO camera.
- Wanted: the 3x1 location cam.location and the 3x1 rotation cam.rotation, defined in blender, i.e. the input of camera.CameraLoader in blenderproc.
```python
import numpy as np
from scipy.spatial.transform import Rotation


def mat2euler(mat):
    # Note: as_rotvec() actually returns an axis-angle rotation vector.
    return Rotation.from_matrix(mat).as_rotvec()


# Flip from the usual CV camera frame to the blender camera frame.
R_cam2BlenderCam = np.asarray([
    [1., 0., 0.],
    [0., -1., 0.],
    [0., 0., -1.]
])

rot_mat = (R_cam2BlenderCam @ R_world2cam).T
cam.rotation = mat2euler(rot_mat)
cam.location = - rot_mat @ R_cam2BlenderCam @ t_world2cam
```
After doing this and resetting the object to the origin, I found the rendering result more reasonable, but still far from the ground truth (the object's rotation seems to make sense, but there is still a huge offset in translation). I think perhaps there is some weird convention for `object.rotation` and `object.location` in blenderproc too, just like for `cam.location` and `cam.rotation`?
I think you are right. With the cameras we do this transformation: https://github.com/DLR-RM/BlenderProc/blob/81e1dd53d63b20c5cbef8bb6946ff359a2788911/src/camera/CameraInterface.py#L196
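Roughly, what that transformation boils down to (a sketch of the convention change, not the exact BlenderProc code): the blender camera looks along its local -Z axis with +Y up, so the OpenCV-style camera axes have to be flipped before building the camera's world matrix.

```python
import numpy as np

def cv_extrinsics_to_blender_cam(R_w2c, t_w2c):
    """Build a blender camera matrix_world from an OpenCV-style world->camera R, t."""
    flip = np.diag([1., -1., -1.])      # OpenCV cam (+Z forward) -> blender cam (-Z forward)
    T = np.eye(4)
    T[:3, :3] = R_w2c.T @ flip          # camera-to-world rotation with flipped axes
    T[:3, 3] = -R_w2c.T @ t_w2c         # camera position in world coordinates
    return T
```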
I was always using object.matrix to set object poses. With that we have exactly replicated the real scenes of the BOP datasets (see this example). But here I set the camera and object poses in the BOPLoader. We will come back to you.
@MartinSmeyer Hi, did you get any clue yet?
Today, I modified the bop_scene_replication example to adapt it to my scenario. However, the results are still not correct.
Is it because the camera model defined in blender is too complicated? bop_renderer treats the camera as a simple pin-hole camera with perspective projection.
BTW, does it influence the result if the width is less than the height, as in my case (1080x1920)?
It's kind of painful for me to organize my work and forward it to you guys, but if you're interested, I will find time to organize the code and send it to you.
Hi @greatwallet, we were just drowning in other deadlines. We have an internal PR going on that topic; it should be ready soon.
The camera model defined in blender is indeed quite complicated, and if the height is larger than the width it does have an influence. Are your results with horizontal images okay? I can guarantee you that the scene_replication example is correct for all BOP datasets, but they are all with horizontal images.
BTW, I know what bop_renderer does, I helped develop it.
Yes, I tried with a horizontal image (640x480) and the result is correct.
This should be fixed by now, if not please open a new issue.
@greatwallet yes, sorry for the late reaction.
You should be able to render your vertical image now. We changed the structure of the config to more easily load objects, set their poses in homogeneous coordinates and set camera intrinsics.
See the example here: basic_object_pose
@MartinSmeyer Is there an instruction on how to generate the right image (https://github.com/DLR-RM/BlenderProc/blob/master/examples/basic_object_pose/hb_val_3_0.png) in basic_object_pose?
Would you be so kind as to open a new issue for this, since it is a separate question? That makes it easier for others to find.