Closed · npcdna closed this issue 11 months ago
Hi, some of the format descriptions are introduced here; you can start by processing your data into this format.
Our preprocessing of the Waymo Open Dataset also follows the format above, only with certain specifications. You can refer to this script if you run into any problems, but it might not cover all of them, since many procedures are specially designed for WOD.
Feel free to ask questions here, and I'm also working on a tutorial for training on custom datasets.
Hi, friend. I ran the Waymo data successfully. Afterwards, I converted my dataset to your format according to the instructions and adjusted my configuration based on the Waymo configuration, but it failed to run, whether adding lidar or using depth maps. Here is the problem:
```
2023-09-01 10:11:51,646-rk0-train.py#959:=> Start loading data, for experiment: logs/streetsurf/owndata_2
2023-09-01 10:11:51,646-rk0-train.py#962:=> Done loading data.
2023-09-01 10:11:51,647-rk0-checkpoint.py#74:=> Found ckpts: ['logs/streetsurf/owndata_2/ckpts/0.pt']
2023-09-01 10:11:51,647-rk0-checkpoint.py#78:=> Loading checkpoint from local file: logs/streetsurf/owndata_2/ckpts/0.pt
2023-09-01 10:11:51,797-rk0-train.py#182:=> Start initialize prepcess...
2023-09-01 10:11:51,797-rk0-train.py#204:=> Done initialize prepcess.
2023-09-01 10:11:51,797-rk0-checkpoint.py#41:=> Saving ckpt to logs/streetsurf/owndata_2/ckpts/0.pt
2023-09-01 10:11:52,170-rk0-checkpoint.py#46:Done.
2023-09-01 10:11:52,170-rk0-train.py#1057:=> Start [train], it=0, lr=1e-05, in logs/streetsurf/owndata_2
  0%|          | 0/12000 [00:00<?, ?it/s]
Error occurred in: logs/streetsurf/owndata_2
  0%|          | 0/12000 [00:04<?, ?it/s]
Traceback (most recent call last):
  File "code_single/tools/train.py", line 1303, in
```
It seems to be a config setting error, but I don't know how to solve it. My image size is 1920x1280. Apart from the file paths and the camera/lidar types used, are there any other parameters that need to be adjusted? Here is my dataset format:
```
├── depths
│   ├── camera_FRONT
│   │   ├── 00000000.npz
│   │   ├── ...
│   │   └── 00000187.npz
│   └── camera_REAR
│       ├── 00000000.npz
│       ├── ...
│       └── 00000187.npz
├── images
│   ├── camera_FRONT
│   │   ├── 00000000.jpg
│   │   ├── ...
│   │   └── 00000187.jpg
│   └── camera_REAR
│       ├── 00000000.jpg
│       ├── ...
│       └── 00000187.jpg
├── lidars
│   └── lidar_TOP
│       ├── 00000000.npz
│       ├── ...
│       └── 00000187.npz
├── masks
│   ├── camera_FRONT
│   │   ├── 00000000.npz
│   │   ├── ...
│   │   └── 00000187.npz
│   └── camera_REAR
│       ├── 00000000.npz
│       ├── ...
│       └── 00000187.npz
├── normals
│   ├── camera_FRONT
│   │   ├── 00000000.jpg
│   │   ├── ...
│   │   └── 00000187.jpg
│   └── camera_REAR
│       ├── 00000000.jpg
│       ├── ...
│       └── 00000187.jpg
└── scenario.pt
```
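A quick standalone check of such a layout can rule out missing or misnumbered frames before blaming the config. This is only a sketch, not a repo tool; the root path is a placeholder and the extensions follow the tree above:

```python
import os

# Sanity-check the dataset layout sketched above (standalone sketch, not a repo tool).
root = "/path/to/your/processed/scene"  # placeholder
expected = {
    "images":  (".jpg", ["camera_FRONT", "camera_REAR"]),
    "depths":  (".npz", ["camera_FRONT", "camera_REAR"]),
    "masks":   (".npz", ["camera_FRONT", "camera_REAR"]),
    "normals": (".jpg", ["camera_FRONT", "camera_REAR"]),
    "lidars":  (".npz", ["lidar_TOP"]),
}

counts = {}
for folder, (ext, subdirs) in expected.items():
    for sub in subdirs:
        d = os.path.join(root, folder, sub)
        files = sorted(f for f in os.listdir(d) if f.endswith(ext))
        counts[f"{folder}/{sub}"] = len(files)
        # Frame files should be zero-padded and contiguous: 00000000 .. N-1
        assert files == [f"{i:08d}{ext}" for i in range(len(files))], d

assert os.path.isfile(os.path.join(root, "scenario.pt")), "missing scenario.pt"
# All modalities should cover the same number of frames (188 here: 0..187)
assert len(set(counts.values())) == 1, counts
print(counts)
```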
Here is my config (using lidar):
```yaml
#------------------------------------------------------------
#------------ Some shortcut configs
#------------------------------------------------------------
device_ids: -1
num_rays_pixel: 4096
num_rays_lidar: 4096
near: 0.1
far: 200.0
depth_max: 120.0 # To visualize / colorize depth when render/eval
extend_size: 60.0
num_coarse: 128 # Number of coarse samples on each ray
step_size: 0.2 # Ray-marching step size
upsample_inv_s: 64.0
upsample_inv_s_factors: [1., 4., 16.]
num_fine: [8,8,32] # [8,8,8] # Number of samples of 3 upsample stages
radius_scale_min: 1 # Nearest sampling shell of NeRF++ background (Distant-view model)
radius_scale_max: 1000 # Furthest sampling shell of NeRF++ background (Distant-view model)
distant_interval_type: inverse_proportional
distant_mode: fixed_cuboid_shells
distant_nsample: 64
sdf_scale: 25.0 # The real-world length represented by one unit of SDF
rgb_fn: l1
rgb_fn_param: {}
lidar_fn: l1
lidar_fn_param: {}
w_lidar: 0.02
w_los: 0.1
# eps_los: annal1.5_0.75_0.5
# w_mask: 0.3
num_uniform: ${eval:"2**16"}
w_eikonal: 0.01
on_render_ratio: 0.1
on_occ_ratio: 1.0
on_render_type: both
safe_mse: true
errlim: 5
w_sparsity: 0.002
sparsity_anneal_for: 1000
sparsity_enable_after: 0
clbeta: 10.0
clw: 0.2
clearance_sdf: 0.02 # 0.02 * (sdf_scale=25) = 0.5m
num_iters: 15000
warmup_steps: 2000
min_factor: 0.06
fglr: 1.0e-2
bglr: 1.0e-2
# skylr: 1.0e-3
emblr: 2.0e-2
image_embedding_dim: 4
start_it: 0
start_level: 2
stop_it: 4000
final_inv_s: 2400.
ctrl_start: 3000
lnini: 0.3 # !!! NOTE: A higher initial inv_s helps with disentanglement of cr/dv, especially for no-mask settings
use_estimate_alpha: false
geo_init_method: pretrain_after_zero_out # pretrain
camera_list: [camera_FRONT]
# camera_list: [camera_SIDE_LEFT, camera_FRONT_LEFT, camera_FRONT, camera_FRONT_RIGHT, camera_SIDE_RIGHT]
lidar_list: [lidar_TOP]
lidar_weight: [0.1] # Will be normalized when used

#------------------------------------------------------------
#------------ Full configs
#------------------------------------------------------------
# exp_dir: logs/streetsurf_refactor/dbgfix4_nomask_withlidar_seg134763_${lidar_fn}=${w_lidar}_lnini=${lnini}_invs=${final_inv_s}_${ctrl_start}_sdfscale=${sdf_scale}_wsp=${w_sparsity}_for=${sparsity_anneal_for}_wlos=${w_los}_eps=${eps_los}_weik=${w_eikonal}_on=${on_render_type}_onocc=${on_occ_ratio}_a=${on_render_ratio}_stlv=${start_level}_ini262144_softplus_stop=${stop_it}_cl=${clw}_${clbeta}_${clearance_sdf}_ego2.0
# exp_parent_dir: logs/final_waymo_multiseq_exp4.36_withmask_withlidar_15k_cuboid_half_ext${extend_size}_${rgb_fn}_${lidar_fn}=${w_lidar}_med=${discard_median}_1_02_2_joint
exp_dir: logs/streetsurf/owndata_1

dataset_cfg:
  target: dataio.autonomous_driving.WaymoDataset
  param:
    # root: /nvme/guojianfei/waymo/processed/
    root: /home/tjh/Workspace/tjh/neuralsim/testData/
    # root: /home/ventus/datasets/waymo/processed/
    # root: ./data/waymo/processed/
    rgb_dirname: images
    lidar_dirname: lidars
    mask_dirname: masks

scenebank_cfg:
  # NOTE: scene_id[,start_frame[,n_frames]]
  scenarios:
    - hd24319, 0, 186
  observer_cfgs:
    Camera:
      list: ${camera_list}
    RaysLidar:
      list: ${lidar_list}
  on_load:
    no_objects: true # Set to true to skip loading foreground objects into scene graph
    joint_camlidar: true # !!! Convenient for NVS
    align_orientation: true
    consider_distortion: true
    joint_camlidar_equivalent_extr: true

assetbank_cfg:
  Street:
    model_class: app.models.single.LoTDNeuSStreet
    model_params:
      dtype: half
      var_ctrl_cfg:
        ln_inv_s_init: ${lnini}
        ln_inv_s_factor: 10.0
        ctrl_type: mix_linear
        start_it: ${ctrl_start}
        stop_it: ${training.num_iters}
        final_inv_s: ${final_inv_s}
      cos_anneal_cfg: null
      surface_cfg:
        sdf_scale: ${sdf_scale}
        encoding_cfg:
          lotd_use_cuboid: true
          lotd_auto_compute_cfg:
            type: ngp
            target_num_params: ${eval:"32*(2**20)"} # 64 MiB float16 params -> 32 Mi params
            min_res: 16
            n_feats: 2
            log2_hashmap_size: 20
            max_num_levels: null
          param_init_cfg:
            method: uniform_to_type
            bound: 1.0e-4
          anneal_cfg:
            type: hardmask
            start_it: ${start_it}
            start_level: ${start_level} # (Needs to be small so training is stable, but not too small, so there is still a valid initial pretraining.)
            stop_it: ${stop_it} # Not too many iters; should end soon so as not to hinder quality
        decoder_cfg:
          type: mlp
          D: 1
          W: 64
          # select_n_levels: 14
          activation:
            type: softplus
            beta: 100.0
        n_rgb_used_output: 0
        geo_init_method: ${geo_init_method}
      radiance_cfg:
        use_pos: true
        use_view_dirs: true
        dir_embed_cfg:
          type: spherical
          degree: 4
        D: 2
        W: 64
        n_appear_embedding: ${image_embedding_dim}
      use_tcnn_backend: false
      accel_cfg:
        type: occ_grid
        vox_size: 1.0
        # resolution: [64,64,64]
        occ_val_fn_cfg:
          type: sdf
          inv_s: 256.0 # => +- 0.01 sdf @ 0.3 thre
        occ_thre: 0.3
        ema_decay: 0.95
        init_cfg:
          mode: from_net
          num_steps: 4
          num_pts: ${eval:"2**20"}
        acquire_from_net_cfg:
          num_steps: 4
          num_pts: ${eval:"2**20"}
        acquire_from_samples_cfg: {}
        n_steps_between_update: 16
        n_steps_warmup: 256
      ray_query_cfg:
        query_mode: march_occ_multi_upsample_compressed
        # query_mode: march_occ_multi_upsample
        query_param:
          nablas_has_grad: true
          num_coarse: ${num_coarse}
          num_fine: ${num_fine}
          coarse_step_cfg:
            step_mode: linear
          march_cfg:
            step_size: ${step_size} # Typical value: (far-near) / 4000
            max_steps: 4096
          upsample_inv_s: ${upsample_inv_s}
          upsample_inv_s_factors: ${upsample_inv_s_factors}
          upsample_use_estimate_alpha: ${use_estimate_alpha}
    asset_params:
      initialize_cfg:
        target_shape: road_surface
        obs_ref: camera_FRONT # Reference observer. Its trajectory will be used for initialization.
        lr: 1.0e-3
        num_iters: 1000
        num_points: 262144
        w_eikonal: 3.0e-3
        floor_dim: z
        floor_up_sign: 1
        ego_height: 2.0
      preload_cfg: {}
      populate_cfg:
        extend_size: ${extend_size}
  Distant:
    model_class: app.models.single.LoTDNeRFDistant
    model_params:
      dtype: half
      encoding_cfg:
        input_ch: 4
        lotd_use_cuboid: true
        lotd_auto_compute_cfg:
          type: ngp4d
          target_num_params: ${eval:"16*(2**20)"} # 16 Mi params
          min_res_xyz: 16
          min_res_w: 4
          n_feats: 2
          log2_hashmap_size: 19
          per_level_scale: 1.382
        param_init_cfg:
          method: uniform_to_type
          bound: 1.0e-4
        # anneal_cfg:
        #   type: hardmask
        #   start_it: ${start_it}
        #   start_level: ${bg_start_level} # (Needs to be small so training is stable, but not too small, so there is still a valid initial pretraining.)
        #   stop_it: ${stop_it} # Not too many iters; should end soon so as not to hinder quality
      extra_pos_embed_cfg:
        type: identity
      sigma_decoder_cfg:
        type: mlp
        D: 1
        W: 64
        output_activation: softplus
      radiance_decoder_cfg:
        use_pos: false
        # pos_embed_cfg:
        #   type: identity
        use_view_dirs: false
        # dir_embed_cfg:
        #   type: spherical
        #   degree: 4
        use_nablas: false
        D: 2
        W: 64
        n_appear_embedding: ${image_embedding_dim}
        n_rgb_used_output: 0
      use_tcnn_backend: false
      include_inf_distance: true # !!! no sky
      radius_scale_min: ${radius_scale_min}
      radius_scale_max: ${radius_scale_max}
      ray_query_cfg:
        query_mode: march_occ
        query_param:
          march_cfg:
            interval_type: ${distant_interval_type}
            sample_mode: ${distant_mode}
            max_steps: ${distant_nsample}
    asset_params:
      populate_cfg:
        cr_obj_classname: Street
  # Sky:
  #   model_class: app.models.env.SimpleSky
  #   model_params:
  #     dir_embed_cfg:
  #       type: sinusoidal
  #       n_frequencies: 10
  #       use_tcnn_backend: false
  #     D: 2
  #     W: 256
  #     use_tcnn_backend: false
  #     n_appear_embedding: ${image_embedding_dim}
  ImageEmbeddings:
    model_class: app.models.scene.ImageEmbeddings
    model_params:
      dims: ${image_embedding_dim}
      weight_init: uniform
      weight_init_std: 1.0e-4
  #--- Pose refine related
  LearnableParams:
    model_class: app.models.scene.LearnableParams
    model_params:
      refine_ego_motion: true
      # ego_node_id: ego_car
      ego_class_name: Camera
      refine_camera_intr: false
      refine_camera_extr: false
      enable_after: 500

renderer:
  common:
    with_env: false # !!! no sky
    with_rgb: true
    with_normal: true
    near: ${near} # NOTE: Critical to scene scale!
    far: ${far}
  train:
    depth_use_normalized_vw: false # For meaningful depth supervision (if any)
    perturb: true
  val:
    depth_use_normalized_vw: true # For correct depth rendering
    perturb: false
    rayschunk: 4096

training:
  #---------- Dataset and sampling
  dataloader:
    preload: true
    preload_on_gpu: false
    tags:
      camera:
        downscale: 1
        list: ${camera_list}
      # rgb_mask: {}
      # rgb_human_mask: {}
      # rgb_ignore_mask:
      #   ignore_not_occupied: false
      #   ignore_dynamic: false
      #   ignore_human: true
      lidar:
        list: ${lidar_list}
        multi_lidar_merge: true
        filter_when_preload: true
        filter_kwargs:
          filter_in_cams: true
    pixel_dataset:
      #---------- Frame and pixel dataloader
      joint: false
      equal_mode: ray_batch
      num_rays: ${num_rays_pixel}
      frame_sample_mode: uniform
      pixel_sample_mode: error_map
      error_map_res: [32,32]
      uniform_sampling_fraction: 0.5
      #---------- Joint frame-pixel dataloader
      # joint: true
      # equal_mode: ray_batch
      # num_rays: ${num_rays_pixel}
      # error_map_res: [32,32]
      # uniform_sampling_fraction: 0.5
    lidar_dataset:
      equal_mode: ray_batch
      num_rays: ${num_rays_lidar}
      frame_sample_mode: uniform
      lidar_sample_mode: merged_weighted
      multi_lidar_weight: ${lidar_weight} # Will be normalized when used
  val_dataloader:
    preload: false
    tags:
      camera:
        downscale: 4
        list: ${camera_list}
      # rgb_mask: {}
      # rgb_human_mask: {}
      # rgb_ignore_mask:
      #   ignore_not_occupied: false
      #   ignore_dynamic: false
      #   ignore_human: true
      lidar: ${training.dataloader.tags.lidar}
    image_dataset:
      camera_sample_mode: all_list # !!!
      frame_sample_mode: uniform
  #---------- Training losses
  uniform_sample: ${num_uniform}
  losses:
    rgb:
      fn_type: ${rgb_fn}
      fn_param: ${rgb_fn_param}
      # respect_ignore_mask: true
    # occupancy_mask:
    #   w: ${w_mask}
    #   w_on_errmap: 0
    #   safe_bce: true
    #   pred_clip: 0
    mask_entropy:
      w: 0.005
      mode: crisp_cr
      enable_after: 2000
      anneal:
        type: linear
        start_it: 2000
        stop_it: 5000
        start_val: 0
        stop_val: 0.005
        update_every: 100
    lidar:
      discard_outliers: 0
      discard_outliers_median: 100.0
      discard_toofar: 80.0
      depth:
        w: ${w_lidar}
        fn_type: ${lidar_fn}
        fn_param: ${lidar_fn_param}
      line_of_sight:
        w: ${w_los}
        fn_type: neus_unisim
        fn_param:
          # epsilon: ${eps_los}
          epsilon_anneal:
            type: milestones
            milestones: [5000, 10000]
            vals: [1.5, 0.75, 0.5]
    eikonal:
      safe_mse: ${safe_mse}
      safe_mse_err_limit: ${errlim}
      alpha_reg_zero: 0
      on_occ_ratio: ${on_occ_ratio}
      on_render_type: ${on_render_type}
      on_render_ratio: ${on_render_ratio}
      class_name_cfgs:
        Street:
          w: ${w_eikonal}
    sparsity:
      enable_after: ${sparsity_enable_after}
      class_name_cfgs:
        Street:
          key: sdf
          type: normalized_logistic_density
          inv_scale: 16.0
          w: ${w_sparsity}
          anneal:
            type: linear
            start_it: ${sparsity_enable_after}
            start_val: 0
            stop_it: ${eval:"${sparsity_anneal_for}+${sparsity_enable_after}"}
            stop_val: ${w_sparsity}
            update_every: 100
    clearance:
      class_name_cfgs:
        Street:
          w: ${clw}
          beta: ${clbeta}
          thresh: ${clearance_sdf}
    weight_reg:
      class_name_cfgs:
        Street:
          norm_type: 2.0
          w: 1.0e-6
        Distant:
          norm_type: 2.0
          w: 1.0e-6
  optim:
    default: 1.0e-3
    # Sky: ${skylr}
    Distant:
      lr: ${bglr}
      eps: 1.0e-15
      betas: [0.9, 0.99]
    Street:
      lr: ${fglr}
      eps: 1.0e-15
      betas: [0.9, 0.991]
      invs_betas: [0.9, 0.999]
    ImageEmbeddings: ${emblr}
    #--- Pose refine related
    LearnableParams:
      ego_motion:
        lr: 0.001
        alpha_lr_rotation: 0.05
  num_iters: ${num_iters}
  scheduler:
    #---------- exponential
    type: exponential_step
    num_iters: ${training.num_iters}
    min_factor: ${min_factor}
    warmup_steps: ${warmup_steps}
    #---------- cosine
    # type: warmupcosine
    # num_iters: ${training.num_iters}
    # min_factor: ${min_factor}
    # warmup_steps: ${warmup_steps}
    #---------- milestone
    # type: multistep
    # milestones: [20000, 30000]
    # gamma: 0.33
  #---------- Logging and validation
  i_val: 1500 # unit: iters
  i_backup: -1 # unit: iters
  i_save: 900 # unit: seconds
  i_log: 20
  log_grad: false
  log_param: false
  ckpt_file: null
  ckpt_ignore_keys: []
  ckpt_only_use_keys: null
```
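A note on the `${...}` shortcuts used throughout this config: they are interpolation references that the config loader resolves at load time, so e.g. `${num_rays_pixel}` and `${eval:"2**16"}` collapse to concrete values before training starts. The snippet below illustrates the idea with OmegaConf; treating neuralsim's loader as OmegaConf-like (including the `eval` resolver registration) is an assumption here, not the repo's actual code:

```python
from omegaconf import OmegaConf

# Illustrative stand-in for an `${eval:"..."}` resolver (assumption: neuralsim
# registers something equivalent; this is not the repo's actual registration).
OmegaConf.register_new_resolver("eval", lambda expr: eval(expr), replace=True)

cfg = OmegaConf.create("""
num_uniform: ${eval:"2**16"}
camera_list: [camera_FRONT]
training:
  uniform_sample: ${num_uniform}
  cameras: ${camera_list}
""")

# Interpolations resolve to concrete values on access / container conversion:
print(OmegaConf.to_container(cfg, resolve=True))
# {'num_uniform': 65536, 'camera_list': ['camera_FRONT'],
#  'training': {'uniform_sample': 65536, 'cameras': ['camera_FRONT']}}
```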
Hi,
An occupancy grid that turns all-empty at the first training iteration is most likely caused by an incorrectly configured dataset. That can mean many things; the most likely culprits are the world scaling or the extend_size.
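For a numeric check of the world scale before opening the visualizer, something like the sketch below can help. It is not a repo tool: it only assumes your lidars/lidar_TOP/*.npz files hold one or more (N, 3) float arrays (points or ray data; the actual key names depend on your preprocessing script):

```python
import glob
import numpy as np

# Rough numeric check of scene extent vs. `extend_size` (60.0 in the config
# above). Sketch only: we scan every (N, 3) float array in each .npz, since
# the key names depend on the preprocessing script.
lo = np.full(3, np.inf)
hi = np.full(3, -np.inf)
for path in sorted(glob.glob("lidars/lidar_TOP/*.npz")):
    with np.load(path) as data:
        for key in data.files:
            arr = data[key]
            if arr.ndim == 2 and arr.shape[1] == 3 and np.issubdtype(arr.dtype, np.floating):
                lo = np.minimum(lo, arr.min(axis=0))
                hi = np.maximum(hi, arr.max(axis=0))
print("per-axis min:", lo, "max:", hi, "extent:", hi - lo)
# If these extents look wildly off relative to the sensor's real range or the
# scene's real size, the units or poses in scenario.pt are suspect.
```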
I have just updated a debug tool for this. To use it, pass --debug_scene=true to train.py. Remember to git pull and git submodule update --init --recursive first to update the repo.
After hitting "play" and then "pause", you will see a pop-up window like the one below, showing the lidar points (colored point clouds), the extracted occupancy grids (grey voxels), the street's AABB (the large bold green box), the street's local coordinate axes (the large RGB arrows attached to a corner of the green box), and the camera frustums (colored frustums, which move if you hit "play" again):
https://github.com/PJLab-ADG/neuralsim/assets/25529198/fd0a374d-d2d9-4721-be84-8cab913701ad
You can check whether the AABB is created correctly, whether the ego car and cameras sit above the occupancy-grid surface, etc. You can also hit "play" again to check whether all the lidar frames are loaded correctly and whether the street's AABB contains all the lidar points (within the camera viewports).
Apart from the above, you can also zoom in to check whether the camera views are correct, e.g. whether any are upside-down.
https://github.com/PJLab-ADG/neuralsim/assets/25529198/d39562dc-971d-49e4-9c96-363a4287f429
Thanks!
I found that I had entered erroneous poses in scenario.pt.
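For anyone hitting the same thing, a minimal way to eyeball the poses stored in scenario.pt, assuming it is a torch-pickled nested dict (key names vary with the preprocessing script, so everything below is illustrative):

```python
import torch

# Quick look inside scenario.pt (sketch; assumes a torch-pickled dict).
scenario = torch.load("scenario.pt", map_location="cpu")

def walk(node, prefix=""):
    """Recursively print keys and array shapes to eyeball the structure."""
    if isinstance(node, dict):
        for k, v in node.items():
            walk(v, f"{prefix}{k}.")
    elif hasattr(node, "shape"):
        print(f"{prefix[:-1]}: shape={tuple(node.shape)}")
    else:
        print(f"{prefix[:-1]}: {type(node).__name__}")

walk(scenario)
# For pose debugging, look for 4x4 (or [n_frames, 4, 4]) transform arrays:
# the 3x3 rotation block should be orthonormal with determinant +1, and the
# translation column should trace a smooth, plausible ego trajectory.
```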
@ventusff Dear author, how can I use the visualization tools inside a Docker container on a remote server? When I add --debug_scene=true to train.py, I get a core dump.
Hello, could you please share the remote visualization method? Many thanks! I need it to debug my data. @ventusff
Thanks for your awesome work. I want to apply it to my own street datasets to generate camera images and lidar data. My datasets have lidar data, images, and poses, and I want to generate lidar data and camera images with different sensor parameters. Which data format should I use, and which modules and scripts do I need?