Closed demmerichs closed 3 years ago
Hi, let me check the code. I may need to run the code to see what happens. Please give me a little bit of time.
Also, below are the links to the pre-trained models, which might be helpful for you.
train_multi_seq.py: can be downloaded from Google Drive or Dropbox.
train_multi_seq_MGDA.py: can be downloaded from Google Drive or Dropbox.
@DavidS3141, after quickly running the code, the output told me that scene 411 generates 34 files, scene 662 generates 34 files, scene 2 also generates 34 files, and so on. So in total we have roughly 34 * 500 = 17000 files (close to the 17065). So I think the code is correct.
Could you run the code on your system to check how many files scene 411, 662, 2 would generate? (These 3 scenes are the first 3 scenes that will be processed by the code).
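A per-scene count can also be taken from an already generated output folder. The sketch below is only an illustration: it assumes that every sample is stored in a sub-folder named `<scene_idx>_<sample_idx>` directly under the save path, which is an assumption about the output layout and not something confirmed in this thread. The function name `count_samples_per_scene` is hypothetical.

```python
import os
from collections import Counter


def count_samples_per_scene(savepath):
    """Count generated sample folders per scene.

    Assumption (not confirmed here): each sample lives in a sub-folder
    named '<scene_idx>_<sample_idx>' directly under `savepath`.
    """
    counts = Counter()
    for name in os.listdir(savepath):
        # Ignore plain files; only sample folders are counted.
        if not os.path.isdir(os.path.join(savepath, name)):
            continue
        scene_idx, sep, sample_idx = name.partition('_')
        # Ignore folders that do not match the assumed naming scheme.
        if sep and scene_idx.isdigit() and sample_idx.isdigit():
            counts[int(scene_idx)] += 1
    return counts
```

Calling `count_samples_per_scene(savepath)` and looking up the entries for 411, 662 and 2 would give the numbers to compare against 34, and `sum(counts.values())` gives the total to compare against 17065.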
You could also comment out some of the preprocessing code, such as BEV rasterization and file saving, to speed up the run. This way we can quickly check the total number of files it would dump (see the code below).
# COPYRIGHT (C) Mitsubishi Electric Research Labs (MERL) 2020
# Code written by Pengxiang Wu
# March 2020
from nuscenes.nuscenes import NuScenes
import os
from nuscenes.utils.data_classes import LidarPointCloud
import numpy as np
import argparse
from data.data_utils import voxelize_occupy, gen_2d_grid_gt
parser = argparse.ArgumentParser()
parser.add_argument('-r', '--root', default=None, type=str, help='Root path to nuScenes dataset')
parser.add_argument('-s', '--split', default='train', type=str, help='The data split [train/val/test]')
parser.add_argument('-p', '--savepath', default=None, type=str, help='Directory for saving the generated data')
args = parser.parse_args()
if args.root is None or args.savepath is None:
    raise ValueError("Should specify the dataset path and the savepath.")

nusc = NuScenes(version='v1.0-trainval', dataroot=args.root, verbose=True)
print("Total number of scenes:", len(nusc.scene))

class_map = {'vehicle.car': 1, 'vehicle.bus.rigid': 1, 'vehicle.bus.bendy': 1, 'human.pedestrian': 2,
             'vehicle.bicycle': 3}  # background: 0, other: 4

if args.split == 'train':
    num_keyframe_skipped = 0  # The number of keyframes we will skip when dumping the data
    nsweeps_back = 30  # Number of frames back to the history (including the current timestamp)
    nsweeps_forward = 20  # Number of frames into the future (does not include the current timestamp)
    skip_frame = 0  # The number of frames skipped for the adjacent sequence
    num_adj_seqs = 2  # number of adjacent sequences, among which the time gap is \delta t
else:
    num_keyframe_skipped = 1
    nsweeps_back = 25  # Setting this to 30 (for training) or 25 (for testing) allows conducting ablation studies on frame numbers
    nsweeps_forward = 20
    skip_frame = 0
    num_adj_seqs = 1
# The specifications for BEV maps
voxel_size = (0.25, 0.25, 0.4)
area_extents = np.array([[-32., 32.], [-32., 32.], [-3., 2.]])
past_frame_skip = 3 # when generating the BEV maps, how many history frames need to be skipped
future_frame_skip = 0 # when generating the BEV maps, how many future frames need to be skipped
num_past_frames_for_bev_seq = 5 # the number of past frames for BEV map sequence
scenes = np.load('data/split.npy', allow_pickle=True).item().get(args.split)
print("Split: {}, which contains {} scenes.".format(args.split, len(scenes)))
# ---------------------- Extract the scenes, and then pre-process them into BEV maps ----------------------
def gen_data():
    res_scenes = list()
    for s in scenes:
        s_id = s.split('_')[1]
        res_scenes.append(int(s_id))

    total = 0
    for scene_idx in res_scenes:
        curr_scene = nusc.scene[scene_idx]

        first_sample_token = curr_scene['first_sample_token']
        curr_sample = nusc.get('sample', first_sample_token)
        curr_sample_data = nusc.get('sample_data', curr_sample['data']['LIDAR_TOP'])

        save_data_dict_list = list()  # for storing consecutive sequences; the data consists of timestamps, points, etc
        save_box_dict_list = list()  # for storing box annotations in consecutive sequences
        save_instance_token_list = list()
        adj_seq_cnt = 0
        save_seq_cnt = 0  # only used for save data file name

        # Iterate each sample data
        print("Processing scene {} ...".format(scene_idx))
        while curr_sample_data['next'] != '':
            # Get the synchronized point clouds
            all_pc, all_times, trans_matrices = \
                LidarPointCloud.from_file_multisweep_bf_sample_data(nusc, curr_sample_data,
                                                                    return_trans_matrix=True,
                                                                    nsweeps_back=nsweeps_back,
                                                                    nsweeps_forward=nsweeps_forward)

            # Store point cloud of each sweep
            pc = all_pc.points
            _, sort_idx = np.unique(all_times, return_index=True)
            unique_times = all_times[np.sort(sort_idx)]  # Preserve the item order in unique_times
            num_sweeps = len(unique_times)

            # Make sure we have sufficient past and future sweeps
            if num_sweeps != (nsweeps_back + nsweeps_forward):
                # Skip some keyframes if necessary
                flag = False
                for _ in range(num_keyframe_skipped + 1):
                    if curr_sample['next'] != '':
                        curr_sample = nusc.get('sample', curr_sample['next'])
                    else:
                        flag = True
                        break

                if flag:  # No more keyframes
                    break
                else:
                    curr_sample_data = nusc.get('sample_data', curr_sample['data']['LIDAR_TOP'])

                # Reset
                adj_seq_cnt = 0
                save_data_dict_list = list()
                save_box_dict_list = list()
                save_instance_token_list = list()
                continue

            adj_seq_cnt += 1
            if adj_seq_cnt == num_adj_seqs:
                print(">> Finish sample Num: {}".format(total + 1))
                total += 1
                # --------------------------------------------------------------------------------
                save_seq_cnt += 1
                adj_seq_cnt = 0
                save_data_dict_list = list()
                save_box_dict_list = list()
                save_instance_token_list = list()

                # Skip some keyframes if necessary
                flag = False
                for _ in range(num_keyframe_skipped + 1):
                    if curr_sample['next'] != '':
                        curr_sample = nusc.get('sample', curr_sample['next'])
                    else:
                        flag = True
                        break

                if flag:  # No more keyframes
                    break
                else:
                    curr_sample_data = nusc.get('sample_data', curr_sample['data']['LIDAR_TOP'])
            else:
                flag = False
                for _ in range(skip_frame + 1):
                    if curr_sample_data['next'] != '':
                        curr_sample_data = nusc.get('sample_data', curr_sample_data['next'])
                    else:
                        flag = True
                        break

                if flag:  # No more sample frames
                    break
# ---------------------- Convert the raw data into (dense) BEV maps ----------------------
def convert_to_dense_bev(data_dict):
    num_sweeps = data_dict['num_sweeps']
    times = data_dict['times']
    trans_matrices = data_dict['trans_matrices']

    num_past_sweeps = len(np.where(times >= 0)[0])
    num_future_sweeps = len(np.where(times < 0)[0])
    assert num_past_sweeps + num_future_sweeps == num_sweeps, "The number of sweeps is incorrect!"

    # Load point cloud
    pc_list = []
    for i in range(num_sweeps):
        pc = data_dict['pc_' + str(i)]
        pc_list.append(pc.T)

    # Reorder the pc, and skip sample frames if wanted
    # Currently the past frames in pc_list are stored in the following order [current, current + 1, current + 2, ...]
    # Therefore, we would like to reorder the frames
    tmp_pc_list_1 = pc_list[0:num_past_sweeps:(past_frame_skip + 1)]
    tmp_pc_list_1 = tmp_pc_list_1[::-1]
    tmp_pc_list_2 = pc_list[(num_past_sweeps + future_frame_skip)::(future_frame_skip + 1)]
    pc_list = tmp_pc_list_1 + tmp_pc_list_2  # now the order is: [past frames -> current frame -> future frames]

    num_past_pcs = len(tmp_pc_list_1)
    num_future_pcs = len(tmp_pc_list_2)

    # Discretize the input point clouds, and compute the ground-truth displacement vectors
    # The following two variables contain the information for the
    # compact representation of binary voxels, as described in the paper
    voxel_indices_list = list()
    padded_voxel_points_list = list()

    past_pcs_idx = list(range(num_past_pcs))
    past_pcs_idx = past_pcs_idx[-num_past_frames_for_bev_seq:]  # we typically use 5 past frames (including the current one)
    for i in past_pcs_idx:
        res, voxel_indices = voxelize_occupy(pc_list[i], voxel_size=voxel_size, extents=area_extents, return_indices=True)
        voxel_indices_list.append(voxel_indices)
        padded_voxel_points_list.append(res)

    # Compile the batch of voxels, so that they can be fed into the network.
    # Note that, the padded_voxel_points in this script will only be used for sanity check.
    padded_voxel_points = np.stack(padded_voxel_points_list, axis=0).astype(np.bool)

    # Finally, generate the ground-truth displacement field
    # - all_disp_field_gt: the ground-truth displacement vectors for each grid cell
    # - all_valid_pixel_maps: the masking map for valid pixels, used for loss computation
    # - non_empty_map: the mask which represents the non-empty grid cells, used for loss computation
    # - pixel_cat_map: the map specifying the category for each non-empty grid cell
    # - pixel_indices: the indices of non-empty grid cells, used to generate sparse BEV maps
    # - pixel_instance_map: the map specifying the instance id for each grid cell, used for loss computation
    all_disp_field_gt, all_valid_pixel_maps, non_empty_map, pixel_cat_map, pixel_indices, pixel_instance_map \
        = gen_2d_grid_gt(data_dict, grid_size=voxel_size[0:2], extents=area_extents,
                         frame_skip=future_frame_skip, return_instance_map=True)

    return voxel_indices_list, padded_voxel_points, pixel_indices, pixel_instance_map, all_disp_field_gt,\
        all_valid_pixel_maps, non_empty_map, pixel_cat_map, num_past_frames_for_bev_seq, num_future_pcs, trans_matrices
# ---------------------- Convert the dense BEV data into sparse format ----------------------
# This will significantly reduce the space used for data storage
def convert_to_sparse_bev(dense_bev_data):
    save_voxel_indices_list, save_voxel_points, save_pixel_indices, save_pixel_instance_maps, \
        save_disp_field_gt, save_valid_pixel_maps, save_non_empty_maps, save_pixel_cat_maps, \
        save_num_past_pcs, save_num_future_pcs, save_trans_matrices = dense_bev_data

    save_valid_pixel_maps = save_valid_pixel_maps.astype(np.bool)
    save_voxel_dims = save_voxel_points.shape[1:]
    num_categories = save_pixel_cat_maps.shape[-1]

    sparse_disp_field_gt = save_disp_field_gt[:, save_pixel_indices[:, 0], save_pixel_indices[:, 1], :]
    sparse_valid_pixel_maps = save_valid_pixel_maps[:, save_pixel_indices[:, 0], save_pixel_indices[:, 1]]
    sparse_pixel_cat_maps = save_pixel_cat_maps[save_pixel_indices[:, 0], save_pixel_indices[:, 1]]
    sparse_pixel_instance_maps = save_pixel_instance_maps[save_pixel_indices[:, 0], save_pixel_indices[:, 1]]

    save_data_dict = dict()
    for i in range(len(save_voxel_indices_list)):
        save_data_dict['voxel_indices_' + str(i)] = save_voxel_indices_list[i].astype(np.int32)

    save_data_dict['disp_field'] = sparse_disp_field_gt
    save_data_dict['valid_pixel_map'] = sparse_valid_pixel_maps
    save_data_dict['pixel_cat_map'] = sparse_pixel_cat_maps
    save_data_dict['num_past_pcs'] = save_num_past_pcs
    save_data_dict['num_future_pcs'] = save_num_future_pcs
    save_data_dict['trans_matrices'] = save_trans_matrices
    save_data_dict['3d_dimension'] = save_voxel_dims
    save_data_dict['pixel_indices'] = save_pixel_indices
    save_data_dict['pixel_instance_ids'] = sparse_pixel_instance_maps

    # -------------------------------- Sanity Check --------------------------------
    dims = save_non_empty_maps.shape

    test_disp_field_gt = np.zeros((save_num_future_pcs, dims[0], dims[1], 2), dtype=np.float32)
    test_disp_field_gt[:, save_pixel_indices[:, 0], save_pixel_indices[:, 1], :] = sparse_disp_field_gt[:]
    assert np.all(test_disp_field_gt == save_disp_field_gt), "Error: Mismatch"

    test_valid_pixel_maps = np.zeros((save_num_future_pcs, dims[0], dims[1]), dtype=np.bool)
    test_valid_pixel_maps[:, save_pixel_indices[:, 0], save_pixel_indices[:, 1]] = sparse_valid_pixel_maps[:]
    assert np.all(test_valid_pixel_maps == save_valid_pixel_maps), "Error: Mismatch"

    test_pixel_cat_maps = np.zeros((dims[0], dims[1], num_categories), dtype=np.float32)
    test_pixel_cat_maps[save_pixel_indices[:, 0], save_pixel_indices[:, 1], :] = sparse_pixel_cat_maps[:]
    assert np.all(test_pixel_cat_maps == save_pixel_cat_maps), "Error: Mismatch"

    test_non_empty_map = np.zeros((dims[0], dims[1]), dtype=np.float32)
    test_non_empty_map[save_pixel_indices[:, 0], save_pixel_indices[:, 1]] = 1.0
    assert np.all(test_non_empty_map == save_non_empty_maps), "Error: Mismatch"

    test_pixel_instance_map = np.zeros((dims[0], dims[1]), dtype=np.uint8)
    test_pixel_instance_map[save_pixel_indices[:, 0], save_pixel_indices[:, 1]] = sparse_pixel_instance_maps[:]
    assert np.all(test_pixel_instance_map == save_pixel_instance_maps), "Error: Mismatch"

    for i in range(len(save_voxel_indices_list)):
        indices = save_data_dict['voxel_indices_' + str(i)]
        curr_voxels = np.zeros(save_voxel_dims, dtype=np.bool)
        curr_voxels[indices[:, 0], indices[:, 1], indices[:, 2]] = 1
        assert np.all(curr_voxels == save_voxel_points[i]), "Error: Mismatch"

    return save_data_dict


if __name__ == "__main__":
    gen_data()
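For readers skimming the script: the sparse conversion and its sanity check in convert_to_sparse_bev come down to gathering values at the non-empty pixel indices and scattering them back into a zero-filled map. The toy example below illustrates that round trip on a made-up 4x4 map. The helper names `to_sparse` and `to_dense` are hypothetical and not part of the repository.

```python
import numpy as np


def to_sparse(dense_map, pixel_indices):
    """Gather the values stored at the given (row, col) indices."""
    return dense_map[pixel_indices[:, 0], pixel_indices[:, 1]]


def to_dense(sparse_values, pixel_indices, shape):
    """Scatter the sparse values back into a zero-filled dense map."""
    dense = np.zeros(shape, dtype=sparse_values.dtype)
    dense[pixel_indices[:, 0], pixel_indices[:, 1]] = sparse_values
    return dense


# Toy data: a 4x4 map with two non-empty cells (values are made up).
dense = np.zeros((4, 4), dtype=np.float32)
dense[1, 2] = 3.0
dense[3, 0] = 5.0

pixel_indices = np.argwhere(dense != 0)  # (N, 2) array of non-empty cells
sparse = to_sparse(dense, pixel_indices)
restored = to_dense(sparse, pixel_indices, dense.shape)

# Same idea as the "Sanity Check" section of the script.
assert np.array_equal(restored, dense), "Error: Mismatch"
```

Only the N non-empty cells are stored, which is why the sparse format takes far less space than the dense maps.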
When I run my version of the script I get the following, which looks a bit different than yours because of your changes:
Processing scene 411 ...
>> Finish sample: 0, sequence 0
>> Finish sample: 0, sequence 1
>> Finish sample: 1, sequence 0
>> Finish sample: 1, sequence 1
>> Finish sample: 2, sequence 0
>> Finish sample: 2, sequence 1
>> Finish sample: 3, sequence 0
>> Finish sample: 3, sequence 1
>> Finish sample: 4, sequence 0
>> Finish sample: 4, sequence 1
>> Finish sample: 5, sequence 0
>> Finish sample: 5, sequence 1
>> Finish sample: 6, sequence 0
>> Finish sample: 6, sequence 1
>> Finish sample: 7, sequence 0
>> Finish sample: 7, sequence 1
>> Finish sample: 8, sequence 0
>> Finish sample: 8, sequence 1
>> Finish sample: 9, sequence 0
>> Finish sample: 9, sequence 1
>> Finish sample: 10, sequence 0
>> Finish sample: 10, sequence 1
>> Finish sample: 11, sequence 0
>> Finish sample: 11, sequence 1
>> Finish sample: 12, sequence 0
>> Finish sample: 12, sequence 1
>> Finish sample: 13, sequence 0
>> Finish sample: 13, sequence 1
>> Finish sample: 14, sequence 0
>> Finish sample: 14, sequence 1
>> Finish sample: 15, sequence 0
>> Finish sample: 15, sequence 1
>> Finish sample: 16, sequence 0
>> Finish sample: 16, sequence 1
>> Finish sample: 17, sequence 0
>> Finish sample: 17, sequence 1
>> Finish sample: 18, sequence 0
>> Finish sample: 18, sequence 1
>> Finish sample: 19, sequence 0
>> Finish sample: 19, sequence 1
>> Finish sample: 20, sequence 0
>> Finish sample: 20, sequence 1
>> Finish sample: 21, sequence 0
>> Finish sample: 21, sequence 1
>> Finish sample: 22, sequence 0
>> Finish sample: 22, sequence 1
>> Finish sample: 23, sequence 0
>> Finish sample: 23, sequence 1
>> Finish sample: 24, sequence 0
>> Finish sample: 24, sequence 1
>> Finish sample: 25, sequence 0
>> Finish sample: 25, sequence 1
>> Finish sample: 26, sequence 0
>> Finish sample: 26, sequence 1
>> Finish sample: 27, sequence 0
>> Finish sample: 27, sequence 1
>> Finish sample: 28, sequence 0
>> Finish sample: 28, sequence 1
>> Finish sample: 29, sequence 0
>> Finish sample: 29, sequence 1
>> Finish sample: 30, sequence 0
>> Finish sample: 30, sequence 1
>> Finish sample: 31, sequence 0
>> Finish sample: 31, sequence 1
>> Finish sample: 32, sequence 0
>> Finish sample: 32, sequence 1
>> Finish sample: 33, sequence 0
>> Finish sample: 33, sequence 1
Processing scene 662 ...
The first four scenes all have 34 samples and are 411, 662, 225, 2, in that order (I think you just missed 225, because the script you provided here also gives this order of scenes). Sadly, the loading is quite slow for me as well, but I am running your shortened script to find out the total sample count. Right now everything points to some error during the preprocessing that I missed and that caused scenes to be dropped. I just realized (again) that nuScenes is quite memory hungry. I had some other applications running and had also automatically started the val and test preprocessing, so maybe the generation script was killed because of OOM. I have now stopped the other programs and will update this as soon as the scripts have finished (might take a day :/, tqdm would have been nice for this).
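On the progress-bar remark: tqdm is a separate package that would need to be installed. As a stand-in, a minimal sketch using only the standard library could look like the following. The helper `with_progress` is hypothetical and the ETA is just a linear extrapolation, so treat it as an illustration and not as part of the original script.

```python
import time


def with_progress(items, label="scene", every=1):
    """Yield items, printing a progress line with a rough ETA after each one."""
    items = list(items)
    total = len(items)
    start = time.time()
    for done, item in enumerate(items, start=1):
        yield item
        # Runs once the caller has finished processing `item`.
        if done % every == 0 or done == total:
            elapsed = time.time() - start
            eta = elapsed / done * (total - done)  # linear extrapolation
            print("[{}/{}] {} {} done, elapsed {:.0f}s, ETA {:.0f}s".format(
                done, total, label, item, elapsed, eta), flush=True)
```

In the script above, the scene loop could then be written as `for scene_idx in with_progress(res_scenes):` so that a run that stops early (for example after being killed) is visible from the last printed counter.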
Yes. The data loading is slow since the data reader of nuScenes is implemented in Python instead of C++.
Let me know if you successfully generate 17065 files.
It was a problem on my side, thanks for your help again. I now processed all scenes as expected. Closing this.
First of all, thank you for releasing your code and your great work. I have a short question regarding your MotionNet, as I am trying to reproduce your numbers. When I run the pre-processing script over the nuScenes folder, everything seems to work fine and the output also looks good, with a rough training dataset size of 19 GB. In your data/readme.md you report a total preprocessed training dataset size of 26.5 GB on your system. Is this difference realistic? Also, when I start the MGDA training with ST consistency loss as shown in the readme.md, the first warning I get is "The size of training dataset is not 17065", and shortly after I am told that my training dataset size is instead 6951. So a lot of numbers do not add up for me here: you have 17k samples in 26 GB, while I have less than half of the samples but still 19 GB, and where are the missing 10k samples? Maybe you can help me out here or have an idea of what is different?
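To make the mismatch concrete, the numbers quoted above can be turned into an average size per sample. This is only back-of-the-envelope arithmetic on the figures from this post (19 GB is a rough value, and 1 GB is taken as 1000 MB here). The function name `mb_per_sample` is made up for illustration.

```python
def mb_per_sample(total_gb, num_samples):
    """Average size per sample in MB, taking 1 GB = 1000 MB."""
    return total_gb * 1000.0 / num_samples


reported = mb_per_sample(26.5, 17065)  # figures from data/readme.md
observed = mb_per_sample(19.0, 6951)   # figures from my run
missing = 17065 - 6951                 # samples that were not generated

print("reported: {:.2f} MB/sample".format(reported))
print("observed: {:.2f} MB/sample".format(observed))
print("ratio:    {:.2f}".format(observed / reported))
print("missing:  {} samples".format(missing))
```

This gives roughly 1.55 MB per sample for the reported dataset and roughly 2.73 MB per sample for mine, so the disk usage and the sample count disagree with each other, and the sample count is probably the more reliable thing to check first.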
Everything I did was done in a venv with python3.6.9 and the required pip-dependencies on an Ubuntu 18.04 system. The command line I ran was taken directly from the readme.md, where I just replaced my used directories:
The starting output looks like the following:
When I now start a training with MGDA and ST consistency loss as described in the readme.md:
I get the following output:
So as you can see, there is no real problem with the preprocessing or the start of the training. However, having 10k samples missing compared to the published results makes reproducing the results impossible.
Also after some time the training actually fails, but I cannot tell if it is related to this issue (I am not a pickle expert):