This is the code associated with the paper Predicting Semantic Map Representations from Images with Pyramid Occupancy Networks, published at CVPR 2020.
In our work we report results on two large-scale autonomous driving datasets: NuScenes and Argoverse. The birds-eye-view ground truth labels we use to train and evaluate our networks are generated by combining map information provided by the two datasets with 3D bounding box annotations, which we rasterise to produce a set of one-hot binary labels. We also make use of LiDAR point clouds to infer regions of the birds-eye-view which are completely occluded by buildings or other objects.
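For intuition, the core of the label-generation step amounts to rasterising each annotated box footprint into a per-class binary occupancy grid. The sketch below illustrates this idea in NumPy; the `(center, size, yaw, class_id)` box representation, grid extents and resolution are illustrative assumptions rather than the repository's actual interface, and the LiDAR-based occlusion masking is omitted.

```python
# Minimal sketch of the box-rasterisation idea, NOT the repository's code.
# Assumes boxes given as (center, size, yaw, class_id) and an illustrative
# birds-eye-view grid of (x_min, z_min, x_max, z_max) metres.
import numpy as np

def box_footprint(center, size, yaw):
    """Return the four birds-eye-view corners of a box footprint (CCW order)."""
    length, width = size
    corners = np.array([[-length / 2, -width / 2],
                        [ length / 2, -width / 2],
                        [ length / 2,  width / 2],
                        [-length / 2,  width / 2]])
    rot = np.array([[np.cos(yaw), -np.sin(yaw)],
                    [np.sin(yaw),  np.cos(yaw)]])
    return corners @ rot.T + np.asarray(center)

def fill_polygon(mask, polygon, extents, resolution):
    """Set grid cells whose centres lie inside a convex CCW polygon."""
    x_min, z_min = extents[0], extents[1]
    depth, width = mask.shape
    xs = x_min + (np.arange(width) + 0.5) * resolution
    zs = z_min + (np.arange(depth) + 0.5) * resolution
    gx, gz = np.meshgrid(xs, zs)
    inside = np.ones_like(gx, dtype=bool)
    for i in range(len(polygon)):
        ax, az = polygon[i]
        bx, bz = polygon[(i + 1) % len(polygon)]
        # Interior points lie to the left of every edge of a CCW polygon
        inside &= (bx - ax) * (gz - az) - (bz - az) * (gx - ax) >= 0
    mask |= inside

def rasterise_boxes(boxes, num_classes, extents=(-25., 1., 25., 50.), resolution=0.25):
    """Rasterise box footprints into one binary occupancy mask per class."""
    x_min, z_min, x_max, z_max = extents
    shape = (int((z_max - z_min) / resolution), int((x_max - x_min) / resolution))
    masks = np.zeros((num_classes,) + shape, dtype=bool)
    for center, size, yaw, class_id in boxes:
        fill_polygon(masks[class_id], box_footprint(center, size, yaw), extents, resolution)
    return masks

# Example: one car-class box centred 10m ahead of the camera, slightly rotated
labels = rasterise_boxes([((0.0, 10.0), (4.5, 2.0), 0.3, 0)], num_classes=2)
```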
To train our method on NuScenes you will first need to:

1. Change to the `mono-semantic-maps` directory.
2. Edit the `configs/datasets/nuscenes.yml` file, setting the `dataroot` and `label_root` entries to the location of the NuScenes dataset and the desired ground truth folder respectively.
3. Run the label generation script: `python scripts/make_nuscenes_labels.py`. Be warned: there's a lot of data, so this will take a few hours to run!
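For reference, the two entries you edit in step 2 might look like the following; the paths are placeholders, and any other keys in the file should be left unchanged. The Argoverse config described below follows the same pattern.

```yaml
# configs/datasets/nuscenes.yml -- paths below are placeholders
dataroot: /path/to/nuscenes            # root of the NuScenes dataset
label_root: /path/to/nuscenes-labels   # folder where generated ground truth is written
```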
To train on the Argoverse dataset:
1. Change to the `mono-semantic-maps` directory.
2. Edit the `configs/datasets/argoverse.yml` file, setting the `dataroot` and `label_root` entries to the location of the installed Argoverse data and the desired ground truth folder respectively.
3. Run the label generation script: `python scripts/make_argoverse_labels.py`. This script will also take a while to run!

Once ground truth labels have been generated, you can train our method by running the `train.py`
script in the root directory:
```
python train.py --dataset nuscenes --model pyramid
```
The `--dataset` flag allows you to specify the dataset to train on: either `'argoverse'` or `'nuscenes'`. The `--model` flag allows training of the proposed method (`'pyramid'`) or one of the baseline methods (`'vpn'` or `'ved'`). Additional command line options can be specified by passing a list of key-value pairs to the `--options` flag. The full list of configurable options can be found in the `configs/defaults.yml` file.
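As a rough illustration of the `--options` syntax, an invocation might look like the line below. The option names `batch_size` and `num_epochs` are hypothetical here; consult `configs/defaults.yml` for the actual keys and the expected key-value format.

```
python train.py --dataset nuscenes --model pyramid --options batch_size 4 num_epochs 100
```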