vita-epfl / UniTraj

A Unified Framework for scalable Vehicle Trajectory Prediction, ECCV 2024

Questions on Training Configuration and Evaluation Errors for MTR Model on nuScenes Dataset #35

Open · Platonight opened this issue 2 weeks ago

Platonight commented 2 weeks ago

Dear @Alan-LanFeng,

Thank you for your excellent work! I have two questions I’d like to discuss with you:

  1. I encountered a similar issue with the MTR model not achieving the expected results on the nuScenes dataset. From previous responses in the issues, I understand that I need to set two configurations:

Following this, I generated three sub-datasets using train, train_val, and val. Specifically:

# data related
load_num_workers: 0 # number of workers for loading data
train_data_path: ["/data/unScenes/result_train/"] # list of paths to the train data
val_data_path: ["/data/unScenes/result_train_val/"] # list of paths to the train_val data
max_data_num: [-1] # maximum number of data for each training dataset
past_len: 21 # history trajectory length, 2.1s
future_len: 60 # future trajectory length, 6s
object_type: ['VEHICLE'] # object types included in the training set
line_type: ['lane', 'stop_sign', 'road_edge', 'road_line', 'crosswalk', 'speed_bump'] # line types to consider in input
masked_attributes: ['z_axis', 'size'] # attributes to mask in input
trajectory_sample_interval: 1 # trajectory sample interval
only_train_on_ego: False # only train on AV
center_offset_of_map: [30.0, 0.0] # map center offset
use_cache: False # enable data loading cache
overwrite_cache: False # overwrite cache if exists
store_data_in_memory: False # store data in memory

# official evaluation
nuscenes_dataroot: "/data/sets/nuscenes/"
eval_nuscenes: False # evaluate with nuscenes evaluation tool
eval_waymo: False # evaluate with waymo evaluation tool

defaults:

# data related
load_num_workers: 0 # number of workers for loading data
val_data_path: ["/data/unScenes/result_val/"] # list of paths to val data
max_data_num: [-1] # maximum number of data for each dataset
past_len: 21 # history trajectory length, 2.1s
future_len: 60 # future trajectory length, 6s
object_type: ['VEHICLE'] # object types in training set
line_type: ['lane', 'stop_sign', 'road_edge', 'road_line', 'crosswalk', 'speed_bump'] # line types in input
masked_attributes: ['z_axis', 'size'] # attributes to mask in input
trajectory_sample_interval: 1 # trajectory sample interval
only_train_on_ego: False # only train on AV
center_offset_of_map: [30.0, 0.0] # map center offset
use_cache: False # data loading cache
overwrite_cache: False # overwrite cache if exists
store_data_in_memory: False # store data in memory

# official evaluation
nuscenes_dataroot: "/data/sets/nuscenes/"
eval_nuscenes: False # evaluate with nuscenes evaluation tool
eval_waymo: False # evaluate with waymo evaluation tool

defaults:

# common
model_name: MTR

# model
CONTEXT_ENCODER:
  NAME: MTREncoder
  NUM_OF_ATTN_NEIGHBORS: 7
  NUM_INPUT_ATTR_AGENT: 39
  NUM_INPUT_ATTR_MAP: 29
  NUM_CHANNEL_IN_MLP_AGENT: 256
  NUM_CHANNEL_IN_MLP_MAP: 64
  NUM_LAYER_IN_MLP_AGENT: 3
  NUM_LAYER_IN_MLP_MAP: 5
  NUM_LAYER_IN_PRE_MLP_MAP: 3
  D_MODEL: 256
  NUM_ATTN_LAYERS: 6
  NUM_ATTN_HEAD: 8
  DROPOUT_OF_ATTN: 0.1
  USE_LOCAL_ATTN: True

MOTION_DECODER:
  NAME: MTRDecoder
  NUM_MOTION_MODES: 6
  INTENTION_POINTS_FILE: 'models/mtr/cluster_64_center_dict_6s.pkl'
  D_MODEL: 512
  NUM_DECODER_LAYERS: 6
  NUM_ATTN_HEAD: 8
  MAP_D_MODEL: 256
  DROPOUT_OF_ATTN: 0.1
  NUM_BASE_MAP_POLYLINES: 256
  NUM_WAYPOINT_MAP_POLYLINES: 128
  LOSS_WEIGHTS: {
    'cls': 1.0,
    'reg': 1.0,
    'vel': 0.5
  }

  NMS_DIST_THRESH: 2.5

# train
max_epochs: 60
learning_rate: 0.0001
learning_rate_sched: [ 22, 24, 26, 28 ]
optimizer: AdamW
scheduler: lambdaLR
grad_clip_norm: 1000.0
weight_decay: 0.01
lr_decay: 0.5
lr_clip: 0.000001
WEIGHT_DECAY: 0.01
train_batch_size: 48 #32 #128
eval_batch_size: 48 #32 #128

# data related
max_num_agents: 64
map_range: 100
max_num_roads: 768

# will be overwritten if manually_split_lane is True
max_points_per_lane: 20

manually_split_lane: True
point_sampled_interval: 1
num_points_each_polyline: 20
vector_break_dist_thresh: 1.0
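
For context, my rough understanding of what the last few lane-splitting parameters control is sketched below. This is only my own illustration, not UniTraj's actual preprocessing code: a map polyline is broken wherever two consecutive points are farther apart than vector_break_dist_thresh, and each resulting piece is then chunked into groups of at most num_points_each_polyline points.

import numpy as np

def split_polyline(points, break_dist_thresh=1.0, points_per_polyline=20):
    # points: (N, 2) array of x, y coordinates along one map polyline
    # break the polyline where consecutive points are too far apart
    gaps = np.linalg.norm(np.diff(points, axis=0), axis=1)
    break_idx = np.where(gaps > break_dist_thresh)[0] + 1
    pieces = np.split(points, break_idx)
    # chunk each piece into fixed-size polylines
    chunks = []
    for piece in pieces:
        for start in range(0, len(piece), points_per_polyline):
            chunks.append(piece[start:start + points_per_polyline])
    return chunks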

However, I’m a bit confused by your previous response mentioning that “we have not touched the ‘train_val’ split.” Moreover, our MTR model is still not reaching the results reported in the paper. Could you clarify whether there is an issue with my current setup, or whether there are any other adjustments I need to make?

  2. When attempting to use the official evaluation, I set eval_nuscenes to True and pointed nuscenes_dataroot to the raw nuScenes data, but encountered an error:

File "/UniTraj/unitraj/models/base_model/base_model.py", line 157, in compute_official_evaluation 'instance': input_dict['scenario_id'][bsidx].split('')[1], IndexError: list index out of range

After inspection, I found that input_dict['scenario_id'][0] contains only 'scene-0233', without the expected additional parts for 'instance' (scenario_id.split('_')[1]) and 'sample' (scenario_id.split('_')[2]). It appears this stems from the unified data format used in the UniTraj batch dictionary. Could you advise if there’s an adjustment I might be missing here?
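
To make the mismatch concrete, here is a minimal sketch of what I believe is happening. The expected format and the token names are my own assumptions based on the traceback, not taken from the source:

# what compute_official_evaluation seems to expect: scene, instance and sample
# tokens packed into scenario_id and separated by underscores (hypothetical format)
scenario_id_expected = "scene-0233_instanceToken_sampleToken"
print(scenario_id_expected.split('_')[1])   # 'instanceToken'
print(scenario_id_expected.split('_')[2])   # 'sampleToken'

# what the unified UniTraj batch actually contains in my case
scenario_id_actual = "scene-0233"
print(scenario_id_actual.split('_'))        # ['scene-0233']
print(scenario_id_actual.split('_')[1])     # IndexError: list index out of range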

Thank you very much for your assistance!

Let me know if you need any further details from my side!

Alan-LanFeng commented 2 weeks ago

Hi, your configuration looks good.

nuScenes has 3 splits: train, train_val, and val. We train MTR on train and validate on val. There are 32k samples in train and 9k samples in val; can you double-check this?
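
If it helps, a quick check like the one below should show how many preprocessed samples ended up in each split. I'm assuming here that each sample is written as its own file under the output directories; adjust the paths and pattern if your preprocessing writes a different layout.

from pathlib import Path

# count preprocessed files per split (assumes one file per sample)
for split_dir in ["/data/unScenes/result_train/", "/data/unScenes/result_val/"]:
    n_files = sum(1 for p in Path(split_dir).rglob("*") if p.is_file())
    print(split_dir, n_files)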

ChengkaiYang commented 2 weeks ago

> Hi, your configuration looks good.
>
> nuScenes has 3 splits: train, train_val, and val. We train MTR on train and validate on val. There are 32k samples in train and 9k samples in val; can you double-check this?

Sorry, Alan. When I use UniTraj to preprocess the nuScenes 'train_val' split, it only has 1136 scenes, which yields only about 1136 trajectories (far from 32k). I'm using the default config file. Could you please help me figure out what the problem is? Do the 32k trajectories include the surrounding agents of the focal agent in a scenario? I found that we only train on the focal agent, so if that's correct I'm only training on 1136 trajectories for the nuScenes dataset. Some of my experimental results may support this guess: I got almost the same brier-FDE6 of 3.51 as in the paper. Looking forward to your reply!