swc-17 / SparseDrive

SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation

training with only map head failed #32

Closed ASONG0506 closed 1 month ago

ASONG0506 commented 1 month ago

Hello, during training I enabled only the map head module, and the loss stopped decreasing after dropping to around 6. I am using 4*3090 GPUs with a batch size of 64 and an initial learning rate (lr_init) of 4e-4. Attached is my training log. Could you help me identify the reason? 20240823_155044.log

After that, I set lr_init=1e-4 instead; here is the log, with the same result as above: 20240826_183830.log
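As a side note, lr_init is usually chosen relative to the total batch size. A minimal sketch of the linear scaling rule follows, assuming the 4e-4 / batch-size-64 pairing from this thread as the reference point (that pairing is an assumption, not a documented SparseDrive default):

```python
# Linear-scaling sketch for picking lr_init when the total batch size changes.
# The reference values (4e-4 at batch size 64) are taken from this thread and
# assumed to be the baseline; they are not confirmed repository defaults.
ref_lr = 4e-4
ref_batch_size = 64
total_batch_size = 64          # 4 x 3090 GPUs in this report
lr_init = ref_lr * total_batch_size / ref_batch_size
print(lr_init)                 # 4e-4 here; dropping to 1e-4 is a further manual reduction
```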

And here are my clustered map anchor and det anchor visualization results. Are they normal? map_anchor_100 det_anchor_900

swc-17 commented 1 month ago

The visualization results are normal. Can you give a detailed description of your modifications to the config file and code?

ASONG0506 commented 1 month ago

@swc-17 I tried to train using the stage1 config file, running only stage 1 instead of both stages, because I wanted to train and validate the perception module first. I didn't actually change the code; in the stage1 config I only modified task_config, setting with_det=False (it was True) while keeping with_map=True and with_motion_plan=False, i.e. disabling the detection training part, as shown below. The learning rate was changed from 4e-4 to 1e-4. Everything else remained largely unchanged.
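For clarity, here is the described edit to the stage1 config laid out as a fragment; the surrounding config is unchanged and only with_det is flipped:

```python
# Stage1 config fragment quoted above: train only the map head.
task_config = dict(
    with_det=False,        # originally True; detection branch disabled
    with_map=True,
    with_motion_plan=False,
)
```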

ASONG0506 commented 1 month ago

When I turn on the training of object detection, the loss of the mapping module looks normal.

swc-17 commented 1 month ago

This is interesting; we previously trained a map-only model and it converged normally.

ASONG0506 commented 1 month ago

@swc-17 I added some auxiliary losses, and now the training converges normally.
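The thread does not say which auxiliary losses were added. Purely as an illustration, an auxiliary term is typically summed into the map-head loss with a weight, along the lines of the hypothetical sketch below (map_losses, aux_losses, and aux_weight are placeholder names, not SparseDrive code):

```python
import torch

def combine_map_losses(map_losses: dict, aux_losses: dict, aux_weight: float = 0.5) -> torch.Tensor:
    """Hypothetical illustration: add weighted auxiliary terms to the map-head loss.

    map_losses / aux_losses map loss names to scalar tensors; the weighting
    scheme is illustrative only and is not the exact change made in this issue.
    """
    total = sum(map_losses.values())
    total = total + aux_weight * sum(aux_losses.values())
    return total
```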