yuantianyuan01 / StreamMapNet


Actual iter to start sequential sampling training? #32

Open Song-Jingyu opened 2 months ago

Song-Jingyu commented 2 months ago

Hi,

Thanks for this amazing work! When re-running the experiments on the nuScenes new split with 8 GPUs and batch size 4, I noticed that sequential data actually starts around iteration ~4200, which differs from the value set in the config (it should be 3480) for `InfiniteGroupEachSampleInBatchSampler`. Could you suggest why the actual training behavior is inconsistent with the configuration? Thanks!

yuantianyuan01 commented 2 months ago

Hi, thank you for your interest in our project.

Can you provide your training log? I did not encounter this problem.

Song-Jingyu commented 2 months ago

Hi,

I attached the training log. I inferred whether temporal training was active from the trans_loss: from what I can tell, no sequential frames are loaded until ~4200 iterations (they are basically all first frames). Thanks for your help in looking into this! output.log

Best, Jingyu

yuantianyuan01 commented 2 months ago

Thank you for reporting the issue. It turns out to be a bug that may randomly delay the start of sequential sampling by up to one epoch. To fix it, replace the line https://github.com/yuantianyuan01/StreamMapNet/blob/000ce22f2fae6a1798a57471f820b8232ae76a74/plugin/datasets/samplers/group_sampler.py#L274 with `yield [sample_ids[shuffled[0]]]`.
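For readers following along, here is a minimal, self-contained sketch of how a warmup loop of this shape can misbehave; the names (`warmup_batches`, `sample_ids`, `num_warmup_iters`) are illustrative stand-ins, not the actual code in group_sampler.py:

```python
import torch

def warmup_batches(sample_ids, num_warmup_iters, seed=0):
    """Illustrative sketch: yield one randomly chosen first-frame sample
    per warmup iteration before sequential sampling takes over."""
    g = torch.Generator()
    g.manual_seed(seed)
    for _ in range(num_warmup_iters):
        # Fresh permutation each iteration; take exactly one sample from it.
        shuffled = torch.randperm(len(sample_ids), generator=g).tolist()
        # A buggy variant that yields one batch per *element* of the
        # permutation, e.g. `yield from ([sample_ids[j]] for j in shuffled)`,
        # emits up to len(sample_ids) warmup batches per pass, pushing the
        # start of sequential sampling back by up to one extra epoch.
        yield [sample_ids[shuffled[0]]]  # fixed: one sample per iteration
```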

Song-Jingyu commented 2 months ago

Thanks for checking this issue! May I ask whether the results reported in the paper come from experiments that include this bug?

yuantianyuan01 commented 2 months ago

The reported results do include the bug, but its impact should be minimal, as it only delays the start of sequential sampling by up to one epoch.