yuantianyuan01 / StreamMapNet

GNU General Public License v3.0

Details about the training #28

Open lebron-2016 opened 2 months ago

lebron-2016 commented 2 months ago

Hi,

Could you please elaborate on how you train the network? For example, how many time steps does a batch contain? Do you cut a full video into segments and train on them sequentially?

In addition, I noticed that you process each sample in the batch separately in the stream_fusion_neck part. What would happen if the samples in a batch were processed in parallel?

Thanks.

yuantianyuan01 commented 2 months ago

Hi, thanks for your interest in our work!

For your first question, a batch contains multiple samples, where each sample represents a single frame of a video. Each video is randomly split into two segments, and each segment is trained sequentially. For example, if the batch size is 2, the batches would look like this:

[1st frame of video 1, 1st frame of video 2],
[2nd frame of video 1, 2nd frame of video 2],
...
[10th frame of video 1, 10th frame of video 2],
[1st frame of video 3, 11th frame of video 2] (here video 1 is finished and another segment comes in),
...

Therefore, for your second question, we need to check whether each sample in the batch is the first frame of a new segment (to refresh the hidden states), which means we cannot process them in parallel.
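The per-sample check described above can be sketched as follows. This is a minimal illustration, not the repository's stream_fusion_neck: the function name `fuse_batch`, the `is_first_frame` flags, and the scalar "features" are all hypothetical, and the simple weighted blend stands in for whatever fusion the model actually uses.

```python
from typing import List, Optional


def fuse_batch(
    features: List[float],
    is_first_frame: List[bool],
    hidden: List[Optional[float]],
    momentum: float = 0.5,
) -> List[float]:
    """Return one updated hidden state per batch slot.

    A slot whose sample starts a new segment discards its old state;
    every other slot blends the carried state with the current feature.
    """
    new_hidden = []
    for feat, first, prev in zip(features, is_first_frame, hidden):
        if first or prev is None:
            # New segment: the state left over from the previous segment is stale.
            fused = feat
        else:
            # Same segment: carry temporal information forward (placeholder blend).
            fused = momentum * prev + (1.0 - momentum) * feat
        new_hidden.append(fused)
    return new_hidden


# Batch size 2: both slots start a segment, then continue, then slot 0 restarts.
hidden = [None, None]
hidden = fuse_batch([1.0, 2.0], [True, True], hidden)
hidden = fuse_batch([3.0, 4.0], [False, False], hidden)
hidden = fuse_batch([10.0, 6.0], [True, False], hidden)
```

In the last call, slot 0 ignores its previous state while slot 1 keeps accumulating, which is why the decision has to be made per sample rather than once for the whole batch.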

lebron-2016 commented 2 months ago

Thanks for your quick reply!! I get it now.

I would also like to study the code for data preprocessing and loading. Could you point me to the relevant link?

Thanks a lot!!

yuantianyuan01 commented 2 months ago

The most important part is the batch sampler in https://github.com/yuantianyuan01/StreamMapNet/blob/main/plugin/datasets/samplers/group_sampler.py#L178.
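For readers who want the idea before opening the file, here is a rough sketch of how such a streaming batch sampler could produce the batch layout described earlier in this thread. It is not the code in group_sampler.py: `split_video`, `stream_batches`, and the `(video id, frame index)` representation are assumptions made for illustration, and the incomplete final batch is simply dropped.

```python
import random
from typing import Dict, Iterator, List, Tuple

Frame = Tuple[str, int]  # (video id, frame index within that video)


def split_video(video_id: str, num_frames: int, rng: random.Random) -> List[List[Frame]]:
    """Split one video (at least 2 frames) into two segments at a random cut."""
    cut = rng.randint(1, num_frames - 1)  # both segments are non-empty
    frames = [(video_id, i) for i in range(num_frames)]
    return [frames[:cut], frames[cut:]]


def stream_batches(
    videos: Dict[str, int], batch_size: int, seed: int = 0
) -> Iterator[List[Tuple[Frame, bool]]]:
    """Yield batches in which slot k always continues the same segment.

    Each item is (frame, is_first_frame_of_segment).
    """
    rng = random.Random(seed)
    segments: List[List[Frame]] = []
    for video_id, num_frames in videos.items():
        segments.extend(split_video(video_id, num_frames, rng))
    rng.shuffle(segments)

    pending = iter(segments)
    slots = [iter(()) for _ in range(batch_size)]  # one frame stream per slot
    while True:
        batch = []
        for k in range(batch_size):
            frame = next(slots[k], None)
            is_first = False
            if frame is None:
                # This slot's segment is exhausted: pull in the next segment.
                segment = next(pending, None)
                if segment is None:
                    return  # no segments left; drop the incomplete batch
                slots[k] = iter(segment)
                frame = next(slots[k])
                is_first = True
            batch.append((frame, is_first))
        yield batch


for batch in stream_batches({"video1": 6, "video2": 5, "video3": 4}, batch_size=2):
    print(batch)
```

The `is_first` flag is the signal a temporal model would use to refresh the hidden state of that slot.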

lebron-2016 commented 2 months ago

Got it!! Thanks a lot.

If I split the sequences into fixed-length segments instead of variable-length ones, will it affect the training results?