open-mmlab / mmtracking

OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), Video Instance Segmentation (VIS) with a unified framework.
https://mmtracking.readthedocs.io/en/latest/
Apache License 2.0

Parameters and variables setting in DFF model #653

Closed yan811 closed 2 years ago

yan811 commented 2 years ago

In "/mmtracking/mmtrack/models/motion/flownet_simple.py", the init parameters are "flow_img_norm_std=[255.0, 255.0, 255.0]" and "flow_img_norm_mean=[0.411, 0.432, 0.450]". What do these parameters mean? My data has 10 channels; how should I set them?

Also, in the "prepare_imgs" method, "img_metas[0]['img_norm_cfg']['mean']" and "img_metas[0]['img_norm_cfg']['std']" are both initialized with 0. Is it necessary to reassign these values during training or testing? If so, how, and what values should I assign to these variables?

dyhBUPT commented 2 years ago

Hi, the std and mean are used to normalize the input image.

How to set these parameters depends on what the 10 channels of your data are.

yan811 commented 2 years ago

I got it. Is there an API to calculate the std and mean of my dataset (cocovid format)?

dyhBUPT commented 2 years ago

I'm sorry, but I don't know of one.
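
For readers with the same question: the thread does not name a built-in helper, so the following is only a minimal NumPy sketch of computing per-channel statistics by hand. The function name `channel_mean_std` is hypothetical, and it assumes the frames have already been loaded as arrays of shape (H, W, C); loading them from a cocovid annotation file is left out.

```python
import numpy as np


def channel_mean_std(images):
    """Return per-channel mean and std over an iterable of (H, W, C) arrays.

    Hypothetical helper, not part of mmtracking. It accumulates sums and
    sums of squares so the whole dataset never has to sit in memory.
    """
    total = None
    total_sq = None
    count = 0
    for img in images:
        arr = np.asarray(img, dtype=np.float64)
        pixels = arr.reshape(-1, arr.shape[-1])  # (H * W, C)
        if total is None:
            total = np.zeros(pixels.shape[1])
            total_sq = np.zeros(pixels.shape[1])
        total += pixels.sum(axis=0)
        total_sq += (pixels ** 2).sum(axis=0)
        count += pixels.shape[0]
    if count == 0:
        raise ValueError("no images were provided")
    mean = total / count
    # Var[x] = E[x^2] - E[x]^2; clamp tiny negatives caused by rounding.
    std = np.sqrt(np.maximum(total_sq / count - mean ** 2, 0.0))
    return mean, std


# Usage with synthetic 10-channel frames standing in for real data.
rng = np.random.default_rng(0)
frames = [rng.uniform(0, 255, size=(4, 5, 10)) for _ in range(3)]
mean, std = channel_mean_std(frames)
print(mean.shape, std.shape)
```

The result is one mean and one std per channel, which is the shape the normalization parameters discussed in this thread expect.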

yan811 commented 2 years ago

All right. Thanks.

yan811 commented 2 years ago

I am still not clear. In flownet_simple.py, there are two sets of std/mean: flow_img_norm_std/flow_img_norm_mean, and img_metas['img_norm_cfg']['mean']/img_metas['img_norm_cfg']['std']. I think the latter refer to the input dataset, but I'm still wondering what the former refer to.

dyhBUPT commented 2 years ago

Hi, FlowNet has its own mean and std for normalization, and the effect of img_norm_cfg is cancelled out. For more details, you can read this code: https://github.com/open-mmlab/mmtracking/blob/bcc0c9b3a3062c770457989e240cb55c3b667875/mmtrack/models/motion/flownet_simple.py#L183-L191 I hope this helps.

Best wishes.

yan811 commented 2 years ago

Normalization is usually calculated as "(img-img_mean)/img_std", but in lines 183-184 the data is transformed with "img/std-mean", which is not the same form as normalization. Does this transformation have some other meaning?

Also, in this code, self.flow_img_norm_std=[255.0, 255.0, 255.0] and self.flow_img_norm_mean=[0.411, 0.432, 0.450]. They both have 3 channels, but my 10-channel data requires a 10-channel flow_img_norm_std and a 10-channel flow_img_norm_mean. How should I set these params?

dyhBUPT commented 2 years ago

Line 183, img * std + mean, is used to cancel out the effect of (img - mean) / std. Line 184, img / std - mean, is OK because the mean is in [0, 1], not [0, 255]. In summary, it is a simple mathematical manipulation.

For your 10-channel data, maybe you can calculate its statistics for each channel (i.e., mean and std) and set them as the norm params.
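
The two steps described above can be checked numerically. This is a sketch of the arithmetic only, not the library's code: the function name `to_flow_input` is hypothetical, the `img_mean` and `img_std` values are assumed for illustration, and arrays are channel-last so that NumPy broadcasting lines up. The flow values are the defaults quoted in this thread.

```python
import numpy as np


def to_flow_input(normed, img_mean, img_std, flow_mean, flow_std):
    """Undo the dataset normalization, then apply the flow network's own."""
    raw = normed * img_std + img_mean   # cancels (img - mean) / std
    return raw / flow_std - flow_mean   # the "img / std - mean" form


# Assumed dataset normalization (illustrative values only).
img_mean = np.array([123.675, 116.28, 103.53])
img_std = np.array([58.395, 57.12, 57.375])
# Defaults quoted in this thread for the flow network.
flow_mean = np.array([0.411, 0.432, 0.450])
flow_std = np.array([255.0, 255.0, 255.0])

raw = np.random.default_rng(0).uniform(0, 255, size=(2, 2, 3))
normed = (raw - img_mean) / img_std
flow_in = to_flow_input(normed, img_mean, img_std, flow_mean, flow_std)

# Dividing by 255 maps pixels into [0, 1], the same scale as flow_mean,
# so the subtraction happens in consistent units.
assert np.allclose(flow_in, raw / 255.0 - flow_mean)
```

Because the first step restores the raw pixel values, the result does not depend on which img_norm_cfg the data pipeline used.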

yan811 commented 2 years ago

I think my data's statistics are supposed to set self.img_norm_std and self.img_norm_mean. But as mentioned before, flow_img_norm_std and flow_img_norm_mean are the statistics of FlowNet. How do I set the flow_img_norm_* params for 10 channels?

dyhBUPT commented 2 years ago

Here is an example for one channel. Suppose that mean=100 and std=5. Then you can set:

img_norm_mean = 100
img_norm_std = 5
flow_img_norm_mean = 20  # i.e., 100 / 5
flow_img_norm_std = 5

Note that I'm not sure if this works. You can study it in detail.

yan811 commented 2 years ago

In your example, img_norm_mean = mean, img_norm_std = std, flow_img_norm_std = img_norm_std = std, and flow_img_norm_mean = img_norm_mean / img_norm_std. Is that right?

dyhBUPT commented 2 years ago

Yes, but please use it with caution, because I'm not so confident about your case.
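
The mapping confirmed above can be sketched for any number of channels. The helper name `flow_norm_params` and the 10-channel statistics below are hypothetical. The sketch only verifies the arithmetic, namely that x / std - mean / std equals (x - mean) / std; it says nothing about whether a flow network behaves well on such data, which this thread leaves open.

```python
import numpy as np


def flow_norm_params(channel_mean, channel_std):
    """Map per-channel data statistics to (flow_img_norm_mean, flow_img_norm_std).

    Follows the rule from this thread: flow std = std, flow mean = mean / std.
    """
    channel_mean = np.asarray(channel_mean, dtype=np.float64)
    channel_std = np.asarray(channel_std, dtype=np.float64)
    return channel_mean / channel_std, channel_std


# The one-channel example from the thread: mean=100, std=5.
flow_mean, flow_std = flow_norm_params([100.0], [5.0])
assert np.allclose(flow_mean, [20.0])
assert np.allclose(flow_std, [5.0])

# Made-up 10-channel statistics, for illustration only.
mean10 = np.linspace(50.0, 140.0, 10)
std10 = np.linspace(5.0, 14.0, 10)
flow_mean10, flow_std10 = flow_norm_params(mean10, std10)

# "x / std - mean" with these params matches the usual (x - mean) / std.
x = np.random.default_rng(0).uniform(0, 255, size=(3, 3, 10))
assert np.allclose(x / flow_std10 - flow_mean10, (x - mean10) / std10)
```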

yan811 commented 2 years ago

I will try it. Thanks.