microsoft / VideoX

VideoX: a collection of video cross-modal models

About the performance of UAV123 in SeqTrack. #91

Closed Tchuanm closed 1 year ago

Tchuanm commented 1 year ago

Hi Xin, thanks for your novel work. Following the same settings as in your repo, I evaluated SeqTrack-B256 on UAV123 and got the results below:

| uav | AUC | OP50 | OP75 | Precision | Norm Precision |
|---|---|---|---|---|---|
| seqtrack_b256 | 68.62 | 83.98 | 62.59 | 89.40 | 84.25 |

These numbers are slightly lower (by about 0.5 points) than what I get when evaluating the results in your raw_result folder:

| uav | AUC | OP50 | OP75 | Precision | Norm Precision |
|---|---|---|---|---|---|
| seqtrack_b256 | 69.15 | 84.68 | 63.06 | 89.98 | 84.78 |

I have not changed anything after cloning the repo. Do you have any idea what causes this?

chenxin-dlut commented 1 year ago

We found that the same matrix multiplication can give slightly different results on different devices, which leads to the small difference in scores.

This difference is usually negligible in common tasks, but it accumulates more in sequence modeling (since the prediction is causal, each predicted token depends on the previous ones), resulting in slightly different results.
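To make the mechanism concrete, here is a minimal toy sketch in plain Python. It is not SeqTrack code: `dot`, `decode`, the threshold, and the token update rule are all hypothetical stand-ins. It only illustrates two things: floating-point accumulation order (which can differ between devices or kernels) changes a dot product by a tiny amount, and once that value is turned into a discrete token that later tokens depend on, the whole decoded sequence can diverge.

```python
def dot(a, b, reverse=False):
    """Dot product accumulated left-to-right, or right-to-left if reverse=True.

    The two orders are mathematically equal, but floating-point addition is
    not associative, so the results can differ in the last bits.
    """
    pairs = list(zip(a, b))
    if reverse:
        pairs.reverse()
    total = 0.0
    for x, y in pairs:
        total += x * y
    return total


def decode(first_logit, threshold=0.6, steps=4):
    """Toy causal decoder (hypothetical): every token depends on the previous one."""
    # Discretization step: a tiny numeric change can flip this decision.
    token = 1 if first_logit > threshold else 0
    tokens = [token]
    for _ in range(steps - 1):
        # Arbitrary stand-in for "the next token is a function of the previous token".
        token = (3 * token + 1) % 7
        tokens.append(token)
    return tokens


weights = [0.1, 0.2, 0.3]
inputs = [1.0, 1.0, 1.0]

logit_a = dot(weights, inputs)                # one accumulation order
logit_b = dot(weights, inputs, reverse=True)  # the other order

print("equal:", logit_a == logit_b, "abs diff:", abs(logit_a - logit_b))
print("sequence A:", decode(logit_a))
print("sequence B:", decode(logit_b))
```

In a real tracker the effect is far milder than in this toy, since most frames are not near a decision boundary, which would be consistent with the roughly 0.5-point gap reported above rather than a large one.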

Tchuanm commented 1 year ago

Thanks for your response. Another thing I am interested in is the training time. Could you share roughly how many hours (or days) training takes (i.e., SeqTrack-B256 on the 4 datasets)? That would help me estimate my development cycle.