Hi @BIT-MJY ,
We used a short sequence length of 5 (L) for most experiments, with a weight window of size 3 (w). A larger w (for a larger L) increases the number of model parameters, which can lead to poor performance if your dataset is not large enough. The inverse might also hold: a larger dataset with a smaller network, which could explain your findings.
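For intuition, here is a minimal numpy sketch of the SeqNet-style Conv+SAP+L2-Norm forward pass with the L and w above. This is a simplification, not the actual implementation: the real model learns per-channel convolutional filters (with bias), whereas this toy version shares one w-tap temporal filter across all descriptor dimensions.

```python
import numpy as np

def seqnet_descriptor(seq, weights):
    """Toy SeqNet-style forward pass: temporal Conv + SAP + L2-Norm.

    seq:     (L, D) array of L single-frame descriptors of dim D.
    weights: (w,) temporal filter, shared across all D channels
             (a simplification; the real model learns per-channel filters).
    """
    L, D = seq.shape
    # 1D temporal convolution along the sequence axis ('valid' mode
    # gives L - w + 1 output timesteps per channel)
    conv = np.stack(
        [np.convolve(seq[:, d], weights, mode="valid") for d in range(D)],
        axis=1,
    )  # shape: (L - w + 1, D)
    # Sequence Average Pooling (SAP): mean over the temporal axis
    pooled = conv.mean(axis=0)  # shape: (D,)
    # L2 normalization of the final sequential descriptor
    return pooled / np.linalg.norm(pooled)

# Example with the paper's settings: L=5 frames, w=3 window,
# and a toy descriptor dimension D=8
rng = np.random.default_rng(0)
seq = rng.standard_normal((5, 8))
desc = seqnet_descriptor(seq, np.ones(3) / 3)
print(desc.shape)  # (8,)
```

Note how w only enters through the filter length: growing w (and the per-channel filters of the real model) is exactly where the extra parameters come from.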
Since you mentioned a LiDAR use case, it could also be due to specific characteristics of this modality compared to RGB images. This might be of some help.
Significant distribution differences between the training and test set (e.g. due to different environments or condition types) affect performance (see Section V-D).
Performance does indeed depend on the underlying single-frame input descriptors, but the effect would likely be proportional; for example, see Table 3 on Page 10, where both NetVLAD and GMP benefit roughly similarly from SeqNet.
I will read this paper carefully. Thank you very much :)
Hello @oravus ,
Thanks for your fantastic work providing a nice way to fuse information from multiple frames. I am currently using only SeqNet (Conv+SAP+L2-Norm) without SeqMatch, trained on my own dataset for LiDAR-based place recognition. However, I found that SeqNet does not work well once seq_len is small (< 20), and it is hard to train a strong model. So I am guessing that whether SeqNet without SeqMatch works well largely depends either on the distributions of the training and test sets, or on the output of the raw descriptor generation algorithm. Is this right? Have you encountered any dataset where SeqNet does not work well?
Best wishes!