cvg / glue-factory

Training library for local feature detection and matching
Apache License 2.0
775 stars 99 forks

A failed attempt to use official PTH as the initial training value #49

Open Zhaoyibinn opened 11 months ago

Zhaoyibinn commented 11 months ago

Hello, first of all, thank you for open-sourcing the training code, which is very important for many of us developers.

At present, I want to fine-tune from the original official weights on my own small dataset. To do this, I changed the checkpoint loading around line 296 of gluefactory/train.py so that it reads the official weights, but training starts with a very large loss:

    if init_cp is not None:
        local_path = '~/3DRE/sfm-learn/SFM_OWN/feature/LightGlue/wight/superpoint_lightglue_v0-1_arxiv.pth'
        # read the official pth
        state_dict = torch.load(local_path, map_location='cpu')
        model.load_state_dict(state_dict, strict=False)
        # model.load_state_dict(init_cp["model"], strict=False)

After a few epochs of training the loss drops quickly, but the number of matches then drops quickly as well. Debugging showed that the mscore values had fallen severely: matches only appear when filter_threshold is lowered to 0.001 or below, which in turn reduces matching accuracy.

In addition, I noticed that the official PTH and the TAR checkpoint trained by following the official tutorial differ in both the names and the number of their keys, and I am not sure whether this is one of the causes.

Is the problem that the official PTH cannot be loaded directly for training, that the loss is constructed differently, or something else?
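Because `load_state_dict(..., strict=False)` skips any key it cannot match instead of raising an error, a naming mismatch can leave most of the network at its random initialization without any warning. A minimal sketch for checking this, assuming only that the checkpoint and the model both expose their parameters as name-to-tensor mappings. The helper `diff_keys` and the toy key names are hypothetical and not part of glue-factory or LightGlue:

```python
def diff_keys(checkpoint_keys, model_keys):
    """Compare parameter names in a checkpoint against those a model expects."""
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return {
        "matched": sorted(ckpt & model),
        "missing_in_checkpoint": sorted(model - ckpt),
        "unexpected_in_checkpoint": sorted(ckpt - model),
    }


# Toy names for illustration only; real names would come from
# state_dict.keys() and model.state_dict().keys().
checkpoint_keys = ["self_attn.0.Wqkv.weight", "cross_attn.0.to_qk.weight"]
model_keys = [
    "transformers.0.self_attn.Wqkv.weight",
    "transformers.0.cross_attn.to_qk.weight",
]

report = diff_keys(checkpoint_keys, model_keys)
print(len(report["matched"]), "of", len(model_keys), "model keys matched")
```

With a real model, the object returned by `model.load_state_dict(state_dict, strict=False)` carries `missing_keys` and `unexpected_keys` lists that give the same information.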

Zhaoyibinn commented 11 months ago

I have solved this problem, and the core issue is:

The official PTH, the TAR checkpoint trained by following the official tutorial, and the network structure used for inference all use different key names, so the keys need to be converted when loading.

The official conversion is already implemented in the LightGlue library, at around line 470 of lightglue.py:

    for i in range(self.conf.n_layers):
        pattern = f"self_attn.{i}", f"transformers.{i}.self_attn"
        state_dict = {k.replace(*pattern): v for k, v in state_dict.items()}
        pattern = f"cross_attn.{i}", f"transformers.{i}.cross_attn"
        state_dict = {k.replace(*pattern): v for k, v in state_dict.items()}

I followed this approach and manually checked the key names at each stage.
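The renaming loop quoted above could be pulled out into a standalone helper and applied to the loaded weights before `load_state_dict` in the training script. This is only a sketch under stated assumptions: `rename_official_keys`, its optional `prefix` argument, and the toy key names are hypothetical. It also uses a prefix match rather than `str.replace`, so that layer 1 cannot accidentally match layer 10. The names the training model actually expects still have to be checked against `model.state_dict().keys()`.

```python
def rename_official_keys(state_dict, n_layers, prefix=""):
    """Map keys starting with self_attn.i. / cross_attn.i. to transformers.i.<name>.

    prefix is a hypothetical extra prefix, for the case where the training
    model nests the matcher under another module name; leave it empty otherwise.
    """
    renamed = {}
    for key, value in state_dict.items():
        new_key = key
        for i in range(n_layers):
            for name in ("self_attn", "cross_attn"):
                old = f"{name}.{i}."
                if key.startswith(old):
                    new_key = f"transformers.{i}.{name}." + key[len(old):]
        renamed[prefix + new_key] = value
    return renamed


# Toy state dict for illustration; the values stand in for tensors.
toy = {
    "self_attn.0.Wqkv.weight": 1,
    "cross_attn.1.to_qk.bias": 2,
    "posenc.Wr.weight": 3,
}
converted = rename_official_keys(toy, n_layers=2)
print(sorted(converted))
```

Keys that do not start with one of the attention prefixes pass through unchanged, so anything still reported as missing after conversion points to a further naming difference that needs its own rule.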

ly0224 commented 10 months ago

Hello, I want to fine-tune on my own dataset. How should I do it, and what format should the dataset be in?

Zhaoyibinn commented 10 months ago

You can download the Homography or MegaDepth datasets used by the official project for reference (although in my own tests, fine-tuning did not work well).

ly0224 commented 10 months ago

I have more questions I would like to ask you. Would it be convenient to add me on QQ? 963170859

aidongmandexiaowowo commented 3 months ago

Hello, I also want to use the official PTH as the initial training value. You said that the keys need to be converted when loading. What format should the superpoint_lightglue_v0-1_arxiv.pth file be converted to, and how exactly should this be done?

zyxzyx45 commented 3 months ago

Hello, can you share the official pretrained weights? Thank you very much.