HKUDS / GraphGPT

[SIGIR'2024] "GraphGPT: Graph Instruction Tuning for Large Language Models"
https://arxiv.org/abs/2310.13023
Apache License 2.0

A detail question about the contrastive learning #31

Closed. serendipity800 closed this issue 6 months ago.

serendipity800 commented 7 months ago

In the text-graph grounding code there is the following function:

    def cal_cl_loss(s_features, t_features, labels):
        logit_scale = nn.Parameter(torch.ones([]) * np.log(1 / 0.07)).exp()
        logits = logit_scale * s_features @ t_features.t()
        loss_i = F.cross_entropy(logits, labels)
        loss_t = F.cross_entropy(logits.T, labels)
        ret_loss = (loss_i + loss_t) / 2
        return ret_loss

However, during backpropagation only the parameters of the model are optimized; this logit_scale does not appear to be trained, which differs from the contrastive learning setup in the open-source implementations of G2P2 and CLIP. Is this parameter trained here?
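For comparison, here is a minimal sketch (not code from this repository; the module name `GroundingHead` is purely illustrative) of the CLIP-style alternative, where the temperature is registered on the model so that `model.parameters()` includes it and the optimizer updates it:

```python
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroundingHead(nn.Module):
    """Illustrative module: logit_scale is registered as a parameter,
    so it is returned by model.parameters() and updated by the optimizer."""
    def __init__(self):
        super().__init__()
        self.logit_scale = nn.Parameter(torch.ones([]) * np.log(1 / 0.07))

    def cl_loss(self, s_features, t_features, labels):
        # exp() is applied at use time; the underlying log-scale stays trainable
        logits = self.logit_scale.exp() * s_features @ t_features.t()
        loss_i = F.cross_entropy(logits, labels)
        loss_t = F.cross_entropy(logits.T, labels)
        return (loss_i + loss_t) / 2

# By contrast, a parameter created inside a plain function on every call is
# never handed to the optimizer, so it stays at its initial value.
```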

yuh-yang commented 6 months ago

Hi, this parameter is not trained here. In practice, however, on the G2P2 benchmarks, whether or not this parameter is trained has almost no effect on the final performance. In our implementation, to make multi-GPU training convenient, we simply fixed the scale in this way.
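As a minimal sketch of what this fixed-scale behavior amounts to (an interpretation of the snippet above, not code from the repository), the scale can be written as a plain constant, which makes explicit that no gradient reaches it:

```python
import torch
import torch.nn.functional as F

def cal_cl_loss_fixed(s_features, t_features, labels):
    # Fixed temperature: exp(log(1 / 0.07)) = 1 / 0.07 ≈ 14.29,
    # a constant with no gradient, mirroring the behavior described above.
    logit_scale = 1.0 / 0.07
    logits = logit_scale * s_features @ t_features.t()
    loss_i = F.cross_entropy(logits, labels)
    loss_t = F.cross_entropy(logits.T, labels)
    return (loss_i + loss_t) / 2
```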

yuh-yang commented 6 months ago

Closed due to inactivity.