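Hello, do you provide weights for the QV networks? I deployed webshop and Archer on our lab server and validated with the provided checkpoint, but over the first 20 rounds the agent still fails to produce reasonable output.
When computing actor_loss, the program cannot produce a valid log_prob. The failure is at line 106 of archer_agent.py: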
outputs = self.model(input_ids=input_ids, attention_mask=attention_mask)
Error:
RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.cuda.FloatTensor instead (while checking arguments for embedding)
……
Unfortunately, we do not have saved QV networks. Do you mean that you were not able to load the SFT policy successfully, and that the SFT policy does not output reasonable actions once it is loaded?
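For anyone hitting the same RuntimeError: the message itself says that the embedding layer received floating-point `input_ids` instead of integer ones. The snippet below is only a minimal sketch of that failure mode and one possible workaround, not the repository's actual code. The `embedding` layer and the `ensure_long_ids` helper are hypothetical stand-ins, and it runs on CPU for simplicity. One way ids can end up as floats is an empty batch, since `torch.tensor([])` defaults to float32, so it may be worth checking whether the tokenized input is empty before casting.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the embedding layer inside the policy model.
embedding = nn.Embedding(num_embeddings=100, embedding_dim=8)


def ensure_long_ids(input_ids: torch.Tensor) -> torch.Tensor:
    """Cast token ids to int64 if they arrive as a floating-point tensor."""
    if input_ids.is_floating_point():
        return input_ids.long()
    return input_ids


# Token ids that were accidentally created as floats.
float_ids = torch.tensor([[1.0, 5.0, 7.0]])

# Passing float ids to an embedding lookup raises a RuntimeError.
try:
    embedding(float_ids)
    raised = False
except RuntimeError:
    raised = True

# After casting, the lookup succeeds and yields one vector per token.
out = embedding(ensure_long_ids(float_ids))
print(raised, out.shape)

# An empty tensor built from a Python list is float32 by default.
print(torch.tensor([]).dtype)
```

If the cast makes the error go away but the outputs still look wrong, the float ids are probably a symptom of an upstream problem (for example empty or malformed observations), so printing `input_ids.dtype` and `input_ids.shape` just before the failing call is a reasonable first check.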