Maybe the QV network is not very stable?

YifeiZhou02 / ArCHer

Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"


Hello, are the weights for the QV network provided? I deployed WebShop and ArCHer on our lab server and ran evaluation with the provided checkpoint, but for the first 20 rounds the agent still fails to produce reasonable output. When computing actor_loss, the program cannot produce a reasonable log_prob. At line 106 of archer_agent.py, outputs = self.model(input_ids=input_ids, attention_mask=attention_mask) fails with: RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.cuda.FloatTensor instead (while checking arguments for embedding) ……
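For context, this RuntimeError generally means the tensor passed as input_ids has a float dtype when it reaches the embedding layer. The snippet below is a minimal sketch that reproduces the message outside of ArCHer and shows one possible workaround (casting the ids to torch.long). The embedding layer, the `embed` helper, and the sample ids are hypothetical stand-ins, not code from the repository, and why input_ids becomes a float tensor in this particular run is not established here.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the language model's token embedding layer.
embedding = nn.Embedding(num_embeddings=100, embedding_dim=8)

# Token ids that ended up with a float dtype (assumed here for illustration).
float_ids = torch.tensor([[1.0, 5.0, 7.0]])


def embed(ids: torch.Tensor) -> torch.Tensor:
    """Look up embeddings, casting the ids to torch.long when needed."""
    if ids.dtype not in (torch.long, torch.int):
        ids = ids.long()
    return embedding(ids)


# Passing float ids directly triggers the same kind of RuntimeError.
try:
    embedding(float_ids)
    raised = False
except RuntimeError:
    raised = True

# After the cast, the lookup succeeds.
out = embed(float_ids)
```

If the cast makes the error go away, it may still be worth checking where input_ids is created, since a float dtype there could point to an unexpected input further upstream.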


Maybe the QV network is not very stable? #7