Program error - Githubissues

RENNY-Jenius commented 10 months ago

When I run attention_attr.py, I get the following error:

    Traceback (most recent call last):
      File "/home/runyu.cai/label/attention_attr.py", line 144, in <module>
        pro = get_proportion(saliency, class_poss, final_poss)
      File "/home/runyu.cai/label/attention_attr.py", line 112, in get_proportion
        assert len(saliency.shape) == 2 or (len(saliency.shape) == 3 and saliency.shape[0] == 1)
    AssertionError

What is causing this?

leanwang326 commented 10 months ago

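Oh, the problem is that the batch_size of analysis_dataloader is not 1. The code requires batch_size to be 1, because the label target positions differ between inputs, so get_proportion cannot easily process several inputs together. In principle the code does set CUDA_VISIBLE_DEVICES, but that setting (set_gpu) runs after transformers is imported. If your transformers installation pulls in CUDA-related packages such as accelerate or deepspeed, set_gpu has no effect and the model picks up every GPU on your machine. As a result, even though per_device_eval_batch_size=1 is set, the total batch_size ends up equal to num_gpu.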
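To make the failure concrete, here is a minimal sketch of the two facts described above: the total batch size scales with the number of visible GPUs, and the assertion from the traceback rejects any 3-D saliency whose first dimension is not 1. The helper names and the 128x128 shape are hypothetical and only illustrate the check; they are not taken from the repository.

```python
def effective_batch_size(per_device_batch_size: int, num_visible_gpus: int) -> int:
    """Total batch size when the batch is split across all visible GPUs."""
    return per_device_batch_size * max(1, num_visible_gpus)


def shape_passes_assertion(shape: tuple) -> bool:
    """Same condition as the assert in get_proportion (copied from the traceback)."""
    return len(shape) == 2 or (len(shape) == 3 and shape[0] == 1)


# Hypothetical machine with 4 visible GPUs and per_device_eval_batch_size=1.
batch = effective_batch_size(per_device_batch_size=1, num_visible_gpus=4)

# Hypothetical saliency of shape (batch, seq_len, seq_len).
multi_gpu_ok = shape_passes_assertion((batch, 128, 128))
single_gpu_ok = shape_passes_assertion((1, 128, 128))
```

With four visible GPUs the leading dimension becomes 4, so the condition is false and the assert fires; with one visible GPU it stays 1 and the check passes.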

A simple fix is to launch the script as CUDA_VISIBLE_DEVICES=0 python attention_attr.py. If you want to use experiment_attn_attr.py to run several programs in one go, add run.explicit_set_gpu = True before run.run().
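The same ordering idea can also be sketched inside Python: set the environment variable before any CUDA-aware library is imported, so those libraries only ever see one device. This is an illustrative sketch, not the repository's set_gpu implementation; pin_single_gpu is a hypothetical helper name.

```python
import os


def pin_single_gpu(device_index: str = "0") -> str:
    """Restrict this process to one GPU; call before importing CUDA-aware libraries."""
    os.environ["CUDA_VISIBLE_DEVICES"] = device_index
    return os.environ["CUDA_VISIBLE_DEVICES"]


# Must run first, at the very top of the script.
visible = pin_single_gpu("0")

# Only after this point import the libraries that may initialize CUDA, e.g.:
# import torch
# import transformers
```

Setting the variable on the command line, as suggested above, achieves the same effect and avoids depending on import order at all.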

RENNY-Jenius commented 10 months ago

Thanks, it runs successfully now. Happy New Year, and I hope we can exchange more ideas in the future.

Sent by email in reply to Wang Lean's comment of December 31, 2023, 12:36.

lancopku / label-words-are-anchors

Program error #12