princeton-nlp / Edge-Pruning

[NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning".
https://arxiv.org/abs/2406.16778
MIT License

RuntimeError: Tensor Size Mismatch During Training #2

Open XiaoXiong-Sherry opened 3 weeks ago

XiaoXiong-Sherry commented 3 weeks ago

I am encountering the following error while executing run_scripts/gt_sweep.sh:

  File "Edge-Pruning/src/modeling/modeling_fpt2.py", line 1293, in forward
    hidden_states, embeds, z_nodes_sum = self.write(inputs_embeds, position_embeds, corr_x=corr_x)
  File "Edge-Pruning/src/modeling/modeling_fpt2.py", line 1184, in write
    tok_embeds = tok_embeds + corr_x[0] * (1 - z_tokens)
RuntimeError: The size of tensor a (32) must match the size of tensor b (64) at non-singleton dimension 0

Thank you for your assistance!

testzer0 commented 2 weeks ago

Hi Xiao Xiong,

Thank you for bringing this to our attention! I ran gt_sweep at a sparsity of 0.95 on my end and did not encounter this error. Could you provide more details, such as which sparsity produced the error, and whether it appeared at the beginning of training or some time later? If it helps, I am using transformers==4.40.0.dev0, so maybe give that a go?

Thanks!
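For anyone replying with the details requested above, here is a minimal sketch for collecting installed package versions using only the Python standard library. The helper names `installed_version` and `environment_report` are hypothetical, and the package list (torch, transformers) is an assumption based on the traceback and the version mentioned in this comment; adjust it to your setup.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str) -> str:
    """Return the installed version of `package`, or 'not installed'."""
    try:
        return version(package)
    except PackageNotFoundError:
        return "not installed"


def environment_report(packages=("torch", "transformers")) -> dict:
    """Collect the Python version plus the version of each listed package.

    The default package list is an assumption, not something the repo prescribes.
    """
    report = {"python": platform.python_version()}
    for name in packages:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    # Print one "name: version" line per entry, ready to paste into the issue.
    for name, ver in environment_report().items():
        print(f"{name}: {ver}")
```

Pasting this output together with the sparsity value and the training step at which the error appeared should cover what was asked for.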

ZekaiZhaostats commented 1 week ago

I have a similar issue. When I run bash ioi_sweep.sh, I get the following error:

  src/modeling/modeling_fpt2.py", line 1192, in write
    tok_embeds = tok_embeds + corr_x[0] * (1 - z_tokens)
RuntimeError: The size of tensor a (16) must match the size of tensor b (128) at non-singleton dimension 0
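Both reports in this thread show the same kind of failure: two operands in the `tok_embeds` expression disagree in size at dimension 0 (32 vs 64 in the first report, 16 vs 128 in the second), and neither size is 1, so they cannot be broadcast together. As a rough illustration of that rule (not a diagnosis of the root cause), the sketch below checks two shapes in plain Python. The function name `find_broadcast_mismatch` is hypothetical, and the sequence length and hidden size are made-up placeholders, since the thread only reports the leading dimension.

```python
from typing import Optional, Sequence, Tuple


def find_broadcast_mismatch(
    shape_a: Sequence[int], shape_b: Sequence[int]
) -> Optional[Tuple[int, int, int]]:
    """Return (dim, size_a, size_b) for a dimension that cannot be broadcast,
    or None if the two shapes are compatible.

    Shapes are aligned on their trailing dimensions; two sizes are compatible
    when they are equal or when either of them is 1.
    """
    ndim = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s so both have the same rank.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    # Scan from the last dimension backwards and report the first conflict.
    for dim in range(ndim - 1, -1, -1):
        if a[dim] != b[dim] and a[dim] != 1 and b[dim] != 1:
            return dim, a[dim], b[dim]
    return None


# Placeholder sizes: only the leading dimensions (32 and 64) come from the thread.
SEQ_LEN, HIDDEN = 10, 768
print(find_broadcast_mismatch((32, SEQ_LEN, HIDDEN), (64, SEQ_LEN, HIDDEN)))
# prints (0, 32, 64)
```

Printing the shapes of `tok_embeds`, `corr_x[0]`, and `z_tokens` just before the failing line should show which operand carries the larger leading dimension.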