Device incompatibility?

princeton-nlp / CoFiPruning

[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408

MIT License


Device incompatibility? #12

Closed. ctsan closed this issue 2 years ago.

ctsan commented 2 years ago

Hello,

In the following line: https://github.com/princeton-nlp/CoFiPruning/blob/022847ae88f49fa7b8fc58f9c0613492fd1230cc/trainer/trainer.py#L599

The existing_layers tensor is on the CPU, while the result of indexes < last_aligned_layer is on the GPU. This throws an error as a result.

Is this a bug? Maybe existing_layers should first be moved to the GPU?

xiamengzhou commented 2 years ago

Hi,

Yes, it is a bug. I just made a commit to move existing_layers to the GPU. It resulted from a previous bug in which existing_layers was not used. It should not have a big impact on the results. Let me know if you still encounter issues!

ctsan commented 2 years ago

The fix throws an error for me:

RuntimeError: "bitwise_and_cuda" not implemented for 'Float'

Probably because the result of .to(layerwiseloss) is a float tensor, which cannot be used in a bitwise AND (&).

Do you not use CUDA? Using .to("cuda") makes the error go away for me.

xiamengzhou commented 2 years ago

It should be .to(layerwiseloss.device), which is cuda. Let me know if it works!

ctsan commented 2 years ago

Seems to work now! Thank you for the quick responses.
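
For readers hitting the same error, the difference between the two .to(...) calls discussed in this thread can be sketched as follows. This is a minimal illustration, not the repository's code: the tensor shapes and values are made up, and it assumes existing_layers is a boolean mask and layerwiseloss is a float32 tensor. It falls back to the CPU when no GPU is present, where the dtype behaviour is the same.

```python
import torch

# Use the GPU when available; the dtype behaviour below is identical on CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-ins for the tensors named in the thread (values are made up).
layerwiseloss = torch.zeros(4, dtype=torch.float32, device=device)
existing_layers = torch.tensor([True, False, True, True])  # assumed bool mask, created on CPU
indexes = torch.arange(4, device=device)
last_aligned_layer = 2

# Tensor.to(other_tensor) adopts BOTH the dtype and the device of other_tensor,
# so the boolean mask is converted to float32 here.
as_float = existing_layers.to(layerwiseloss)

# Tensor.to(other_tensor.device) only moves the mask; it stays boolean.
moved = existing_layers.to(layerwiseloss.device)

# A boolean mask on the right device can be combined with the comparison result.
mask = (indexes < last_aligned_layer) & moved

print(as_float.dtype, moved.dtype, mask.tolist())
```

Combining `indexes < last_aligned_layer` with `as_float` instead of `moved` is what produces the "not implemented for 'Float'" error, because `&` is not defined for floating-point tensors.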