Jamie-Stirling / RetNet

An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"
MIT License

/src/retnet.py GPU #27

Closed Qiu30 closed 9 months ago

Qiu30 commented 9 months ago

```
  2511         if has_torch_function_variadic(input, weight, bias):
  2512             return handle_torch_function(
  2513                 layer_norm, (input, weight, bias), input, normalized_shape, weight=weight, bias=bias, eps=eps
  2514             )
-> 2515         return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument weight in method wrapper_CUDA__native_layer_norm)
```
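For anyone hitting the same error: this RuntimeError usually means the model's parameters (here, the LayerNorm weights) are still on the CPU while the input tensor has been moved to cuda:0, or vice versa. Below is a minimal sketch of the usual fix, assuming a `RetNet` class importable from `src/retnet.py`; the import path and constructor arguments are illustrative assumptions, not the repository's exact signature.

```python
import torch
from src.retnet import RetNet  # assumed import path; adjust to your layout

# Pick a single device and keep both the model and its inputs on it.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical hyperparameters, for illustration only.
model = RetNet(layers=4, hidden_dim=256, ffn_size=512, heads=4).to(device)

# The input must live on the same device as the model's parameters,
# otherwise torch.layer_norm raises the cpu/cuda:0 mismatch shown above.
x = torch.randn(2, 16, 256, device=device)  # (batch, sequence length, hidden_dim)

y = model(x)
```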

Jamie-Stirling commented 9 months ago

Hi, thanks for raising this issue.

Could you please provide more details on how you encountered this error?

Qiu30 commented 9 months ago

Thank you for your reply. I was able to resolve the issue after re-running the code. Thank you very much.
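Since the thread was closed without a root cause, a quick diagnostic if the error reappears is to compare the devices directly. The `model` and `x` below refer to the hypothetical objects from the sketch above.

```python
# Both values should match; a cpu vs cuda:0 mismatch reproduces the error.
print(next(model.parameters()).device)  # e.g. cpu
print(x.device)                         # e.g. cuda:0
assert next(model.parameters()).device == x.device
```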