Transformer decoder inference only works on cuda:0. When the model is placed on any other GPU, it either raises the error below or causes the kernel to restart. Both the data and the model are placed on the same GPU.
/usr/local/lib/python3.7/dist-packages/turbo_transformers/layers/modeling_decoder.py in __call__(self, input_tensor, return_type, is_trans_weight, output)
285 super(PositionwiseFeedForward, self).__call__(input_tensor, output,
286 is_trans_weight)
--> 287 return convert_returns_as_type(output, return_type)
288
289 @staticmethod
/usr/local/lib/python3.7/dist-packages/turbo_transformers/layers/return_type.py in convert_returns_as_type(tensor, rtype)
39 return tensor
40 elif rtype == ReturnType.TORCH:
---> 41 return dlpack.from_dlpack(tensor.to_dlpack())
42 else:
43 raise NotImplementedError()
RuntimeError: Specified device cuda:1 does not match device of data cuda:0
Here is an example that reproduces the problem. Running it simply causes the kernel to restart without printing any error message.
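A minimal sketch of what the script does is below. I am assuming the `PositionwiseFeedForward.from_onmt` conversion path here and have simplified the shapes, so the exact calls may not match my actual notebook:

```python
# Minimal sketch: convert an OpenNMT PositionwiseFeedForward layer with
# turbo_transformers and run it on cuda:1. The from_onmt conversion call and
# its arguments are my assumption / simplification of the actual script.
import torch
import turbo_transformers
from onmt.modules.position_ffn import PositionwiseFeedForward

device = torch.device("cuda:1")  # works fine when this is "cuda:0"

# Build the ONMT feed-forward layer and place it on cuda:1
onmt_ffn = PositionwiseFeedForward(d_model=512, d_ff=2048, dropout=0.0)
onmt_ffn.to(device)
onmt_ffn.eval()

# Convert to the turbo_transformers layer (assumed conversion API)
turbo_ffn = turbo_transformers.PositionwiseFeedForward.from_onmt(onmt_ffn)

# Input tensor on the same GPU as the model
x = torch.rand(1, 10, 512, dtype=torch.float32, device=device)

# On cuda:0 this returns a torch tensor; on cuda:1 the DLPack conversion in
# convert_returns_as_type raises the RuntimeError above (or the kernel dies
# without any message)
out = turbo_ffn(x)
print(out.shape)
```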
Thanks in advance for your help!