lucidrains / mixture-of-experts

A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models

RuntimeError: expected backend CPU and dtype Float but got backend CPU and dtype Long #2

Closed: littlepan0413 closed this issue 3 years ago

littlepan0413 commented 3 years ago

Code:

import torch
from mixture_of_experts import HeirarchicalMoE

moe = HeirarchicalMoE(
    dim = 512,
    num_experts = (4, 4)   # 4 gates on the first layer, then 4 experts on the second, equaling 16 experts
)

inputs = torch.randn(4, 1024, 512)
out, aux_loss = moe(inputs) # (4, 1024, 512), (1,)

Output:

Traceback (most recent call last):
  File "/home/bi/panlu/ComplexQG-MOE/test/test3.py", line 20, in <module>
    out, aux_loss = moe(inputs) # (4, 1024, 512), (1,)
  File "/home/bi/software/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/bi/software/anaconda/lib/python3.6/site-packages/mixture_of_experts/mixture_of_experts.py", line 254, in forward
    dispatch_tensor, combine_tensor, loss = self.gate(inputs)
  File "/home/bi/software/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/bi/software/anaconda/lib/python3.6/site-packages/mixture_of_experts/mixture_of_experts.py", line 217, in forward
RuntimeError: expected backend CPU and dtype Float but got backend CPU and dtype Long
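For context, this class of error in PyTorch usually means an integer (Long) tensor was combined with a float tensor somewhere in the gating path, for example a one-hot gate mask that was never cast to float; newer PyTorch versions promote the dtypes automatically, while older ones raise. Below is a minimal, hypothetical sketch of the same mismatch and the usual cast fix; the tensor names are illustrative and not the library's internals.

```python
import torch
import torch.nn.functional as F

# A one-hot gate mask comes back as a LongTensor (int64)
gate_index = torch.tensor([0, 2, 1])
mask = F.one_hot(gate_index, num_classes = 4)   # dtype: torch.int64

scores = torch.randn(3, 4)                      # dtype: torch.float32

# On older PyTorch (before automatic dtype promotion) mixing the two raises:
#   RuntimeError: expected backend CPU and dtype Float but got backend CPU and dtype Long
# out = scores * mask

# Casting the integer mask to float sidesteps the mismatch
out = scores * mask.float()
print(out.dtype)  # torch.float32
```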

lucidrains commented 3 years ago

@littlepan0413 it works for me, what version of pytorch are you using?
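A quick way to gather the information being asked for; whether upgrading actually resolves the error is an assumption, since the thread does not record the affected version:

```python
import torch
print(torch.__version__)   # report this in the issue; very old releases lack automatic dtype promotion
```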