Thank you for providing the code for your work. When I use a pre-trained MOMENT model for zero-shot or fine-tuned classification, my code fails with a traceback indicating that tensors are on two different devices, even though I confirmed the inputs are on cuda. I traced the error to the linear layer created at lines 66 and 68 of https://github.com/moment-timeseries-foundation-model/moment/blob/main/momentfm/models/moment.py#L54 . If I move this linear layer to the 'cuda' device explicitly, the code runs fine.
Following is a code snippet that I have been using.
import numpy as np
import torch
from tqdm import tqdm
from momentfm import MOMENTPipeline

model = MOMENTPipeline.from_pretrained(
    "AutonLab/MOMENT-1-large",
    model_kwargs={
        'task_name': 'classification',
        'n_channels': 69,
        'num_class': 2
    },  # We are loading the model in classification mode
).to("cuda").float()
model.init()

def get_logits(model, dataloader):
    logits_list = []
    with torch.no_grad():
        for batch_x, batch_masks, _ in tqdm(dataloader, total=len(dataloader)):
            batch_x = batch_x.to("cuda").float()
            batch_masks = batch_masks.to("cuda")
            output = model(batch_x, input_mask=batch_masks)  # [batch_size x d_model (=1024)]
            logit = output.logits
            logits_list.append(logit.detach().cpu().numpy())
    logits_list = np.concatenate(logits_list)
    return logits_list
output_flow_logit_test = get_logits(model, dataloader_flow_test)
Here is a pdb trace of the failure:

Loading data...
> /pre_wkdir/train_modular_Moments.py(569)get_logits()
-> for batch_x, batch_masks, _ in tqdm(dataloader, total=len(dataloader)):
(Pdb) n
0%| | 0/19 [00:00<?, ?it/s]
> /pre_wkdir/train_modular_Moments.py(570)get_logits()
-> batch_x = batch_x.to("cuda").float()
(Pdb) n
> /pre_wkdir/train_modular_Moments.py(571)get_logits()
-> batch_masks = batch_masks.to("cuda")
(Pdb) n
> /pre_wkdir/train_modular_Moments.py(573)get_logits()
-> output = model(batch_x, input_mask=batch_masks) # [batch_size x d_model (=1024)]
(Pdb) batch_x.is_cuda
True
(Pdb) batch_masks.is_cuda
True
(Pdb) n
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)
> /pre_wkdir/train_modular_Moments.py(573)get_logits()
-> output = model(batch_x, input_mask=batch_masks) # [batch_size x d_model (=1024)]
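For reference, the failure mode can be reproduced outside MOMENT with a few lines of plain PyTorch; the shapes here are hypothetical, and nn.Linear(1024, 2) merely stands in for the classification head created in moment.py. The explicit move described above makes both tensors share a device:

```python
import torch
import torch.nn as nn

# Minimal, self-contained reproduction (hypothetical shapes, not MOMENT code):
# a linear classification head left on the CPU while the input is on the GPU
# raises the same "Expected all tensors to be on the same device" RuntimeError.
head = nn.Linear(1024, 2)   # nn.Linear parameters are created on the CPU by default
x = torch.randn(4, 1024)

if torch.cuda.is_available():
    x = x.to("cuda")
    try:
        head(x)             # weight on cpu, input on cuda:0 -> RuntimeError
    except RuntimeError as err:
        print(err)
    head = head.to("cuda")  # the explicit move that makes the code work

logits = head(x)            # both tensors now share a device
print(logits.shape)
```

One possible explanation, which the maintainers would be better placed to confirm, is that model.init() constructs a fresh head on the CPU after the pipeline has already been moved via .to("cuda"); if so, calling .to("cuda") after model.init(), or moving the head explicitly as above, would place all parameters on one device.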
I would appreciate any insight you could offer based on your experience developing this code.
Thanks, Sandhya