moment-timeseries-foundation-model / moment

MOMENT: A Family of Open Time-series Foundation Models
https://moment-timeseries-foundation-model.github.io/
MIT License

Classification head not moved to GPU in zero-shot classification task #33

Open sandhyat opened 3 months ago

sandhyat commented 3 months ago

Hello,

Thank you for providing the code for your work. When using a pre-trained MOMENT model for zero-shot or fine-tuned classification, my code errors out with a trace pointing to tensors being on two devices. I confirmed that the inputs are on CUDA. I found the exact lines (66 and 68 in https://github.com/moment-timeseries-foundation-model/moment/blob/main/momentfm/models/moment.py#L54 ) where this happens. If I move this linear layer to the 'cuda' device explicitly, the code works fine (a sketch of that workaround is included at the end of this issue). The following is the code snippet I have been using.

import numpy as np
import torch
from tqdm import tqdm

from momentfm import MOMENTPipeline

model = MOMENTPipeline.from_pretrained(
    "AutonLab/MOMENT-1-large",
    model_kwargs={
        'task_name': 'classification',
        'n_channels': 69,
        'num_class': 2
    },  # We are loading the model in classification mode
).to("cuda").float()
model.init()

def get_logits(model, dataloader):
    logits_list = []
    with torch.no_grad():
        for batch_x, batch_masks, _ in tqdm(dataloader, total=len(dataloader)):
            batch_x = batch_x.to("cuda").float()
            batch_masks = batch_masks.to("cuda")

            output = model(batch_x, input_mask=batch_masks)  # [batch_size x d_model (=1024)]
            logit = output.logits
            logits_list.append(logit.detach().cpu().numpy())
    logits_list = np.concatenate(logits_list)
    return logits_list

output_flow_logit_test = get_logits(model, dataloader_flow_test)
Loading data... > /pre_wkdir/train_modular_Moments.py(569)get_logits()
-> for batch_x, batch_masks, _ in tqdm(dataloader, total=len(dataloader)):
(Pdb) n
  0%|          | 0/19 [00:00<?, ?it/s]
> /pre_wkdir/train_modular_Moments.py(570)get_logits()
-> batch_x = batch_x.to("cuda").float()
(Pdb) n
> /pre_wkdir/train_modular_Moments.py(571)get_logits()
-> batch_masks = batch_masks.to("cuda")
(Pdb) n
> /pre_wkdir/train_modular_Moments.py(573)get_logits()
-> output = model(batch_x, input_mask=batch_masks)  # [batch_size x d_model (=1024)]
(Pdb) batch_x.is_cuda
True
(Pdb) batch_masks.is_cuda
True
(Pdb) n
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)
> /pre_wkdir/train_modular_Moments.py(573)get_logits()
-> output = model(batch_x, input_mask=batch_masks)  # [batch_size x d_model (=1024)]
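
To narrow this down, I also listed which parameters were still on the CPU after the model had been moved. This is just a quick diagnostic sketch; it only relies on MOMENTPipeline being a regular torch.nn.Module and uses the model object defined above.

# Diagnostic sketch: print any parameters left on the CPU after .to("cuda")
for name, param in model.named_parameters():
    if param.device.type == "cpu":
        print(f"still on CPU: {name} -> {param.device}")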

I would appreciate it if you could comment on this based on your experience developing the code.
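
In case it helps, here is a minimal sketch of the workaround I mentioned above. I am assuming the linear classification head is (re)built on the CPU inside model.init() and is exposed as model.head (the attribute name may differ in other versions); either calling .to("cuda") after init(), or moving that layer explicitly, avoids the error for me.

from momentfm import MOMENTPipeline

model = MOMENTPipeline.from_pretrained(
    "AutonLab/MOMENT-1-large",
    model_kwargs={
        'task_name': 'classification',
        'n_channels': 69,
        'num_class': 2
    },
)
model.init()                        # the head appears to be (re)built here, on CPU
model = model.to("cuda").float()    # move everything, including the head, after init()

# Alternative (assumption: the head attribute is named `model.head`):
# model.head.to("cuda")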

Thanks, Sandhya