moment-timeseries-foundation-model / moment

MOMENT: A Family of Open Time-series Foundation Models
https://moment-timeseries-foundation-model.github.io/
MIT License

Fine-tuning #14

Closed nooshinmaghsoodi closed 5 days ago

nooshinmaghsoodi commented 2 months ago

Hi, how can I fine-tune the pre-trained model on my dataset?

mononitogoswami commented 2 months ago

Hi Nooshin, Thanks for your interest in MOMENT! Could you please tell us a bit more about the tasks you want to use MOMENT for?

nooshinmaghsoodi commented 2 months ago

Hi Mononita,

Thank you for your reply. I have an ECG dataset and I want to do AFib classification using MOMENT. So far, I have used the pre-trained model to extract embeddings, but the results were not good. I think a fine-tuning step could improve the performance.

Best, Nooshin



mononitogoswami commented 2 months ago

Hi Nooshin,

Some possible reasons for the poor performance:

  1. The data may be out of distribution for the model, in which case fine-tuning might help.
  2. Modeling multivariate data -- ECG data is multivariate, so performance can depend on how the embeddings are generated. There are two ways to generate embeddings for multivariate data: (1) generate a 1024-dimensional embedding by averaging the embeddings across all patches and channels, or (2) generate a 1024*C-dimensional embedding by averaging across patches only, keeping the C channels separate. If you have been doing (1), (2) might improve downstream classification performance.
  3. Classification head -- in my experience, a statistical ML model (e.g., an SVM) trained on the embeddings is much more stable than the linear classification head. A sketch combining (2) and (3) follows this list.
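
For example, here is a minimal sketch combining (2) and (3): embed one channel at a time (so channel-averaging is a no-op), concatenate into a 1024*C vector, and fit an SVM on top. The x_enc keyword, model.init(), and the .embeddings field follow the momentfm embedding examples; the ECG tensors and labels below are stand-ins for your data.

import torch
import numpy as np
from sklearn.svm import SVC
from momentfm import MOMENTPipeline

model = MOMENTPipeline.from_pretrained(
    "AutonLab/MOMENT-1-large",
    model_kwargs={'task_name': 'embedding'},
)
model.init()
model.eval()

x = torch.randn(32, 12, 512)          # stand-in ECG windows: [batch, channels, seq_len]
y = np.random.randint(0, 2, size=32)  # stand-in AFib labels

with torch.no_grad():
    # Embed each channel independently and concatenate, yielding a
    # 1024*C-dimensional embedding per example (option 2).
    per_channel = [model(x_enc=x[:, c:c + 1, :]).embeddings for c in range(x.shape[1])]
    z = torch.cat(per_channel, dim=-1).cpu().numpy()  # [batch, 1024*C]

# Option (3): a statistical classifier on the frozen embeddings.
clf = SVC().fit(z, y)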

Recommendations for Fine-tuning

I would recommend fine-tuning the model using masked reconstruction.


from momentfm import MOMENTPipeline

model = MOMENTPipeline.from_pretrained(
  "AutonLab/MOMENT-1-large", 
  model_kwargs={
      'task_name': 'reconstruction',
      'forecast_horizon': 192,
      'head_dropout': 0.1,
      'weight_decay': 0,
      'freeze_encoder': False, # Freeze the transformer encoder. True by default.
      'freeze_embedder': False, # Freeze the patch embedding layer. True by default.
      'freeze_head': False, # False by default
  },
)

By default the encoder and the patch embedding layer are frozen. I would recommend not freezing them for your experiment.

I recommend checking the forecasting script for more details on how to fine-tune the model. Fine-tuning MOMENT is just like fine-tuning any other deep learning model!
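
Roughly, a masked-reconstruction fine-tuning loop could look like the sketch below. The Masking utility, the mask keyword, and the .reconstruction output field follow the momentfm tutorials; train_dataloader is a stand-in for your ECG dataloader.

import torch
from momentfm.utils.masking import Masking

model.init()
model.train()
mask_generator = Masking(mask_ratio=0.3)  # mask 30% of the patches
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = torch.nn.MSELoss()

for batch_x in train_dataloader:  # batch_x: [batch, channels, 512]
    mask = mask_generator.generate_mask(x=batch_x).to(batch_x.device)
    output = model(x_enc=batch_x, mask=mask)
    loss = criterion(output.reconstruction, batch_x)  # reconstruct the full series
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()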

I would also suggest looking at parameter-efficient fine-tuning techniques such as LoRA. LoRA might make the fine-tuning process more efficient and may also improve predictive accuracy.

Questions

  1. Could you tell us how bad the performance was in comparison to a reasonable baseline?

Hope this helps! Let us know if you have any more questions!

CREATEGroup2 commented 2 months ago

Thank you for your explanation! Actually, I am looking for the research code of MOMENT so that I can modify the fine-tuning process by freezing different layers and adding some task-relevant adapter layers.

I noticed that even when we load the model in classification mode, it initially loads in reconstruction mode. My experiment was on an imbalanced binary ECG classification task, and despite the common practice in foundation-model papers of reporting accuracy, that metric is not suitable for my case. While I achieved an accuracy of 0.83 using an SVM on MOMENT's embeddings, the F1-score was only 0.21, which is very low compared to a CNN-based approach that achieved an F1-score of 0.89.
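
For reference, a minimal sketch of imbalance-aware training and evaluation on the embeddings, assuming embeddings z and labels y as in the earlier sketch (the split and seed are stand-ins):

from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import f1_score, classification_report

# Stratify so the rare AFib class appears in both splits.
z_train, z_test, y_train, y_test = train_test_split(
    z, y, test_size=0.2, stratify=y, random_state=0
)

# class_weight='balanced' reweights the minority class so the SVM does not
# simply optimize accuracy on a skewed label distribution.
clf = SVC(class_weight="balanced").fit(z_train, y_train)
pred = clf.predict(z_test)
print("F1:", f1_score(y_test, pred))
print(classification_report(y_test, pred))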

shaozheliu commented 3 weeks ago

from peft import LoraConfig, get_peft_model
from momentfm import MOMENTPipeline

model = MOMENTPipeline.from_pretrained(
    "/hy-tmp/better464/MOMENT-1-large",
    model_kwargs={
        'task_name': 'forecasting',
        'forecast_horizon': 192,
        'head_dropout': 0.1,
        'weight_decay': 0,
        'freeze_encoder': False, # Freeze the transformer encoder
        'freeze_embedder': False, # Freeze the patch embedding layer
        'freeze_head': False, # The linear forecasting head must be trained, otherwise the outputs will be inconsistent
    },
)

lora_config = LoraConfig(
    r=64,
    lora_alpha=32,
    target_modules=["q", "v"],  # attention query/value projections
    lora_dropout=0.05,
)
model_new = get_peft_model(model, lora_config)
print('LoRA enabled')
model_new.print_trainable_parameters()

File /usr/local/miniconda3/envs/moment/lib/python3.11/site-packages/accelerate/accelerator.py:34
     32 import torch
     33 import torch.utils.hooks as hooks
---> 34 from huggingface_hub import split_torch_state_dict_into_shards
     36 from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
     37 from .data_loader import DataLoaderDispatcher, prepare_data_loader, skip_first_batches

ImportError: cannot import name 'split_torch_state_dict_into_shards' from 'huggingface_hub' (/usr/local/miniconda3/envs/moment/lib/python3.11/site-packages/huggingface_hub/__init__.py)

Questions

How can I fine-tune the pre-trained model for forecasting using LoRA? I modified the code in the demo, but it doesn't seem to work, thanks. (All Python packages are at the versions specified in the requirements.)

mononitogoswami commented 5 days ago

Hi all, Thanks for your continued interest in MOMENT. We just released some example notebooks to fine-tune pre-trained MOMENT, as well as MOMENT research code (https://github.com/moment-timeseries-foundation-model/moment-research). We hope these resources help your research.

Please let us know if you have any questions or concerns, and please don't hesitate to open another issue!

Best, Mononito