Open linyuan13 opened 4 months ago
Hi, could you edit the bug report according to the template? It's quite hard to understand what the error is from the above alone.
Thank you very much for your reply. I have solved that problem, but a new one has appeared: training stops early during fine-tuning, and the validation MSE and the other metrics are poor. The early-stopping log is below.

```
Loading weights from local directory
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
[2024-08-02 08:25:59,964][datasets][INFO] - PyTorch version 2.3.1 available.
[2024-08-02 08:25:59,964][datasets][INFO] - JAX version 0.4.30 available.
Seed set to 1
[rank: 0] Seed set to 1
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/2
Loading weights from local directory
[2024-08-02 08:26:04,015][datasets][INFO] - PyTorch version 2.3.1 available.
[2024-08-02 08:26:04,015][datasets][INFO] - JAX version 0.4.30 available.
[rank: 1] Seed set to 1
[rank: 1] Seed set to 1
Initializing distributed: GLOBAL_RANK: 1, MEMBER: 2/2
------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 2 processes
------------------------------------------------------------------------------------------------
/anaconda3/envs/uni/lib/python3.11/site-packages/lightning/pytorch/callbacks/model_checkpoint.py:652: Checkpoint directory uni2ts-main/outputs/finetune/moirai_1.0_R_small/etth1/finetune1/checkpoints exists and is not empty.
LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [0,1]
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]

  | Name   | Type         | Params | Mode
--------------------------------------------------
0 | module | MoiraiModule | 13.8 M | train
--------------------------------------------------
13.8 M    Trainable params
0         Non-trainable params
13.8 M    Total params
55.310    Total estimated model params size (MB)
Epoch 0: | val/PackedMSELoss=11.40, val/Pack
[rank: 0] Metric val/PackedNLLLoss improved. New best score: 2.069
[rank: 1] Metric val/PackedNLLLoss improved. New best score: 2.158
Epoch 3: | val/PackedNLLLoss=3.900, val/PackedMSELoss=11.90, val/Pack
[rank: 0] Monitored metric val/PackedNLLLoss did not improve in the last 3 records. Best score: 2.069. Signaling Trainer to stop.
[rank: 1] Monitored metric val/PackedNLLLoss did not improve in the last 3 records. Best score: 2.158. Signaling Trainer to stop.
Epoch 3: |
.py:254: UserWarning: resource_tracker: There appear to be 22 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
anaconda3/envs/uni/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 22 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
```

The parameters are consistent with what you provided, and the GPU is an A100.
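For context, the stop in the log matches PyTorch Lightning's `EarlyStopping` callback monitoring `val/PackedNLLLoss` with a patience of 3 (inferred from "did not improve in the last 3 records"). A minimal sketch of how one might relax it is below; in this repo the actual settings live in the Hydra finetune config, so the values here are illustrative, not the project's defaults:

```python
# Sketch only: relaxing early stopping when fine-tuning stops too soon.
# The monitor name comes from the log; patience=3 appears to be the
# current setting, so we raise it (assumed value) to give validation
# loss more records to improve.
from lightning.pytorch import Trainer
from lightning.pytorch.callbacks import EarlyStopping

early_stop = EarlyStopping(
    monitor="val/PackedNLLLoss",  # metric shown in the log
    mode="min",                   # lower NLL is better
    patience=10,                  # assumed: raised from 3
)
trainer = Trainer(callbacks=[early_stop], max_epochs=100)
```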
**Describe the bug**

```
Error executing job with overrides: ['run_name=first_run', 'model=moirai_1.0_R_small', 'data=etth1', 'val_data=etth1']
Error in call to target 'huggingface_hub.hub_mixin.ModelHubMixin.from_pretrained':
TypeError("MoiraiModule.__init__() missing 7 required positional arguments: 'distr_output', 'd_model', 'num_layers', 'patch_sizes', 'max_seq_len', 'attn_dropout_p', and 'dropout_p'")
full_key: model.module
```
I followed the process exactly, but an error occurred at the last step when I ran the fine-tuning command.
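For reference, a minimal sketch of how the pretrained module is normally loaded in this repo; the import path and the `Salesforce/moirai-1.0-R-small` repo ID are assumptions based on the model name in the overrides. The TypeError above typically means `from_pretrained` had no saved config to restore the constructor arguments from, so Hydra effectively called `MoiraiModule` with no arguments:

```python
# Sketch, not a confirmed fix: MoiraiModule uses huggingface_hub's
# model mixin, so from_pretrained restores the constructor arguments
# (distr_output, d_model, num_layers, ...) from the checkpoint's
# config.json rather than taking them by hand.
from uni2ts.model.moirai import MoiraiModule

# Load from the Hub (repo ID assumed from 'model=moirai_1.0_R_small'):
module = MoiraiModule.from_pretrained("Salesforce/moirai-1.0-R-small")

# Or load from a local checkpoint directory; if config.json is missing
# or empty there, from_pretrained has nothing to pass to __init__,
# which reproduces the "missing 7 required positional arguments" error.
module = MoiraiModule.from_pretrained("path/to/checkpoint_dir")
```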
I am running into the same bug now. Could you please share how you solved it? Thanks a lot!