Closed rgund26 closed 9 months ago
The error suggests a mismatch between the architecture stored in your checkpoint and the model you are trying to load it into, e.g. you trained the M variant but are trying to load the checkpoint into the S or L version.
Please double-check which model architecture you actually trained and make sure it matches the model you instantiate with models.get.
Even your traceback has a discrepancy.
At the start of the post you provide this snippet:
model = models.get('yolo_nas_m', num_classes= 1, checkpoint_path="ckpt_best.pth")
and further down, in the traceback, it becomes:
model = models.get('yolo_nas_l', num_classes= 1, checkpoint_path="ckpt_best.pth")
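If you no longer remember which variant you trained, one option is to try each variant name in turn and keep the first one the checkpoint loads into. The helper below is only a sketch: `find_matching_variant` and `YOLO_NAS_VARIANTS` are hypothetical names, not part of super_gradients, and it assumes a mismatch surfaces as the ValueError shown in the traceback (RuntimeError is caught as well, as a guess for other versions).

```python
from typing import Any, Callable, Iterable, Optional, Tuple

# Variant names mentioned in this thread; extend if you trained something else.
YOLO_NAS_VARIANTS = ("yolo_nas_s", "yolo_nas_m", "yolo_nas_l")


def find_matching_variant(
    load_fn: Callable[[str], Any],
    candidates: Iterable[str] = YOLO_NAS_VARIANTS,
) -> Tuple[Optional[str], Any]:
    """Return (name, model) for the first variant that loads, else (None, None)."""
    for name in candidates:
        try:
            model = load_fn(name)
        except (ValueError, RuntimeError) as err:
            # The traceback in this thread raises ValueError on a shape mismatch.
            print(f"{name}: checkpoint does not fit ({str(err)[:80]})")
            continue
        return name, model
    return None, None


# Intended usage (needs super_gradients and your own checkpoint file):
# from super_gradients.training import models
# name, model = find_matching_variant(
#     lambda n: models.get(n, num_classes=1, checkpoint_path="ckpt_best.pth")
# )
```

The loader is passed in as a callable so the same loop works with whatever arguments your training run used.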
@rgund26 I'm closing this issue due to inactivity. If the proposed solution did not solve your issue feel free to reopen it.
Same here. I fine-tuned a yolo_nas_l model and now I'm not able to load best_ckpt.pth. This was during fine-tuning (screenshot).
Now I'm loading the weights and I'm getting this error (screenshots).
Can you help me?
Hello Bhautik,
This mostly happens when the model you trained does not match the one in your real-time Python script.
Sorry, I'm not able to recall more now; it's been a while.
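Try loading the model with the "checkpoint_num_classes" argument, set to the number of classes the checkpoint was trained with:
model = models.get('yolo_nas_m', num_classes=1, checkpoint_num_classes=OLD_NUM_CLASSES, checkpoint_path="ckpt_best.pth")  # OLD_NUM_CLASSES: placeholder for the original class count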
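To see whether the mismatch sits in the backbone (which points at the wrong variant) or only in the prediction heads (which is more likely a class-count issue), it can help to list every layer whose shape differs. This is a rough sketch: `shape_mismatches` is a hypothetical helper, it assumes the layer names in the checkpoint and the model line up, and the commented usage assumes the weights are stored under a "net" key in the checkpoint file.

```python
from typing import List, Mapping, Tuple


def shape_mismatches(
    ckpt_state: Mapping, model_state: Mapping
) -> List[Tuple[str, tuple, tuple]]:
    """List (key, ckpt_shape, model_shape) for shared keys whose shapes differ."""
    out = []
    for key, ckpt_val in ckpt_state.items():
        if key not in model_state:
            continue  # keys missing on one side are ignored in this sketch
        ckpt_shape = tuple(ckpt_val.shape)
        model_shape = tuple(model_state[key].shape)
        if ckpt_shape != model_shape:
            out.append((key, ckpt_shape, model_shape))
    return out


# Intended usage (needs torch, super_gradients and your checkpoint file):
# import torch
# from super_gradients.training import models
# ckpt = torch.load("ckpt_best.pth", map_location="cpu")
# ckpt_state = ckpt.get("net", ckpt)  # assumption: weights live under "net"
# model = models.get("yolo_nas_m", num_classes=1)  # no checkpoint, random weights
# for row in shape_mismatches(ckpt_state, model.state_dict()):
#     print(row)
```

If entries such as backbone.stage1.blocks.conv1.conv.weight show up, as in the error reported in this issue, changing the class count alone is unlikely to help.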
💡 Your Question
I have trained a custom YOLO-NAS model. Now I want to run that model in real time, but I am getting the error below. Please help me. I can't retrain the whole model; I am in the middle of my MS submission.
raise ValueError(f"ckpt layer {ckpt_key} with shape {ckpt_val.shape} does not match {model_key}" f" with shape {model_val.shape} in the model")
ValueError: ckpt layer backbone.stage1.blocks.conv1.conv.weight with shape torch.Size([64, 96, 1, 1]) does not match backbone.stage1.blocks.conv1.conv.weight with shape torch.Size([32, 96, 1, 1]) in the model
My python script
import cv2
import torch.cuda
from super_gradients.training import models
from super_gradients.common.object_names import Models
import torch
CLASSES = ['rice']
model = models.get('yolo_nas_m', num_classes= 1, checkpoint_path="ckpt_best.pth")
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
print(device)  # report which device the model runs on
model.predict_webcam()
Versions
S:\Python\Python310\python.exe S:\CV_programming\YOLO-NAS\webcam.py
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "S:\CV_programming\YOLO-NAS\webcam.py", line 11, in
model = models.get('yolo_nas_l', num_classes= 1, checkpoint_path="ckpt_best.pth")
File "S:\Python\Python310\lib\site-packages\super_gradients\training\models\modelfactory.py", line 208, in get
= load_checkpoint_to_model(
File "S:\Python\Python310\lib\site-packages\super_gradients\training\utils\checkpoint_utils.py", line 229, in load_checkpoint_to_model
adaptive_load_state_dict(net, checkpoint, strict)
File "S:\Python\Python310\lib\site-packages\super_gradients\training\utils\checkpoint_utils.py", line 61, in adaptive_load_state_dict
adapted_state_dict = adapt_state_dict_to_fit_model_layer_names(net.state_dict(), state_dict, solver=solver)
File "S:\Python\Python310\lib\site-packages\super_gradients\training\utils\checkpoint_utils.py", line 159, in adapt_state_dict_to_fit_model_layer_names
raise ValueError(f"ckpt layer {ckpt_key} with shape {ckpt_val.shape} does not match {model_key}" f" with shape {model_val.shape} in the model")
ValueError: ckpt layer backbone.stage1.blocks.conv1.conv.weight with shape torch.Size([64, 96, 1, 1]) does not match backbone.stage1.blocks.conv1.conv.weight with shape torch.Size([96, 96, 1, 1]) in the model
Process finished with exit code 1