Closed yxxxqqq closed 3 years ago
Hello @yxxxqqq, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Google Colab Notebook, Docker Image, and GCP Quickstart Guide for example environments.
If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.
If this is a custom model or data training question, please note that Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients.
For more information please visit https://www.ultralytics.com.
@yxxxqqq thanks for your feedback. Yes, you are correct: the current method saves and loads the entire model. In the past we used an alternative method (see https://github.com/ultralytics/yolov3): creating a model from a cfg file, and then replacing the random weights with the checkpoint's weights using a state_dict().
This method caused two problems. The first is that initialization is slower, as a model is created with random weights, and then those random weights are replaced with the checkpoint weights, creating duplication of effort. The second, and main problem, was that a user was required to supply two items to load a model for inference or testing (the weights and cfg), instead of a single item. This places extra requirements on the user, and introduces a failure point during usage, as the user would often incorrectly match weights with incompatible cfg (i.e. yolov3-spp.pt with yolov3.cfg), leading to errors and confusion, and them raising issues and bug reports, using our time.
So we view the current method as the lesser of two evils. The main downside we see is the SourceChangeWarnings that are generated when the modules the model is built on have been updated since it was created.
@glenn-jocher Thanks for your reply! I have solved the 'SourceChangeWarnings' by the code you provided.
model = torch.load(weights, map_location=device)['model']
torch.save(torch.load(weights, map_location=device), weights) # update model if SourceChangeWarning
But the problem I described still exists:
1. use original weights, torch.load()
pred before nms: tensor([[[5.38951e+00, 6.87055e+00, 1.14993e+01, ..., 1.90228e-03, 1.01164e-03, 2.54049e-03],
[7.83045e+00, 6.57221e+00, 1.45590e+01, ..., 1.57367e-03, 8.64962e-04, 2.01560e-03],
[2.25311e+01, 5.58812e+00, 1.23454e+01, ..., 1.72529e-03, 9.21386e-04, 2.28453e-03],
...,
[4.31154e+02, 6.14794e+02, 1.36958e+02, ..., 1.80755e-03, 1.52067e-03, 1.51791e-03],
[4.56398e+02, 6.17055e+02, 1.22339e+02, ..., 2.12122e-03, 1.61005e-03, 1.63509e-03],
[4.91976e+02, 6.23088e+02, 1.45217e+02, ..., 3.99010e-03, 1.72312e-03, 2.11344e-03]]], device='cuda:0')
pred after nms: [tensor([[ 44.06211, 235.47171, 162.47781, 537.28436, 0.91711, 0.00000], [146.72403, 240.72610, 219.93156, 511.04062, 0.90797, 0.00000], [412.23538, 237.46272, 497.78629, 522.23077, 0.89330, 0.00000], [ 22.67275, 135.73569, 490.28171, 438.86267, 0.74369, 5.00000], [ 16.38007, 324.36755, 63.95830, 529.78113, 0.54598, 0.00000]], device='cuda:0')]
pred after nms: [None]
@yxxxqqq the behavior you describe is the default behavior of all pytorch models.
For self contained models that do not require any external dependencies or imports you would need to export to onnx or torchscript formats. An alternative solution is to integrate this repo with torch hub https://pytorch.org/hub/.
@glenn-jocher Thank you very much !
@yxxxqqq we recently added support for PyTorch Hub. You may be able to use YOLOv5 in your own repository like this:
import torch
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
@glenn-jocher wow, so great! thanks for your excellent work!
@yxxxqqq you're welcome!
I use your torch.hub.load solution in order to have a self-contained detector module, and it works very well, thanks! However, it is very verbose. Even setting verbose=True in hub.load still outlines all the library. Is there another less-verbose approach?
@elinor-lev no
Original issue seems resolved, so I am closing this issue now.
@yxxxqqq Hello, will you please explain in detail what you did to resolve the problem? I have run into the same exact nms problem and i cant seem to resolve it even with the hub.load! Thank you
@elinor-lev if you'd like to add verbose functionality to the hub loading, I don't have time to do this personally, but we are open to PRs!
I did it the dirty way: copy the models and utils directories into the target directory. This works.
@yxxxqqq Have you solved it?
I have the same problem with NMS
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
After NMS: pred: [None]
@yxxxqqq if pred[i] = None for image i, you have no detections above threshold in that image.
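A minimal sketch of guarding against this in calling code, assuming (as in the outputs earlier in this thread) that the NMS result is a list with one entry per image and that an entry is None when nothing passes the threshold. `count_detections` is a hypothetical helper, not part of the repo, and the data below are plain lists standing in for tensors.

```python
def count_detections(pred):
    """Return the number of detections per image, treating None as zero."""
    return [0 if det is None else len(det) for det in pred]


# Stand-in data: one row per detection, laid out as
# (x1, y1, x2, y2, confidence, class).
pred = [
    [[44.1, 235.5, 162.5, 537.3, 0.92, 0.0]],  # image 0: one detection
    None,                                      # image 1: nothing above threshold
]
print(count_detections(pred))
```

Checking for None before indexing into pred[i] avoids a TypeError further down the pipeline when an image has no detections.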
@glenn-jocher Thanks for the reply
For the same image
In detect.py
When I used
model = attempt_load(weights, map_location=device)
pred = model(img, augment=opt.augment)[0]
pred has 3 dims, and after NMS the result is perfect.
I changed it to
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True).to(device)
pred = model(img, augment=opt.augment)[0]
pred has 5 dims, and after NMS pred = [None]
Do I need to reshape the pred before NMS?
@1chimaruGin torch hub model may be in training mode rather than eval mode.
@glenn-jocher Ah Thank you.
Got it!
@glenn-jocher
I am commenting on this issue because I have the same problem. I integrated detect.py into an existing project, but I get this error message:
model = attempt_load(weights_file, map_location=device)
File "/home/florian/PycharmProjects/eyesr_custom_ai_detector/CustomDetector/detector_files/yolov5/models/experimental.py", line 137, in attempt_load
model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval()) # load FP32 model
File "/home/florian/PycharmProjects/eyesr_custom_ai_detector/virtual_env/lib/python3.6/site-packages/torch/serialization.py", line 584, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/home/florian/PycharmProjects/eyesr_custom_ai_detector/virtual_env/lib/python3.6/site-packages/torch/serialization.py", line 842, in _load
result = unpickler.load()
ModuleNotFoundError: No module named 'models'
The previous messages don't really explain how to fix this (other than going back to the cfg file and so on). What exactly is the solution to this error message?
I use Pytorch 1.6 and Python 3.6
@FlorianRuen
I faced the same problem when I used attempt_load(weights_file, map_location=device) from outside this repo.
So I load the pretrained model from the hub:
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True).to(device).eval()
@1chimaruGin so you assume that every time we launch the script, it will download from the hub, so we need network access on the device that will execute the project ?
If I change the line from:
model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval()) # load FP32 model
to
model.append(torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True).to(map_location).eval())
I got a SSL CERTIFICATE_VERIFY_FAILED, that I can easily correct.
But the other problem is that I'm using the exact same directory structure. So when I try to run the script, it says it can't find utils.google_utils, which is expected because the path should be detector_files.ultralytics.yolov5.utils.google_utils
@FlorianRuen
Yeah, it will download from the hub, but only once.
I mean in detect.py line 35, not in experimental.py.
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True).to(device).eval()
@1chimaruGin Thanks, but if I run the detect.py without any changes in folders architecture it works, my need is to change the yolov5 folders architecture to fit a bigger project, so I need to change the imports to fit my architecture.
It is very strange that even after I change the imports in all files, there is still a call to "models", which is the wrong path... I don't know where this call comes from. I will dig deeper to find a solution. @glenn-jocher, do you have an idea how to fix this?
Thanks again for your help
have you tried to add
import sys
sys.path.insert(0, "path/to/yolov5")
to the file where the bug occurs?
The same issue here. It is actually very annoying :(
I've trained the small model on a custom dataset and now I am trying to integrate it into another project. I've copied the models and utils folders, fixed the imports there. When I attempt loading the model I get the same issue - ModuleNotFoundError: No module named 'models'. When I run the same model from the original repo, works like a charm.
Has anyone found a solution to the problem? @glenn-jocher do you possibly have any suggestions?
Thanks.
I hope this pull request will resolve the issue. See [this branch on my fork](https://github.com/PetrDvoracek/yolov5/tree/fix-no-module-models-in-export)
I hit the same issue. I used the new version and integrated it into another project.
torch.load() requires the model's module to be in the same folder; see https://stackoverflow.com/questions/42703500/best-way-to-save-a-trained-model-in-pytorch
torch.save(the_model, PATH)
Then later:
the_model = torch.load(PATH)
However in this case, the serialized data is bound to the specific classes and the exact directory structure used, so it can break in various ways when used in other projects, or after some serious refactors.
To fix it save and load only the model parameters. https://github.com/pytorch/pytorch/issues/3678
PyTorch internally uses pickle and it's a limitation of pickle. You can try meddling with sys.path to include the directory where module.py is. This is exactly why we recommend saving only the state dicts and not whole model objects.
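The pickle limitation quoted above can be reproduced without PyTorch. The following is a minimal sketch using only the standard library; the module name `toy_models` and the class `Net` are hypothetical stand-ins for the repo's `models` package, and the temporary directory stands in for the repo root. It shows that a pickled object stores only a reference to its class's module, so loading fails when that module is not importable and succeeds again once its directory is on sys.path.

```python
import importlib
import os
import pickle
import sys
import tempfile

# Write a throwaway module named "toy_models" (hypothetical) to a temp dir.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "toy_models.py"), "w") as f:
    f.write("class Net:\n    def __init__(self):\n        self.weights = [1.0, 2.0]\n")

# Pickle an instance while the module is importable.
sys.path.insert(0, module_dir)
toy_models = importlib.import_module("toy_models")
blob = pickle.dumps(toy_models.Net())

# Simulate a different project: the module is no longer importable.
sys.path.remove(module_dir)
del sys.modules["toy_models"]
try:
    pickle.loads(blob)
    load_failed = False
except ModuleNotFoundError as err:
    load_failed = True
    print(err)

# Putting the directory back on sys.path makes the same bytes loadable again.
sys.path.insert(0, module_dir)
restored = pickle.loads(blob)
print(restored.weights)
```

This is also why the sys.path.insert suggestion earlier in the thread can help: it makes the original top-level module name importable again without changing the saved file.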
I tried yxxxqqq's solution and ran into the problem he mentioned. After some effort, here is my solution. In the attempt_load function, I save the model's state_dict():
def attempt_load(weights, map_location=None):
    # Loads an ensemble of models weights=[a,b,c] or a single model weights=[a] or weights=a
    model = Ensemble()
    for w in weights if isinstance(weights, list) else [weights]:
        attempt_download(w)
        model2 = torch.load(w, map_location=map_location)['model']
        torch.save(model2.state_dict(), '/path/to/best_state_model.pth')
        ....
    # Compatibility updates
    for m in model.modules():
        if type(m) in [nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6]:
            m.inplace = True  # pytorch 1.7.0 compatibility
        elif type(m) is Conv:
            m._non_persistent_buffers_set = set()  # pytorch 1.6.0 compatibility
    print("len model = {}".format(len(model)))
    if len(model) == 1:
        print("pass only")
        print()
        return model[-1]  # return model
    else:
        print('Ensemble created with %s\n' % weights)
        for k in ['names', 'stride']:
            setattr(model, k, getattr(model[-1], k))
        return model  # return ensemble
And I load the weights using:
model = Model(cfg='/path/to/yolov5s.yaml', nc=1)
print(model.state_dict().keys())
print(len(model.eval().state_dict().keys()))
model.load_state_dict(torch.load('/path/to/best_state_model.pth', map_location=device))
With these changes, the two model-loading methods give the same results.
I hope that my solution can help someone.
Do you mean modify the attempt_load function in official repo and output the dict and use the dict to your own project ?
@Stephenfang51 @Mostafa-Elmenbawy @hungthanhpham94 to run custom or official YOLOv5 models within separate projects see PyTorch Hub Tutorial: https://docs.ultralytics.com/yolov5
what about yolov3? Thanks!
@Stephenfang51 yes, PyTorch Hub works with ultralytics/yolov3 also.
model = torch.hub.load('ultralytics/yolov3', "yolov3-tiny", classes=2)
ckpt = torch.load(weights)
model.load_state_dict(ckpt['model'].state_dict()) # load state_dict
model.names = ckpt.names # define class names
and this results in the error
raise RuntimeError('Cannot find callable {} in hubconf'.format(model))
RuntimeError: Cannot find callable yolov3-tiny in hubconf
@Stephenfang51 dash characters are not supported in any PyTorch Hub models.
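One likely reason for this, offered as a reading of the error above rather than a statement of torch.hub internals: the model name is looked up as a callable in the repo's hubconf, and a function defined in a Python file must have a name that is a valid identifier. A quick stdlib check (`is_valid_entrypoint` is a hypothetical helper):

```python
def is_valid_entrypoint(name):
    """True if `name` could be the name of a function defined in a Python file."""
    return name.isidentifier()


print(is_valid_entrypoint("yolov3-tiny"))  # False: dashes are not allowed
print(is_valid_entrypoint("yolov3_tiny"))  # True: underscores are fine
```

So a name such as yolov3_tiny, with an underscore, is the form to try.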
Thanks for the quick reply, however..
model.names = ckpt.names # define class names
AttributeError: 'dict' object has no attribute 'names'
Tutorial is updated, names is a model attribute. See #36 (https://docs.ultralytics.com/yolov5/tutorials/pytorch_hub_model_loading)
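A minimal sketch of the AttributeError above, without PyTorch. It assumes, as the earlier snippets in this thread show, that the loaded checkpoint is a dict whose 'model' entry holds the model object. SimpleNamespace is only a stand-in for that model, and the class names are made up for illustration.

```python
from types import SimpleNamespace

# Stand-in for torch.load(weights): a dict whose 'model' entry is the model object.
ckpt = {"model": SimpleNamespace(names=["person", "car"])}

try:
    ckpt.names  # the checkpoint itself is a dict, so attribute access fails
except AttributeError as err:
    print(err)

names = ckpt["model"].names  # names live on the model stored inside the checkpoint
print(names)
```

In other words, index into the checkpoint with ['model'] first, then read names from the result.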
sorry i am new to your excellent project, but still dont get it. and I didn't find answer in #36
See PyTorch Hub Tutorial for directions: https://docs.ultralytics.com/yolov5
To load a custom model, first load a PyTorch Hub model of the same architecture with the same number of classes, and then load a custom state dict into it. This example loads a custom 10-class YOLOv5s model 'yolov5s_10cls.pt':
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', classes=10)
ckpt = torch.load('yolov5s_10cls.pt')['model'] # load checkpoint
model.load_state_dict(ckpt.state_dict()) # load state_dict
model.names = ckpt.names # define class names
here is mine
# Load model
model = torch.hub.load('ultralytics/yolov3', "yolov3_tiny", classes=2)
ckpt = torch.load(weights)['model']
model.load_state_dict(ckpt.state_dict())
model.names = ckpt.names
img = Image.open(source_img)
# print(model)
results = model(img, size=img_size)
results.print()
errors
TypeError: forward() got an unexpected keyword argument 'size'
@Stephenfang51 only .autoshape() models can accept PIL images. The normal model you are using only accepts standard pytorch inputs.
so should I convert PIL images to torch.tensor ?
@Stephenfang51 yes you can do that, or you can convert your model into an .autoshape() model as the tutorial shows.
model = model.autoshape()
after your suggestion
RuntimeError: Error(s) in loading state_dict for autoShape:
Missing key(s) in state_dict: "model.model.0.conv.weight", "model.model.0.bn.weight", "model.model.0.bn.bias", "model.model.0.bn.running_mean", "model.model.0.bn.running_var", "model.model.2.conv.weight", "model.model.2.bn.weight", "model.model.2.bn.bias", "model.model.2.bn.running_mean", "model.model.2.bn.running_var", "model.model.4.conv.weight", "model.model.4.bn.weight", "model.model.4.bn.bias", "model.model.4.bn.running_mean", "model.model.4.bn.running_var", "model.model.6.conv.weight", "model.model.6.bn.weight", "model.model.6.bn.bias", "model.model.6.bn.running_mean", "model.model.6.bn.running_var", "model.model.8.conv.weight", "model.model.8.bn.weight", "model.model.8.bn.bias", "model.model.8.bn.running_mean", "model.model.8.bn.running_var", "model.model.10.conv.weight", "model.model.10.bn.weight", "model.model.10.bn.bias", "model.model.10.bn.running_mean", "model.model.10.bn.running_var", "model.model.13.conv.weight", "model.model.13.bn.weight", "model.model.13.bn.bias", "model.model.13.bn.running_mean", "model.model.13.bn.running_var", "model.model.14.conv.weight", "model.model.14.bn.weight", "model.model.14.bn.bias", "model.model.14.bn.running_mean", "model.model.14.bn.running_var", "model.model.15.conv.weight", "model.model.15.bn.weight", "model.model.15.bn.bias", "model.model.15.bn.running_mean", "model.model.15.bn.running_var", "model.model.16.conv.weight", "model.model.16.bn.weight", "model.model.16.bn.bias", "model.model.16.bn.running_mean", "model.model.16.bn.running_var", "model.model.19.conv.weight", "model.model.19.bn.weight", "model.model.19.bn.bias", "model.model.19.bn.running_mean", "model.model.19.bn.running_var", "model.model.20.anchors", "model.model.20.anchor_grid", "model.model.20.m.0.weight", "model.model.20.m.0.bias", "model.model.20.m.1.weight", "model.model.20.m.1.bias".
Unexpected key(s) in state_dict: "model.0.conv.weight", "model.0.bn.weight", "model.0.bn.bias", "model.0.bn.running_mean", "model.0.bn.running_var", "model.0.bn.num_batches_tracked", "model.2.conv.weight", "model.2.bn.weight", "model.2.bn.bias", "model.2.bn.running_mean", "model.2.bn.running_var", "model.2.bn.num_batches_tracked", "model.4.conv.weight", "model.4.bn.weight", "model.4.bn.bias", "model.4.bn.running_mean", "model.4.bn.running_var", "model.4.bn.num_batches_tracked", "model.6.conv.weight", "model.6.bn.weight", "model.6.bn.bias", "model.6.bn.running_mean", "model.6.bn.running_var", "model.6.bn.num_batches_tracked", "model.8.conv.weight", "model.8.bn.weight", "model.8.bn.bias", "model.8.bn.running_mean", "model.8.bn.running_var", "model.8.bn.num_batches_tracked", "model.10.conv.weight", "model.10.bn.weight", "model.10.bn.bias", "model.10.bn.running_mean", "model.10.bn.running_var", "model.10.bn.num_batches_tracked", "model.13.conv.weight", "model.13.bn.weight", "model.13.bn.bias", "model.13.bn.running_mean", "model.13.bn.running_var", "model.13.bn.num_batches_tracked", "model.14.conv.weight", "model.14.bn.weight", "model.14.bn.bias", "model.14.bn.running_mean", "model.14.bn.running_var", "model.14.bn.num_batches_tracked", "model.15.conv.weight", "model.15.bn.weight", "model.15.bn.bias", "model.15.bn.running_mean", "model.15.bn.running_var", "model.15.bn.num_batches_tracked", "model.16.conv.weight", "model.16.bn.weight", "model.16.bn.bias", "model.16.bn.running_mean", "model.16.bn.running_var", "model.16.bn.num_batches_tracked", "model.19.conv.weight", "model.19.bn.weight", "model.19.bn.bias", "model.19.bn.running_mean", "model.19.bn.running_var", "model.19.bn.num_batches_tracked", "model.20.anchors", "model.20.anchor_grid", "model.20.m.0.weight", "model.20.m.0.bias", "model.20.m.1.weight", "model.20.m.1.bias".
Is anything wrong with my code? I gave up on yolov3 and tried yolov5, but I still get an error. My torch version is 1.6.
weights = "yolov5_pretrained/best.pt"
model = torch.hub.load('ultralytics/yolov5', "yolov5s", classes=2)
ckpt = torch.load(weights)['model']
model.load_state_dict(ckpt.state_dict())
model.names = ckpt.names
img = Image.open(source_img)
results = model(img, size=640)
results.print()
errors
result = self.forward(*input, **kwargs)
TypeError: forward() got an unexpected keyword argument 'size'
@Stephenfang51 I'll try to produce an example using loading a 20-class VOC trained model.
Remember if you are using YOLOv3, all older models trained with the archive branch are not forward compatible. To load a custom YOLOv3 model in PyTorch Hub, it must be trained with the new master branch that is YOLOv5 forward compatible.
From ultralytics/yolov3:
BRANCH NOTICE: The ultralytics/yolov3 repository is now divided into two branches:
$ git clone https://github.com/ultralytics/yolov3 # master branch (default)
$ git clone -b archive https://github.com/ultralytics/yolov3 # archive branch
TODO: Simplify custom model loading, i.e.
model = torch.hub.load('ultralytics/yolov5', 'custom', weights='yolov5_custom.pt')
@Stephenfang51 I've updated YOLOv5 PyTorch Hub functionality to allow for much simpler loading of custom models of any architecture created with our YOLOv3/5 repos. Please git pull to receive the latest updates, and then try the new method (you may need to use force_reload=True to update your hub cache):
This example loads a custom 20-class VOC-trained YOLOv5s model 'yolov5s_voc_best.pt' with PyTorch Hub.
model = torch.hub.load('ultralytics/yolov5', 'custom', path_or_model='yolov5s_voc_best.pt', force_reload=True)
model = model.autoshape() # for PIL/cv2/np inputs and NMS
@glenn-jocher
I think the reason is that the line model = model.autoshape() should be placed after model.load_state_dict(checkpoint.state_dict()). I tried the following and it works now!
if __name__ == '__main__':
    model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=False, classes=2)
    checkpoint = torch.load('yolov5_pretrained/best.pt')['model']
    model.load_state_dict(checkpoint.state_dict())
    model = model.autoshape()  # for PIL/cv2/np inputs and NMS
    model.names = checkpoint.names
    img = Image.open('demo_video/image_079.jpg')
    pred = model(img)
    pred.print()
    pred.save()
And of course I used YOLOv3 (master), which is compatible with your YOLOv5. Anyway, the problem is solved now.
Thanks for your Help 👍
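The missing and unexpected key lists in the load_state_dict error above differ mainly by one leading "model." prefix, which is consistent with the weights being loaded into a wrapper that holds the network as its model attribute. The sketch below illustrates that pattern with plain lists of key names; `diff_keys` and `add_prefix` are hypothetical helpers, and only two keys from the error message are used.

```python
def diff_keys(expected, provided):
    """Return (missing, unexpected) key lists for two collections of key names."""
    missing = sorted(set(expected) - set(provided))
    unexpected = sorted(set(provided) - set(expected))
    return missing, unexpected


def add_prefix(keys, prefix):
    """Prepend `prefix` to every key, mimicking one extra level of wrapping."""
    return [prefix + k for k in keys]


# Two keys taken from the error message above.
checkpoint_keys = ["model.0.conv.weight", "model.0.bn.weight"]
# A wrapper that stores the network as its `model` attribute expects one more prefix.
wrapped_keys = add_prefix(checkpoint_keys, "model.")

missing, unexpected = diff_keys(wrapped_keys, checkpoint_keys)
print(missing)
print(unexpected)
```

Loading the state dict into the unwrapped model first and wrapping afterwards, as in the working snippet above, keeps the key names aligned.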
My environment and problem:
There is no problem for object detection, and it's a great job, thank you!
However, I want to use this repo as a detector in my project, as the first stage. But I can't use 'torch.load()' to load the weights you provided; I get the following error:
My solution
New problem
I don't know what the problem is. And I don't understand why you use this save method instead of another, more flexible way. Do you have any good ideas about my problem? Thank you very much!