Closed e-shawakri closed 3 years ago
Hello @e-shawakri, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook , Docker Image, and Google Cloud Quickstart Guide for example environments.
If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.
If this is a custom model or data training question, please note that Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:
For more information please visit https://www.ultralytics.com.
Same issue here. I built PyTorch from source (master branch, version 1.6.0) because my environment uses CUDA 10.0. When I try to train from a pretrained checkpoint with
python train.py --img 640 --batch 16 --epochs 10 --data ./data/coco128.yaml --cfg ./models/yolov5s.yaml --weights 'weights/yolov5s.pt'
I get the following error:
/home/xxx/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/serialization.py:646: SourceChangeWarning: source code of class 'torch.nn.modules.container.Sequential' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/xxx/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/serialization.py:646: SourceChangeWarning: source code of class 'torch.nn.modules.conv.Conv2d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/xxx/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/serialization.py:646: SourceChangeWarning: source code of class 'torch.nn.modules.batchnorm.BatchNorm2d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/xxx/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/serialization.py:646: SourceChangeWarning: source code of class 'torch.nn.modules.activation.LeakyReLU' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/xxx/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/serialization.py:646: SourceChangeWarning: source code of class 'torch.nn.modules.container.ModuleList' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/xxx/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/serialization.py:646: SourceChangeWarning: source code of class 'torch.nn.modules.pooling.MaxPool2d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/xxx/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/serialization.py:646: SourceChangeWarning: source code of class 'torch.nn.modules.upsampling.Upsample' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
Traceback (most recent call last):
File "train.py", line 398, in <module>
train(hyp)
File "train.py", line 117, in train
{k: v for k, v in ckpt['model'].state_dict().items() if model.state_dict()[k].numel() == v.numel()}
File "/home/xxx/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/nn/modules/module.py", line 783, in state_dict
module.state_dict(destination, prefix + name + '.', keep_vars=keep_vars)
File "/home/xxx/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/nn/modules/module.py", line 783, in state_dict
module.state_dict(destination, prefix + name + '.', keep_vars=keep_vars)
File "/home/xxx/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/nn/modules/module.py", line 783, in state_dict
module.state_dict(destination, prefix + name + '.', keep_vars=keep_vars)
[Previous line repeated 1 more time]
File "/home/xxx/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/nn/modules/module.py", line 780, in state_dict
self._save_to_state_dict(destination, prefix, keep_vars)
File "/home/xxx/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/nn/modules/module.py", line 743, in _save_to_state_dict
if buf is not None and name not in self._non_persistent_buffers_set:
File "/home/xxx/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/nn/modules/module.py", line 655, in __getattr__
type(self).__name__, name))
torch.nn.modules.module.ModuleAttributeError: 'BatchNorm2d' object has no attribute '_non_persistent_buffers_set'
CentOS-7 CUDA-10.0 GPU-T4
I don't think this is specific to this repo; I would raise it over on the apex repo. You can also try one of our working environments:
To access an up-to-date working environment (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled), consider a:
@glenn-jocher I don't think this has anything to do with NVIDIA APEX, since I haven't installed it and I'm facing the same error. I'm using NVIDIA driver 450 and CUDA 10.2.
It only happens when I use the --weights parameter.
I was using torch 1.6.x (built from git source); deleting it and installing torch 1.5.1 solved the problem.
This is a pytorch 1.6 problem. I'm seeing it also when using the official 1.6 today.
Same issue with torch 1.6.0 on Ubuntu 20.04:
File "/media/frank/LinDB/PyProgram/ai/yolov5/yolov5-master0708/predict.py", line 39, in predict
model = attempt_load(weights, map_location=device) # load FP32 model
File "/media/frank/LinDB/PyProgram/ai/yolov5/yolov5-master0708/models/experimental.py", line 130, in attempt_load
model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval()) # load FP32 model
File "/media/frank/LinDB/PyProgram/ai/yolov5/yolov5-master0708/models/yolo.py", line 148, in fuse
m.conv = torch_utils.fuse_conv_and_bn(m.conv, m.bn) # update conv
File "/home/frank/miniconda3/envs/ai/lib/python3.7/site-packages/torch/nn/modules/module.py", line 802, in __setattr__
remove_from(self.__dict__, self._parameters, self._buffers, self._non_persistent_buffers_set)
File "/home/frank/miniconda3/envs/ai/lib/python3.7/site-packages/torch/nn/modules/module.py", line 772, in __getattr__
type(self).__name__, name))
torch.nn.modules.module.ModuleAttributeError: 'Conv' object has no attribute '_non_persistent_buffers_set'
After I reinstalled torch==1.5.0, the issue was gone. Only the warning below remains, and I still get the result.
SourceChangeWarning: source code of class 'torch.nn.modules.conv.Conv2d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
I have the same problem with torch=1.6.0
File "detect.py", line 23, in detect
model = attempt_load(weights, map_location=device) # load FP32 model
File "/home/gyhd/Desktop/yolov5/models/experimental.py", line 133, in attempt_load
model.append(torch.load(w, map_location=map_location)['model'].float().fuse().eval()) # load FP32 model
File "/home/gyhd/Desktop/yolov5/models/yolo.py", line 151, in fuse
m.conv = torch_utils.fuse_conv_and_bn(m.conv, m.bn) # update conv
File "/home/gyhd/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 802, in __setattr__
remove_from(self.__dict__, self._parameters, self._buffers, self._non_persistent_buffers_set)
File "/home/gyhd/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 772, in __getattr__
type(self).__name__, name))
torch.nn.modules.module.ModuleAttributeError: 'Conv' object has no attribute '_non_persistent_buffers_set'
I do not want to reinstall torch==1.5, and I cannot find torch.nn.Module.dump_patches = True in torch/nn/modules/module.py.
Could you tell me how I should retrieve the original source code to solve the problem cleanly? Thanks very much.
This repo is fully pytorch 1.6 compatible now. If you are having problems update your code and your models, as both have been updated for 1.6 compatibility.
To update your models simply delete any official pretrained weights you have and download the latest. If you have 1.5 trained custom models they will not work with 1.6.
The issue is raised in https://github.com/pytorch/pytorch/issues/42242, but it could be avoided by saving only the model state dict. Is there any reason to save the entire model as the checkpoint, given that those PyTorch modules are subject to change?
BTW, is it possible to distribute versioned checkpoints using GitHub releases? The checkpoint format also seems to have changed in pytorch-1.6, as raised in https://github.com/pytorch/pytorch/issues/42239: it is now a zip archive.
@farleylai it's possible to do anything under the sun but with only 24 hours in a day I have to prioritize my time, so no unfortunately there won't be retroactive support for legacy versions, at least not from our end. If you'd like to take some action and submit a PR feel free to.
If anyone is having any problems with pytorch 1.6 or v2.0 of this repo, simply start from a clean slate, reclone the repo, autodownload any models you require and everything will work correctly.
@glenn-jocher
A GitHub release can only be initiated on your side. Releases naturally support uploading versioned binary artifacts, each of which can be up to 2GB. Moreover, the download would be more straightforward than GDrive, since built-in APIs such as torch.hub.download_url_to_file() and torch.hub.load_state_dict_from_url() suffice. If you mean rewriting APIs like attempt_download() to work by version/tag and to load/save the state_dict to/from the checkpoint instead of the entire model, I can manage to submit a PR, and I believe it would pay off in the long term.
@farleylai ah interesting. I was under the impression github would not host large files for free, and naturally we don't want to include weights in the main repo, as then every git clone would be far too large. I have not investigated this in a while though, perhaps things have changed?
Right now we host weights in Gdrive for most worldwide users, with a backup GCP bucket for China mainland users. attempt_download() will try one if the other fails, and also adds redundancy for everyone else outside China on occasional single-source failure.
If you'd like to take the lead on this though that would be awesome! At the moment I'm 110% in over my head simply doing research on model updates and maintaining origin/master in working order.
With respect to loading from state_dict, we used this approach in https://github.com/ultralytics/yolov3, but abandoned it. We required users to submit a pair of arguments for loading: a model cfg and a model *.pt file with state_dict weights. Too often users would mismatch the two, receive error messages and then raise bug reports, wasting everyone's time. Our new strategy is to emphasize design that makes things harder for users to break, because if there's a way they will find it. This means fewer settings, fewer knobs to turn, fewer arguments to submit when running things, etc.
@farleylai about the v1.0 weights themselves, I have copies of these I can send you. If you can replace attempt_download() functionality with the builtin torch functionality that would be great. The main requirements are:
I think that's it. Version control as you mention would be nice too of course. We don't currently include any version information in the weights themselves to keep things simple, but this does create confusion as you can see.
The git clone should be separate from the git repo to make sense: https://docs.github.com/en/github/managing-large-files/distributing-large-binaries.
It would be great if each set of released checkpoints could be associated with a git tag. So far, we have made copies on S3 and retrieve them by version/tag, which makes it easy to compare the differences. If this can be done at the level of your repo release/APIs, life would be easier.
Regarding the model/state_dict matching, perhaps the version or some git hash tag including the config could be inserted into the checkpoint or even the state_dict as definitive proof? Otherwise, external breakage from PyTorch changes may still happen in the future.
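The version stamp suggested here could look roughly like the sketch below: the checkpoint is a plain dict that carries a release tag and a hash of the model config next to the weights, and a check reports any mismatch before the weights are used. The key names (version, cfg_sha256), the helper names and the tag values are assumptions made up for illustration, not the repo's actual checkpoint format, and the weights are a placeholder dict so the example runs without PyTorch.

```python
import hashlib


def cfg_fingerprint(cfg_text):
    """Return a stable SHA-256 hex digest of the model config text."""
    return hashlib.sha256(cfg_text.encode("utf-8")).hexdigest()


def make_checkpoint(state_dict, cfg_text, version):
    """Bundle weights with the release tag and config hash they belong to.

    The key names are hypothetical, chosen only for this sketch.
    """
    return {
        "version": version,
        "cfg_sha256": cfg_fingerprint(cfg_text),
        "state_dict": state_dict,
    }


def check_checkpoint(ckpt, cfg_text, expected_version):
    """Return a list of human-readable mismatches (empty if compatible)."""
    problems = []
    if ckpt.get("version") != expected_version:
        problems.append(
            f"checkpoint version {ckpt.get('version')!r} != code version {expected_version!r}"
        )
    if ckpt.get("cfg_sha256") != cfg_fingerprint(cfg_text):
        problems.append("checkpoint was saved with a different model config")
    return problems


# Placeholder weights and config, for illustration only.
cfg = "nc: 80\ndepth_multiple: 0.33\n"
ckpt = make_checkpoint({"layer.weight": [0.1, 0.2]}, cfg, version="v1.0")

same_release = check_checkpoint(ckpt, cfg, expected_version="v1.0")
newer_code = check_checkpoint(ckpt, cfg, expected_version="v3.0")
```

With a stamp like this, copying v1.0 weights into a v3.0 checkout would produce a clear version message instead of an attribute error deep inside PyTorch.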
As for the PyTorch hub download APIs based on urllib, additional redundancy and failure retry would be the user's responsibility. Nonetheless, adding simple exponential backoff with a maximum retry count should be possible. I think you can make a first release for testing the download.
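For reference, exponential backoff with a retry cap can be sketched in a few lines on top of urllib. The function name, its parameters and the injectable fetch/sleep arguments are assumptions for this example (they make it testable offline); this is not the repo's attempt_download() nor a Torch Hub API.

```python
import time
import urllib.request


def download_with_backoff(url, dest, max_retries=5, base_delay=1.0,
                          fetch=urllib.request.urlretrieve, sleep=time.sleep):
    """Download url to dest, retrying on network errors.

    Waits base_delay * 2**attempt seconds between attempts and gives up
    after max_retries retries. Returns the number of retries that were needed.
    """
    for attempt in range(max_retries + 1):
        try:
            fetch(url, dest)
            return attempt
        except OSError:  # urllib.error.URLError is a subclass of OSError
            if attempt == max_retries:
                raise  # out of retries: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)
```

Passing a different fetch callable (for example one that wraps torch.hub.download_url_to_file()) would reuse the same retry loop; whether that call raises OSError subclasses on every failure is something to verify before relying on it.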
Other usability improvements you may consider are package organization and distribution. Torch Hub APIs support exposing additional entry points, which would be useful for calling training/detection APIs programmatically rather than just loading the model. Exposing compiled CUDA modules, if any, is unlikely to work, though, since Torch Hub merely downloads and extracts the sources in the repo. A recipe to build a versioned conda distribution, as PyTorch does, is then likely necessary to help manage the dependencies and command-line usage for training/detection/testing/etc. Perhaps the top-level package should be reorganized to something like yolov5 when exposed in the Python path.
BTW, since pytorch-1.6 has apex integrated, that part of the dependency may be removed soon?
@farleylai thanks buddy. Ok I think I understand better, we are really talking about two different things:
We are already doing the latter, but you are recommending we migrate to a different method. We have the weights already hosted at static URIs in GCP buckets, so a transition to S3 would not gain us anything. The main problem I see from your explanation is the cost. Our current solution does not incur storage or egress charges for most users, which is a must as our download volumes are in excess of what we want to (are able to) support out of our own pockets in the long term. For the former you're saying we should start doing this. Can you add files to releases retroactively or is this something you're saying we should try to incorporate into v3.0?
Yes, AMP is great, it works very well. We have a PR open for this, I'm simply waiting on Google Colab to update their environment, as the change breaks 1.5.1 compatibility (the default colab pytorch currently). With the AMP release we'll update the requirements.txt to torch>=1.6.
@glenn-jocher Just played with the GitHub Release. It definitely supports editing past releases by adding/removing files to distribute as assets. Those release artifacts are NOT counted towards the repo storage usage nor cloned as the repo sources.
Perhaps I did not make it clear. S3 is just an example: we use it to version the official checkpoints and our fine-tuned ones, because so far the code simply downloads the latest checkpoint through the same link, without an option to specify a tag or anything similar. I am just proposing to add this option to the download API to make it easy to switch between checkpoint versions. In that sense, the API could be enhanced to accept full download specs for supported cloud storage as extensions, not limited to GCP, S3 and so on, but anything the user can manage or host reliably.
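A tag option of this kind could be as small as the sketch below, which builds the download URL for a release asset from a tag and a file name. The helper name and its defaults are assumptions for illustration; the URL layout follows GitHub's usual release-asset pattern, and whether a particular file is actually attached to a particular release is not checked here.

```python
def release_asset_url(filename, tag=None, repo="ultralytics/yolov5"):
    """Build the download URL of a GitHub release asset.

    tag=None points at the most recent release; a tag such as "v1.0"
    pins the download to that release.
    """
    base = f"https://github.com/{repo}/releases"
    if tag is None:
        return f"{base}/latest/download/{filename}"
    return f"{base}/download/{tag}/{filename}"


pinned = release_asset_url("yolov5s.pt", tag="v1.0")
latest = release_asset_url("yolov5s.pt")
```

The resulting URL can then be handed to a downloader such as torch.hub.download_url_to_file(), mentioned earlier in this thread.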
I think I've added the v1.0 and v2.0 models successfully now to the release files :) See https://github.com/ultralytics/yolov5/releases/tag/v1.0
I just updated pytorch from 1.5.1 to 1.6.0 and this error appeared.
@glenn-jocher when I trained my model with yolov5 v3.0, I got the following error:
My environment is as follows:
@shliang0603 it appears you may have environment problems. Please ensure you meet all dependency requirements if you are attempting to run YOLOv5 locally. If in doubt, create a new virtual Python 3.8 environment, clone the latest repo (code changes daily), and pip install -r requirements.txt again. We also highly recommend using one of our verified environments below.
Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.6. To install run:
$ pip install -r requirements.txt
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are passing. These tests evaluate proper operation of basic YOLOv5 functionality, including training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu.
@glenn-jocher @sailfish009 Can you tell me your CUDA version? I have CUDA 10.2 and yolov5 v3.0. When I install torch 1.5.1 I get the error: AttributeError: module 'torch.nn' has no attribute 'Hardswish'! When I install torch 1.6.0 it reports the error: torch.nn.modules.module.ModuleAttributeError: 'BatchNorm2d' object has no attribute '_non_persistent_buffers_set' again! It's an endless cycle! I'm going to cry!
@shliang0603 _non_persistent_buffers_set error is well documented in the issues here. This error occurs when trying to load a pytorch 1.5.1 trained model with pytorch 1.6.
1.5.1 is no longer supported. cuda version varies by hardware, we use google colab and google cloud deep learning vms with 11.0 now. No issues on either.
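Why an old full-model checkpoint trips over a new attribute can be shown without PyTorch. The toy sketch below restores an object roughly the way unpickling does by default: it allocates the instance without calling __init__ and copies back the attributes that were saved. LayerV1, LayerV2 and restore are made-up stand-ins, not PyTorch classes or functions.

```python
class LayerV1:
    """Stand-in for a layer class as defined by the older library version."""

    def __init__(self):
        self.weight = [1.0, 2.0]


class LayerV2:
    """The same layer after an upgrade: __init__ now sets one more attribute."""

    def __init__(self):
        self.weight = [1.0, 2.0]
        self._non_persistent_buffers_set = set()

    def state_dict(self):
        # Newer library code assumes the attribute always exists.
        if "weight" in self._non_persistent_buffers_set:
            return {}
        return {"weight": self.weight}


def restore(cls, saved_attrs):
    """Rebuild an object from saved attributes without running __init__."""
    obj = cls.__new__(cls)
    obj.__dict__.update(saved_attrs)
    return obj


# "Save" with the old class, "load" against the new class definition.
saved = dict(LayerV1().__dict__)
old_layer = restore(LayerV2, saved)

try:
    old_layer.state_dict()
    error = None
except AttributeError as exc:
    error = str(exc)  # the attribute was never saved, so it is missing

# Adding the attribute by hand makes the restored object usable again.
old_layer._non_persistent_buffers_set = set()
patched_state = old_layer.state_dict()
```

Saving only the weights and rebuilding the model from current code avoids this class of problem, at the cost of having to keep the config and weights matched.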
The error has been resolved: torch.nn.modules.module.ModuleAttributeError: 'BatchNorm2d' object has no attribute '_non_persistent_buffers_set'
[English Note]
1. First, my environment:
Ubuntu: 18.04
CUDA: 10.2
Python: 3.8
PyTorch: 1.6.0
torchvision: 0.7.0
2. Training command
python train.py --img 640 --batch 16 --epochs 300 --data ./data/my_data.yaml --cfg ./models/yolov5l.yaml --weights ./weights/yolov5l.pt --device 1
3. Problem analysis
I have solved the problem. I don't think this is a bug in PyTorch 1.6.0. I hit this error because I was running yolov5 v3.0 with a pretrained yolov5l.pt that had been downloaded with yolov5 v1.0. The download of the pretrained yolov5l.pt was slow and I was lazy, so I copied the yolov5l.pt downloaded with yolov5 v1.0 into yolov5 v3.0, which caused the error: torch.nn.modules.module.ModuleAttributeError: 'BatchNorm2d' object has no attribute '_non_persistent_buffers_set'
4. Solution
Download the pretrained yolov5l.pt model again directly from yolov5 v3.0.
[Chinese Note]
1. First, my environment:
Ubuntu: 18.04
CUDA: 10.2
Python: 3.8
PyTorch: 1.6.0
torchvision: 0.7.0
2. Training command
python train.py --img 640 --batch 16 --epochs 300 --data ./data/my_data.yaml --cfg ./models/yolov5l.yaml --weights ./weights/yolov5l.pt --device 1
3. Root cause analysis
I have solved this problem. I don't think it is a bug in PyTorch 1.6.0. I ran into this error because the pretrained model yolov5l.pt I used with yolov5 v3.0 had been downloaded with yolov5 v1.0. The pretrained yolov5l.pt downloads rather slowly and I was being lazy, so I copied the yolov5l.pt downloaded with yolov5 v1.0 into yolov5 v3.0, which produced the error: torch.nn.modules.module.ModuleAttributeError: 'BatchNorm2d' object has no attribute '_non_persistent_buffers_set'
4. Solution
Re-download the pretrained yolov5l.pt model directly in yolov5 v3.0.
@glenn-jocher Thanks for your reply, I have solved the problem.
@shliang0603 in principle you can simply add a _non_persistent_buffers_set set to every YOLOv5 module to fix this problem, but I would simply recommend using the latest models instead.
for k, m in model.named_modules():
    m._non_persistent_buffers_set = set()  # pytorch 1.6.0 compatibility
@glenn-jocher OK,Thanks.
Hi, I just did a git pull and I have the same issue here: "torch.nn.modules.module.ModuleAttributeError: 'BatchNorm2d' object has no attribute '_non_persistent_buffers_set'" without using pretrained weights (Ubuntu Bionic, torchvision 0.7.0 & torch 1.6.0), running the command line below. The cfg file is a copy of the one inside the models folder with only the number of classes adjusted. Everything had worked well for weeks until today.
python3.6 train.py --epochs 1024 --batch-size 4 --data coco128.yaml --cfg yolov5s.yaml
Using CUDA device0 _CudaDeviceProperties(name='GeForce GTX 1650', total_memory=3910MB)
Namespace(adam=False, batch_size=4, bucket='', cache_images=False, cfg='yolov5s.yaml', data='coco128.yaml', device='', epochs=1024, evolve=False, global_rank=-1, hyp='data/hyp.scratch.yaml', image_weights=False, img_size=[640, 640], local_rank=-1, logdir='runs/', multi_scale=False, name='', noautoanchor=False, nosave=False, notest=False, rect=False, resume=False, single_cls=False, sync_bn=False, total_batch_size=4, weights='yolov5s.pt', workers=8, world_size=1)
Start Tensorboard with "tensorboard --logdir runs/", view at http://localhost:6006/
Hyperparameters {'lr0': 0.01, 'lrf': 0.2, 'momentum': 0.937, 'weight_decay': 0.0005, 'giou': 0.05, 'cls': 0.5, 'cls_pw': 1.0, 'obj': 1.0, 'obj_pw': 1.0, 'iou_t': 0.2, 'anchor_t': 4.0, 'fl_gamma': 0.0, 'hsv_h': 0.015, 'hsv_s': 0.7, 'hsv_v': 0.4, 'degrees': 0.0, 'translate': 0.1, 'scale': 0.5, 'shear': 0.0, 'perspective': 0.0, 'flipud': 0.0, 'fliplr': 0.5, 'mixup': 0.0}
/home/mirko/.local/lib/python3.6/site-packages/torch/serialization.py:649: SourceChangeWarning: source code of class 'models.yolo.Model' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/mirko/.local/lib/python3.6/site-packages/torch/serialization.py:649: SourceChangeWarning: source code of class 'torch.nn.modules.container.Sequential' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/mirko/.local/lib/python3.6/site-packages/torch/serialization.py:649: SourceChangeWarning: source code of class 'models.common.Focus' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/mirko/.local/lib/python3.6/site-packages/torch/serialization.py:649: SourceChangeWarning: source code of class 'models.common.Conv' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/mirko/.local/lib/python3.6/site-packages/torch/serialization.py:649: SourceChangeWarning: source code of class 'torch.nn.modules.conv.Conv2d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/mirko/.local/lib/python3.6/site-packages/torch/serialization.py:649: SourceChangeWarning: source code of class 'torch.nn.modules.batchnorm.BatchNorm2d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/mirko/.local/lib/python3.6/site-packages/torch/serialization.py:649: SourceChangeWarning: source code of class 'torch.nn.modules.activation.LeakyReLU' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/mirko/.local/lib/python3.6/site-packages/torch/serialization.py:649: SourceChangeWarning: source code of class 'models.common.BottleneckCSP' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/mirko/.local/lib/python3.6/site-packages/torch/serialization.py:649: SourceChangeWarning: source code of class 'torch.nn.modules.container.ModuleList' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/mirko/.local/lib/python3.6/site-packages/torch/serialization.py:649: SourceChangeWarning: source code of class 'torch.nn.modules.pooling.MaxPool2d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/mirko/.local/lib/python3.6/site-packages/torch/serialization.py:649: SourceChangeWarning: source code of class 'torch.nn.modules.upsampling.Upsample' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
/home/mirko/.local/lib/python3.6/site-packages/torch/serialization.py:649: SourceChangeWarning: source code of class 'models.yolo.Detect' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
warnings.warn(msg, SourceChangeWarning)
from n params module arguments
0 -1 1 3520 models.common.Focus [3, 32, 3]
1 -1 1 18560 models.common.Conv [32, 64, 3, 2]
2 -1 1 19904 models.common.BottleneckCSP [64, 64, 1]
3 -1 1 73984 models.common.Conv [64, 128, 3, 2]
4 -1 1 161152 models.common.BottleneckCSP [128, 128, 3]
5 -1 1 295424 models.common.Conv [128, 256, 3, 2]
6 -1 1 641792 models.common.BottleneckCSP [256, 256, 3]
7 -1 1 1180672 models.common.Conv [256, 512, 3, 2]
8 -1 1 656896 models.common.SPP [512, 512, [5, 9, 13]]
9 -1 1 1248768 models.common.BottleneckCSP [512, 512, 1, False]
10 -1 1 131584 models.common.Conv [512, 256, 1, 1]
11 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
12 [-1, 6] 1 0 models.common.Concat [1]
13 -1 1 378624 models.common.BottleneckCSP [512, 256, 1, False]
14 -1 1 33024 models.common.Conv [256, 128, 1, 1]
15 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
16 [-1, 4] 1 0 models.common.Concat [1]
17 -1 1 95104 models.common.BottleneckCSP [256, 128, 1, False]
18 -1 1 147712 models.common.Conv [128, 128, 3, 2]
19 [-1, 14] 1 0 models.common.Concat [1]
20 -1 1 313088 models.common.BottleneckCSP [256, 256, 1, False]
21 -1 1 590336 models.common.Conv [256, 256, 3, 2]
22 [-1, 10] 1 0 models.common.Concat [1]
23 -1 1 1248768 models.common.BottleneckCSP [512, 512, 1, False]
24 [17, 20, 23] 1 18879 models.yolo.Detect [2, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
Model Summary: 191 layers, 7.25779e+06 parameters, 7.25779e+06 gradients
Traceback (most recent call last):
File "train.py", line 456, in <module>
train(hyp, opt, device, tb_writer)
File "train.py", line 75, in train
state_dict = ckpt['model'].float().state_dict() # to FP32
File "/home/mirko/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 900, in state_dict
module.state_dict(destination, prefix + name + '.', keep_vars=keep_vars)
File "/home/mirko/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 900, in state_dict
module.state_dict(destination, prefix + name + '.', keep_vars=keep_vars)
File "/home/mirko/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 900, in state_dict
module.state_dict(destination, prefix + name + '.', keep_vars=keep_vars)
[Previous line repeated 1 more time]
File "/home/mirko/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 897, in state_dict
self._save_to_state_dict(destination, prefix, keep_vars)
File "/home/mirko/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 860, in _save_to_state_dict
if buf is not None and name not in self._non_persistent_buffers_set:
File "/home/mirko/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 772, in __getattr__
type(self).__name__, name))
torch.nn.modules.module.ModuleAttributeError: 'BatchNorm2d' object has no attribute '_non_persistent_buffers_set'
OK, got it: if you do not want to use pretrained weights you have to pass --weights '', otherwise it will load yolov5s.pt.
HTH
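The behaviour is easy to reproduce with a stripped-down argument parser. This is only a sketch: the default value 'yolov5s.pt' is taken from the Namespace printed in the log above, while the parse_opt helper and the reduced set of options are assumptions for the example, not the real train.py.

```python
import argparse


def parse_opt(argv):
    """Parse a reduced, illustrative subset of training options."""
    parser = argparse.ArgumentParser()
    # Omitting --weights falls back to the pretrained file, as seen in the log.
    parser.add_argument("--weights", type=str, default="yolov5s.pt")
    parser.add_argument("--cfg", type=str, default="")
    return parser.parse_args(argv)


# --weights omitted: the default pretrained checkpoint is still selected.
implicit = parse_opt(["--cfg", "yolov5s.yaml"])

# --weights '' on the command line arrives as an empty string: train from scratch.
scratch = parse_opt(["--cfg", "yolov5s.yaml", "--weights", ""])
```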
I have this issue when using pytorch 1.7 built from source. Any ideas on how to work around it if I need it to work with pytorch 1.7?
I'm using Python 3.7 on 32-bit ARM.
@bhaktatejas922 this will occur when using older models, i.e. trained with the v2.0 YOLOv5 release or torch<1.6. Train and export with the latest master:
git clone https://github.com/ultralytics/yolov5
I have the same issue. I'm using the recommended environment and the latest version, but it still occurs.
Running the command below to train gives the same issue:
python train.py --data data/smoke.yaml --cfg models/yolov5s.yaml --weights weights/yolov5s.pt --batch-size 16 --epochs 100
Running the command below without specifying pretrained weights downloads the pretrained weights automatically, and it works:
python train.py --data data/smoke.yaml --cfg models/yolov5s.yaml --batch-size 16 --epochs 100
I think it's not a PyTorch bug; rather, the pretrained weights are not compatible with the training code.
Anyway, problem sorted, I can run the training.
Thanks
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
pip install torch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 -i https://pypi.tuna.tsinghua.edu.cn/simple --default-timeout=1000 problems gone!
Bug After Install NVIDIA APEX
🐛 Bug
After I installed NVIDIA APEX I got this error:
Traceback (most recent call last):
File "train.py", line 397, in <module>
train(hyp)
File "train.py", line 116, in train
{k: v for k, v in ckpt['model'].state_dict().items() if model.state_dict()[k].numel() == v.numel()}
File "/home/hitham/anaconda3/envs/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 735, in state_dict
module.state_dict(destination, prefix + name + '.', keep_vars=keep_vars)
File "/home/hitham/anaconda3/envs/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 735, in state_dict
module.state_dict(destination, prefix + name + '.', keep_vars=keep_vars)
File "/home/hitham/anaconda3/envs/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 735, in state_dict
module.state_dict(destination, prefix + name + '.', keep_vars=keep_vars)
[Previous line repeated 1 more time]
File "/home/hitham/anaconda3/envs/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 732, in state_dict
self._save_to_state_dict(destination, prefix, keep_vars)
File "/home/hitham/anaconda3/envs/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 709, in _save_to_state_dict
if buf is not None and name not in self._non_persistent_buffers_set:
File "/home/hitham/anaconda3/envs/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 621, in __getattr__
type(self).__name__, name))
torch.nn.modules.module.ModuleAttributeError: 'BatchNorm2d' object has no attribute '_non_persistent_buffers_set'
Before I installed APEX the code was running smoothly. I'm using pytorch 1.6.0.dev20200611; I've also tried 1.4 and 1.5, but they don't work at all.
Environment