Michaelzeyong closed this issue 4 years ago
Hello @Michaelzeyong, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook, Docker Image, and Google Cloud Quickstart Guide for example environments.
If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue; otherwise we cannot help you.
If this is a custom model or data training question, please note Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:
For more information please visit https://www.ultralytics.com.
Lower the batch size
Re "Lower the batch size": the error occurred in model = Model(opt.cfg, nc=nc).to(device), so the batch size is not the cause. The code attempts to create a very large conv kernel.
@Michaelzeyong it appears you may have environment problems. Please ensure you meet all dependency requirements if you are attempting to run YOLOv5 locally. If in doubt, create a new virtual Python 3.8 environment, clone the latest repo (code changes daily), and pip install -r requirements.txt again. We also highly recommend using one of our verified environments below.
Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.6. To install run:
$ pip install -r requirements.txt
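As a quick sanity check of the two requirements named above (Python 3.8 or later, torch>=1.6), something like the following could be run inside the environment. This is only a sketch: the helper names version_tuple and check_environment are made up for illustration and are not part of YOLOv5, and it checks only these two requirements, not the full requirements.txt.

```python
import sys
from importlib import metadata


def version_tuple(text):
    """Turn a version string such as '1.7.0+cu101' into (1, 7, 0).

    Parsing stops at the first component that does not start with a digit.
    """
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_environment(min_python=(3, 8), min_torch=(1, 6)):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("Python %d.%d is older than 3.8" % sys.version_info[:2])
    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        problems.append("torch is not installed")
    else:
        if version_tuple(torch_version) < min_torch:
            problems.append("torch %s is older than 1.6" % torch_version)
    return problems


if __name__ == "__main__":
    for problem in check_environment() or ["no problems found"]:
        print(problem)
```

An empty result only means these two version checks passed; a stale clone of the repo would not be detected this way.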
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are passing. These tests evaluate proper operation of basic YOLOv5 functionality, including training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
❔Question
When I run python train.py --data ./data/coco.yaml --cfg yolov5s.yaml --weights yolov5s.pt --batch-size 64, the error occurs at model = Model(opt.cfg, nc=nc).to(device): RuntimeError: [enforce fail at CPUAllocator.cpp:64] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 137438953472 bytes. Error code 12. I checked the code and found that it attempts to create an nn.Conv2d with kernel size 512*512.
I added a print statement to the following code:
class Conv(nn.Module):
    # Standard convolution
The print result is as follows:
     from  n      params  module                       arguments
kernel 12 32 32 1 1
  0    -1  1      393280  models.common.Focus          [3, 32, 32]
kernel 32 64 64 1 1
  1    -1  1     8388736  models.common.Conv           [32, 64, 64]
kernel 64 32 1 1 1
kernel 64 64 1 1 1
kernel 32 32 1 1 1
kernel 32 32 3 1 1
  2    -1  1       19904  models.common.BottleneckCSP  [64, 64, 1, 64]
kernel 64 128 128 1 1
  3    -1  1   134217984  models.common.Conv           [64, 128, 128]
kernel 128 64 1 1 1
kernel 128 128 1 1 1
kernel 64 64 1 1 1
kernel 64 64 3 1 1
kernel 64 64 1 1 1
kernel 64 64 3 1 1
kernel 64 64 1 1 1
kernel 64 64 3 1 1
  4    -1  1      161152  models.common.BottleneckCSP  [128, 128, 3, 128]
kernel 128 256 256 1 1
  5    -1  1  2147484160  models.common.Conv           [128, 256, 256]
kernel 256 128 1 1 1
kernel 256 256 1 1 1
kernel 128 128 1 1 1
kernel 128 128 3 1 1
kernel 128 128 1 1 1
kernel 128 128 3 1 1
kernel 128 128 1 1 1
kernel 128 128 3 1 1
  6    -1  1      641792  models.common.BottleneckCSP  [256, 256, 3, 256]
kernel 256 512 512 1 1
Why is the kernel size 32 or 512? That is far too large. I also found that it differs from the pretrained model yolov5s.pt.
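The numbers in the error message and in the printed table are consistent with the oversized kernels. The sketch below assumes the printed fields are input channels, output channels and kernel size, that the weights are stored as 4-byte float32 values, and that the convolution has no bias term. The helper conv2d_weight_count is made up for illustration, and the 3x3 kernel at the end is only a comparison point.

```python
BYTES_PER_FLOAT32 = 4  # assumption: weights are stored as float32


def conv2d_weight_count(c_in, c_out, kernel_size, groups=1):
    """Number of weights in a square 2D convolution without a bias term."""
    return c_out * (c_in // groups) * kernel_size * kernel_size


# (label, c_in, c_out, kernel_size) as read from the "kernel ..." print lines
layers = [
    ("layer 0 (Focus)", 12, 32, 32),
    ("layer 1 (Conv)", 32, 64, 64),
    ("layer 3 (Conv)", 64, 128, 128),
    ("layer 5 (Conv)", 128, 256, 256),
    ("next Conv (crashes)", 256, 512, 512),
]

for label, c_in, c_out, k in layers:
    weights = conv2d_weight_count(c_in, c_out, k)
    print(f"{label}: {weights} weights, {weights * BYTES_PER_FLOAT32} bytes")

# The last layer alone needs 256 * 512 * 512 * 512 * 4 bytes,
# which is exactly the size reported in the RuntimeError.
assert conv2d_weight_count(256, 512, 512) * BYTES_PER_FLOAT32 == 137438953472

# For comparison only: the same channel counts with a 3x3 kernel.
print(conv2d_weight_count(256, 512, 3) * BYTES_PER_FLOAT32, "bytes with a 3x3 kernel")
```

Under these assumptions, the params column equals the weight count plus 2 * c_out for each listed layer (for example 393216 + 64 = 393280 for layer 0), which would fit a per-channel scale and shift such as batch normalization. The allocation failure therefore comes from the kernel size being read as 512 rather than from the batch size.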
Additional context