Open LukeAI opened 3 years ago
Both .pt and .weights files can be used as pretrained weights.
Change the check pretrained = weights.endswith('.pt') to pretrained = weights.endswith('.pt') or weights.endswith('.weights'), and add code that loads the .weights file as pretrained weights.
for your reference: https://github.com/WongKinYiu/ScaledYOLOv4/blob/yolov4-csp/detect.py#L44-L48
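A minimal sketch of the extension check suggested above, in case it helps. The helper names weight_format and is_pretrained are hypothetical and not part of the repository. One thing worth noting: Darknet partial files such as yolov4-csp.conv.142 end in neither .pt nor .weights, so this sketch also treats a .conv.N suffix as Darknet format, which is an assumption on my side.

```python
import re
from pathlib import Path

# Hypothetical helpers, not part of the repository.
# Assumption: Darknet partial weights are named like '<model>.conv.<N>'.
_PARTIAL_DARKNET = re.compile(r"\.conv\.\d+$")


def weight_format(weights: str) -> str:
    """Return 'pt', 'darknet', or 'none' based on the file name only."""
    name = Path(weights).name
    if name.endswith(".pt"):
        return "pt"
    if name.endswith(".weights") or _PARTIAL_DARKNET.search(name):
        return "darknet"
    return "none"


def is_pretrained(weights: str) -> bool:
    """True when the path looks like a loadable pretrained file."""
    return weight_format(weights) != "none"
```

With a check like this, train.py could pick the PyTorch loader for 'pt' and a Darknet loader for 'darknet', instead of silently training from scratch when the suffix is not .pt.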
Thanks for your response! If I add code to make it work with .weights as well as .pt, would you accept a PR?
Here I'd like to add the right cutoff for yolov4-csp.conv.142. I guess it should be 143?
Would a good generic rule be to set cutoff=X+1 for weights.conv.X and cutoff=X for weights-tiny.conv.X?
def load_darknet_weights(self, weights, cutoff=-1):
    # Parses and loads the weights stored in 'weights'
    # Establish cutoffs (load layers between 0 and cutoff; if cutoff = -1, all are loaded)
    file = Path(weights).name
    if file == 'darknet53.conv.74':
        cutoff = 75
    elif file == 'yolov3-tiny.conv.15':
        cutoff = 15
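The generic rule asked about above (cutoff=X+1 for weights.conv.X, cutoff=X for weights-tiny.conv.X) could be sketched like this. It is only the rule as proposed in this thread and has not been confirmed as correct for every model; it does reproduce the two hard-coded cases (75 and 15). The function name infer_cutoff is hypothetical.

```python
import re
from pathlib import Path


def infer_cutoff(weights: str) -> int:
    """Guess the layer cutoff from a Darknet partial-weights file name.

    Unconfirmed rule from this thread: '<model>.conv.X' -> X + 1,
    '<model>-tiny.conv.X' -> X. Returns -1 (load all layers) when the
    name does not end in '.conv.<number>'.
    """
    name = Path(weights).name
    match = re.search(r"\.conv\.(\d+)$", name)
    if match is None:
        return -1
    x = int(match.group(1))
    # Assumption: any file name containing 'tiny' follows the tiny rule.
    return x if "tiny" in name else x + 1
```

For yolov4-csp.conv.142 this gives 143, which matches the guess above.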
@LukeAI Hey, are you using .pt or .weights for the pretrained weights?
I actually ended up using scaled-yolo
@LukeAI But did you use pretrained weights? If so, which format: .weights or .pt?
@WongKinYiu
I'm trying to work out how I can train yolov4-csp on a custom dataset starting with coco pretrained convolutional layers.
1) Does training yolov4-csp with this repository require PyTorch-format .pt weights?
2) AlexeyAB provides COCO-pretrained partial weights for transfer learning: https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-csp.conv.142 Is it possible to convert Darknet-format partial weights to .pt? If so, how?
3) If not, are pretrained weights in .pt format available?
I tried training like this:
python train.py --device 0 --batch-size 14 --data data/custom.yaml --cfg models/yolov4-csp-custom.cfg --weights 'weights/yolov4-csp.conv.142' --name yolov4-csp
but the training curve looks the same as training from scratch, and judging by https://github.com/WongKinYiu/ScaledYOLOv4/blob/4dfcec67f8b7db4893ed66000dd1b317691373a4/train.py#L60 it appears that only .pt weights can be used to initialise the model for training.