hhk7734 / tensorflow-yolov4

YOLOv4 Implemented in Tensorflow 2.
MIT License

yolov4-custom cfg is not parsed correctly using the provided parser #89

Open Fetulhak opened 3 years ago

Fetulhak commented 3 years ago

@hhk7734 @eunjins @JayantGoel001 Hi, I have a weights file trained with the yolov4-custom.cfg file, and I want to use it during model development, but the cfg parser does not parse the file correctly. Does it only work for the cfg files provided in your repo?

JayantGoel001 commented 3 years ago

> @hhk7734 @eunjins @JayantGoel001 Hi, I have a weights file trained with the yolov4-custom.cfg file, and I want to use it during model development, but the cfg parser does not parse the file correctly. Does it only work for the cfg files provided in your repo?

@Fetulhak The issue you are facing is that the original cfg also contains training parameters, but tensorflow-yolov4 doesn't support training. If you remove the training parameters from your yolov4-custom cfg, you can get it working. 🙂
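If you want to automate that cleanup, here is a minimal sketch, assuming the key set below is a fair reading of which darknet cfg keys are training-only; the helper itself is hypothetical and not part of tensorflow-yolov4:

# Hypothetical helper (not part of tensorflow-yolov4): copy a darknet cfg,
# dropping lines whose key is a training-only parameter.
TRAINING_ONLY_KEYS = {
    "subdivisions", "momentum", "decay", "angle", "saturation", "exposure",
    "hue", "learning_rate", "burn_in", "max_batches", "policy", "steps",
    "scales", "mosaic", "cutmix", "mixup", "stopbackward", "use_cuda_graph",
}

def strip_training_params(src_path, dst_path):
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            stripped = line.strip()
            if "=" in stripped and not stripped.startswith("#"):
                key = stripped.split("=", 1)[0].strip()
                if key in TRAINING_ONLY_KEYS:
                    continue  # drop training-only parameter
            dst.write(line)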

Fetulhak commented 3 years ago

@JayantGoel001 Thanks for your reply. Here is the yolov4-custom cfg file. Which layers do I have to remove? yolo-obj.txt

JayantGoel001 commented 3 years ago

> @JayantGoel001 Thanks for your reply. Here is the yolov4-custom cfg file. Which layers do I have to remove? yolo-obj.txt

Can you please tell me which YOLO model you are using?

Fetulhak commented 3 years ago

I am using YOLOv4 with the initial weights yolov4.conv.137 and a CSPDarknet53 backbone.

JayantGoel001 commented 3 years ago

> I am using YOLOv4 with the initial weights yolov4.conv.137 and a CSPDarknet53 backbone.

[net]
#use_cuda_graph = 1
# Testing
#batch=1
#subdivisions=1
# Training
batch=64
width=416
height=416
channels=3
momentum=0.949

learning_rate=0.001
burn_in=1000
max_batches = 2000
policy=steps
steps=1600,1800
scales=.1,.1

#cutmix=1
mosaic=1

#:104x104 54:52x52 85:26x26 104:13x13 for 416

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=mish

# Downsample

[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[route]
layers = -1,-7

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

# Downsample

[convolutional]
batch_normalize=1
filters=128
size=3
stride=2
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[route]
layers = -1,-10

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

# Downsample

[convolutional]
batch_normalize=1
filters=256
size=3
stride=2
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[route]
layers = -1,-28

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

# Downsample

[convolutional]
batch_normalize=1
filters=512
size=3
stride=2
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[route]
layers = -1,-28

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

# Downsample

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=2
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[route]
layers = -1,-16

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=mish

##########################

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

### SPP ###
[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

[route]
layers=-1,-3,-5,-6
### End SPP ###

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = 85

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[route]
layers = -1, -3

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=4

[route]
layers = 23

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[route]
layers = -1, -3

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

##########################

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=18
activation=linear

[yolo]
mask = 0,1,2
anchors = 40,40,  40,40, 40,40, 40,40, 40,40, 40,40, 40,40, 40,40, 40,40
classes=1
num=9
scale_x_y = 1.2
iou_thresh=0.213
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6

[route]
layers = -4

[convolutional]
batch_normalize=1
size=3
stride=4
pad=1
filters=256
activation=leaky

[route]
layers = -1, -16

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=18
activation=linear

[yolo]
mask = 3,4,5
anchors = 40,40,  40,40, 40,40, 40,40, 40,40, 40,40, 40,40, 40,40, 40,40
classes=1
num=9
scale_x_y = 1.1
iou_thresh=0.213
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6

[route]
layers = -4

[convolutional]
batch_normalize=1
size=3
stride=2
pad=1
filters=512
activation=leaky

[route]
layers = -1, -37

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=18
activation=linear

[yolo]
mask = 6,7,8
anchors = 40,40,  40,40, 40,40, 40,40, 40,40, 40,40, 40,40, 40,40, 40,40
classes=1
num=9
scale_x_y = 1.05
iou_thresh=0.213
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6

Try this one.

Fetulhak commented 3 years ago

Thank you very much. Let me load the model using the modified cfg file.


Fetulhak commented 3 years ago

@JayantGoel001 The parser works fine now, but the model is not accepting the input shape.

def make_model(self):
    K.clear_session()

    _input = Input(self.config.net.input_shape)

    self._model = YOLOv4Model(config=self.config)

    self._model(_input)  # the error is raised when executing this line; any idea?
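For context, the typical top-level flow in this repo (method names as shown in its README; the file names below are placeholders for your own) wraps make_model like this:

from yolov4.tf import YOLOv4

yolo = YOLOv4()
yolo.config.parse_names("obj.names")   # placeholder: your class-names file
yolo.config.parse_cfg("yolo-obj.cfg")  # the modified cfg from above
yolo.make_model()                      # the step that fails here
yolo.load_weights("yolo-obj_2000.weights", weights_type="yolo")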
Fetulhak commented 3 years ago

https://drive.google.com/file/d/1Jz35Sln3FPjMNH7kYzTuzv0lyiyECM53/view?usp=sharing

Link to my pre-trained weights.

JayantGoel001 commented 3 years ago

> https://drive.google.com/file/d/1Jz35Sln3FPjMNH7kYzTuzv0lyiyECM53/view?usp=sharing
>
> Link to my pre-trained weights.

@Fetulhak Please provide access to the Drive link.

Fetulhak commented 3 years ago

@JayantGoel001 Did you access it?

https://drive.google.com/file/d/1Jz35Sln3FPjMNH7kYzTuzv0lyiyECM53/view?usp=sharing

JayantGoel001 commented 3 years ago

> Did you access it?
>
> https://drive.google.com/file/d/1Jz35Sln3FPjMNH7kYzTuzv0lyiyECM53/view?usp=sharing

Yes, I just accessed it.

I think the cfg you used for training is wrong:

1. In your yolov4-custom.cfg, line 745 has stopbackward=800, which is not part of the yolov4.cfg found in AlexeyAB's repo: https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov4.cfg
2. On lines 891 and 992 you changed the stride value from 2 to 4, but the training tutorial by The AI Guy makes no mention of that: https://colab.research.google.com/drive/1_GdoqCJWXsChrOiY8sZMr_zbr_fH-0Fg The same goes for the layer on line 894.
3. All of your anchor values are equal to 40 (see the command at the end of this comment).

I recommend you follow this tutorial to create a custom cfg file: https://colab.research.google.com/drive/1_GdoqCJWXsChrOiY8sZMr_zbr_fH-0Fg

Then train the model. 🙂
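On the anchors: if nine identical 40,40 anchors were not intentional, AlexeyAB's darknet can recompute anchors for your dataset with the calc_anchors command from the darknet README (substitute your own obj.data):

./darknet detector calc_anchors data/obj.data -num_of_clusters 9 -width 416 -height 416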

Fetulhak commented 3 years ago

@JayantGoel001 Both the cfg file and my weights file work fine when I use the AlexeyAB darknet command for inference: !./darknet detector test build/darknet/x64/data/trainer.data build/darknet/x64/cfg/yolo-obj.cfg build/darknet/x64/backup/yolo-obj_2000.weights -thresh 0.075 -iou_thresh 0.3 build/darknet/x64/data/img/20170607_163916.jpg

Fetulhak commented 3 years ago

@JayantGoel001 As AlexeyAB says, we can change some of the values in order to detect small objects; that is why I changed the strides and anchor box sizes.

Fetulhak commented 3 years ago

@JayantGoel001 Is it possible for your UpsampleLayer(BaseLayer) class to accept upsampling with a stride of 4? I think the problem is that the class only accepts a stride of 2.

Fetulhak commented 3 years ago
def __repr__(self) -> str:
    rep = f"{self.index:4}  "
    rep += f"{self.type[:5]}_"
    rep += f"{self.type_index:<3}                   "
    rep += f"{self.stride:2}x    "  # <-- this is the line that prints the stride
    rep += f"{self.input_shape[0]:4} "
    rep += f"x{self.input_shape[1]:4} "
    rep += f"x{self.input_shape[2]:4} -> "
    rep += f"{self.output_shape[0]:4} "
    rep += f"x{self.output_shape[1]:4} "
    rep += f"x{self.output_shape[2]:4}  "
    return rep
JayantGoel001 commented 3 years ago

> Both the cfg file and my weights file work fine when I use the AlexeyAB darknet command for inference: !./darknet detector test build/darknet/x64/data/trainer.data build/darknet/x64/cfg/yolo-obj.cfg build/darknet/x64/backup/yolo-obj_2000.weights -thresh 0.075 -iou_thresh 0.3 build/darknet/x64/data/img/20170607_163916.jpg

Yeah, I checked that too, but I don't think changing the stride is a good option, because the darknet training tutorial makes no mention of it.

JayantGoel001 commented 3 years ago

If you want to do that, you should just clone this repo, make the changes that work for you, and use it in your project. 🙂

Fetulhak commented 3 years ago

@JayantGoel001 I have solved the issue with the modified yolov4 config file by changing the UpsampleLayer as follows. In my case, I had modified the strides of the feature-concatenation layers that follow the upsampling.

from tensorflow.keras.layers import UpSampling2D

class UpsampleLayer(UpSampling2D):
    def __init__(self, metalayer, metanet):
        # UpSampling2D defaults to size=(2, 2); use size=(4, 4) when the cfg
        # asks for an upsample stride of 4.
        if metalayer.stride == 2:
            super().__init__(interpolation="bilinear", name=metalayer.name)
        else:
            super().__init__(size=(4, 4), interpolation="bilinear", name=metalayer.name)
        self.metalayer = metalayer
        self.metanet = metanet
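For what it is worth, a slightly more general variant (my sketch, assuming metalayer.stride is always a positive integer) passes the cfg stride straight through instead of special-casing 2 and 4:

from tensorflow.keras.layers import UpSampling2D

class UpsampleLayer(UpSampling2D):
    def __init__(self, metalayer, metanet):
        # Use the cfg stride directly as the upsampling factor in both
        # dimensions; UpSampling2D's default size of (2, 2) is the stride-2 case.
        super().__init__(
            size=(metalayer.stride, metalayer.stride),
            interpolation="bilinear",
            name=metalayer.name,
        )
        self.metalayer = metalayer
        self.metanet = metanet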
Fetulhak commented 3 years ago

@JayantGoel001 I got into trouble loading the weights file:

/usr/local/lib/python3.7/dist-packages/yolov4/tf/utils/weights.py in _np_fromfile(fd, dtype, count)
     44         if len(data) == 0:
     45             return None
---> 46         raise ValueError("Model and weights file do not match.")
     47     return data
     48 

ValueError: Model and weights file do not match.
JayantGoel001 commented 3 years ago

> I got into trouble loading the weights file:
>
> ValueError: Model and weights file do not match.

Looks like the cfg file and the weights file do not match. 🤔
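A quick way to confirm a mismatch is to compare the number of float32 values in the weights file with the Keras model's parameter count. A rough sketch, assuming the 5-int header layout used by recent darknet .weights files and that the built Keras model is exposed as yolo.model:

import numpy as np

def count_weight_values(path):
    # Skip the darknet header: major, minor, revision (int32 each) plus a
    # 64-bit "seen" counter, 20 bytes in total, then count the floats.
    with open(path, "rb") as fd:
        np.fromfile(fd, dtype=np.int32, count=5)
        return int(np.fromfile(fd, dtype=np.float32).size)

# If cfg and weights agree, this should equal yolo.model.count_params()
# (yolo.model is an assumption about where the Keras model lives).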

Fetulhak commented 3 years ago

@JayantGoel001 Thanks, it was my mistake; I uploaded the wrong weights file.

Fetulhak commented 3 years ago

@JayantGoel001 Now I have run into another problem. When I use the AlexeyAB darknet detector test command and your model.inference command with the same prob_thresh=0.075, I get different prediction results. I have tried hard to figure out what the problem is, but I cannot solve it. I used my custom weights file trained with CSPDarknet. Here are my results:

Original image: 20170612_165806

predicted by AlexeyAB's repo: predictions

JayantGoel001 commented 3 years ago

> When I use the AlexeyAB darknet detector test command and your model.inference command with the same prob_thresh=0.075, I get different prediction results.

Hello @Fetulhak, I think there is some difference in accuracy between tensorflow-yolov4 and darknet yolov4. Check this out for a better understanding: https://github.com/hunglc007/tensorflow-yolov4-tflite/issues/165 🙂