WongKinYiu / ScaledYOLOv4

Scaled-YOLOv4: Scaling Cross Stage Partial Network
GNU General Public License v3.0

Transfer Learning #328

Closed · sard0r closed this 2 years ago

sard0r commented 2 years ago

I want to use the Scaled-YOLOv4 `yolov4-p7.pt` pretrained weights to train on my own custom data. Which layers should I freeze and which should I leave trainable to get good accuracy (over 80%)? Is there a tutorial for that?

I tried fine-tuning, but the results were very bad, so I want to try transfer learning instead.

Thank you!

WongKinYiu commented 2 years ago

Freeze all layers except layers No. 36, 40, 44, 48, 52, and 53.

sard0r commented 2 years ago

@WongKinYiu I appreciate your incredible work! Thanks.

I am a beginner in deep learning and came across your network. I know how to write code with `nn.Sequential` and `nn.functional`, but when I read your code I could not find those lines.

If possible, could you explain which file and which lines to change in order to freeze layers, and how to freeze them in code? I think it would be useful for others too, since your network is the state-of-the-art model at this time.

Thank you for the support and the work you have done!

WongKinYiu commented 2 years ago

My office has a power cut until tomorrow night; I will examine the code after tomorrow.

sard0r commented 2 years ago

@WongKinYiu Appreciate it!

LeeSungl commented 2 years ago

@WongKinYiu

I am having the same problem as @sard0r. It would be great if you could provide a detailed answer!

Thanks

WongKinYiu commented 2 years ago

Replace https://github.com/WongKinYiu/ScaledYOLOv4/blob/yolov4-large/train.py#L75-L88

    # Optimizer
    nbs = 64  # nominal batch size
    accumulate = max(round(nbs / total_batch_size), 1)  # accumulate loss before optimizing
    hyp['weight_decay'] *= total_batch_size * accumulate / nbs  # scale weight_decay

    pg0, pg1, pg2 = [], [], []  # optimizer parameter groups
    for k, v in model.named_parameters():
        v.requires_grad = True
        if '.bias' in k:
            pg2.append(v)  # biases
        elif '.weight' in k and '.bn' not in k:
            pg1.append(v)  # apply weight decay
        else:
            pg0.append(v)  # all else

with

    nofreeze = [f'model.{x}.' for x in [36, 40, 44, 48, 52, 53]]  # parameter name prefixes to keep trainable (full or partial match)
    for k, v in model.named_parameters():
        v.requires_grad = False  # freeze all layers
        if any(x in k for x in nofreeze):
            print('training %s' % k)
            v.requires_grad = True

    # Optimizer
    nbs = 64  # nominal batch size
    accumulate = max(round(nbs / total_batch_size), 1)  # accumulate loss before optimizing
    hyp['weight_decay'] *= total_batch_size * accumulate / nbs  # scale weight_decay

    pg0, pg1, pg2 = [], [], []  # optimizer parameter groups
    for k, v in model.named_parameters():
        if '.bias' in k:
            pg2.append(v)  # biases
        elif '.weight' in k and '.bn' not in k:
            pg1.append(v)  # apply weight decay
        else:
            pg0.append(v)  # all else
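
The selection rule in the patch is a simple substring match on parameter names. As a sanity check, the same logic can be run without PyTorch on a handful of illustrative parameter names (the names below are made up to mimic the `model.<index>.` naming produced by `named_parameters()` in this repo):

```python
# Which parameter names stay trainable under the prefix rule above.
# Parameter names here are illustrative, not taken from the real model.
nofreeze = [f'model.{x}.' for x in [36, 40, 44, 48, 52, 53]]

param_names = [
    'model.0.conv.weight',    # backbone layer -> should be frozen
    'model.36.conv.weight',   # listed layer   -> stays trainable
    'model.40.bn.bias',       # listed layer   -> stays trainable
    'model.53.m.0.weight',    # listed layer   -> stays trainable
]

# Mirrors: v.requires_grad = any(x in k for x in nofreeze)
requires_grad = {k: any(x in k for x in nofreeze) for k in param_names}
trainable = [k for k, v in requires_grad.items() if v]
print(trainable)
```

Note that `'model.36.'` (with trailing dot) cannot accidentally match `model.360.` or `model.3.`, which is why the prefixes include the dot.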

sard0r commented 2 years ago

@WongKinYiu Thank you very much!

Also, I am running this command to do transfer learning: `python train.py --batch 16 --weights weights/yolov4-p7.pt --data data.yaml --epochs 200 --cache --img 1536 --hyp hyp.finetune.yaml`

Here data.yaml is for my own custom dataset.

Is that correct?

WongKinYiu commented 2 years ago

I do not provide a hyp.yaml for custom datasets, because every custom dataset has different properties.

LeeSungl commented 2 years ago

@WongKinYiu I am sorry to disturb you again.

My custom data has 9 classes, and 7 of them (person, bicycle, car, motorcycle, bus, traffic light, truck) are the same as in the COCO dataset. I froze the layers as you instructed above, but the result is really low, about 35% mAP. I think that if the transfer learning were done well, I should get more than 70% mAP.

Also, I wanted to ask: are the layers you mentioned above the head of the model? In addition, I changed the learning rate several times, but the results got worse.

What do you advise for me to do?

Once again, I appreciate your wonderful work!

Thanks

WongKinYiu commented 2 years ago

How many epochs did you train? And could you show the results.txt?

LeeSungl commented 2 years ago

I trained for 100 epochs, but after 10 epochs the results flattened out. I am attaching it here: results.txt

WongKinYiu commented 2 years ago

Could you also show your train.py?

LeeSungl commented 2 years ago

@WongKinYiu python train.py --batch 16 --weights weights/yolov4-p7.pt --data data.yaml --epochs 100 --cache --img 1536 --hyp hyp.finetune.yaml

WongKinYiu commented 2 years ago

Oh, I mean your modified train.py.

LeeSungl commented 2 years ago

I see. Here is my attachment: Screenshot from 2021-09-16 04-57-25

WongKinYiu commented 2 years ago

The losses keep going down rapidly, so I guess your dataset is small (maybe fewer than a thousand images) and you need more epochs of training. The training loss is low but the AP on the validation data is not good; do you think the training data and validation data are in a similar domain? By the way, you could try training with `python train.py --batch 16 --weights weights/yolov4-p7.pt --data data.yaml --epochs 100 --cache --img 1536 --hyp hyp.finetune.yaml --noautoanchor`

LeeSungl commented 2 years ago

Yes, the training data is about 1,000 images and the validation data is about 250.

The training and validation data are almost the same.

Thank you for your suggestions.

Should I get more data?

WongKinYiu commented 2 years ago

In my experiments, changing your label ids to match the COCO label ids gives the best results. First, convert the ids of the 7 shared classes to the corresponding COCO class ids. Second, convert the ids of the 2 remaining classes to the most similar COCO class ids, excluding the 7 classes above. Finally, use the yolov4-p7.yaml of the COCO dataset to train on your dataset with `--noautoanchor`.

LeeSungl commented 2 years ago

@WongKinYiu

Thanks for the comments so far. Can you elaborate on matching the class ids? Does it mean having the same class names? If not, how can I do it?

WongKinYiu commented 2 years ago

For example, the person class id is 0 in coco.yaml, so in your label txt files, if a bounding box belongs to a person, set its class id to 0.

sard0r commented 2 years ago

@WongKinYiu

Thank you very much for all the info.

I really wanted to know why layers No. 36, 40, 44, 48, 52, and 53 are vital for transfer learning. Are they the head of the network, or is there a specific reason for that?

joeyism commented 2 years ago

@WongKinYiu

> I really wanted to know why layers No. 36, 40, 44, 48, 52, and 53 are vital for transfer learning. Are they the head of the network, or is there a specific reason for that?

Is it because they are the detection heads, as specified in yolov4-p7.yaml?

joeyism commented 2 years ago

@LeeSungl did you get it to train well? I'm at the same point you were: it flattens after 10 epochs, with mAP around 35% too.