MahBadran93 opened this issue 3 years ago
Hi MahBadran93,
I have a question. Does Mask R-CNN not adjust its weights and learning based on the validation dataset after each epoch? I have a dataset divided into train, val, and test, and both train and val are supplied for training. Yet if I run the model on the validation dataset, the results are quite poor, let alone on the test dataset. Does this mean the validation dataset is not used for training, and is just there for us to check our val score while training is going on?
The validation set is used to validate training.
After each step, the current model is tested on the validation set. This test determines whether the last training step improved the model or not. So the validation set is not explicitly used to train the model, but it is used during training, if that makes sense.
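Roughly, assuming the Matterport Mask_RCNN implementation (variable names below are placeholders), both sets are passed to `model.train()`, but `dataset_val` only feeds the `val_loss` metric computed after each epoch; the gradients come from `dataset_train` alone:

```python
import mrcnn.model as modellib

# Sketch: `config`, `dataset_train` and `dataset_val` are assumed to be prepared already.
model = modellib.MaskRCNN(mode="training", config=config, model_dir="./logs")
model.train(dataset_train, dataset_val,        # dataset_val is only evaluated, never fit on
            learning_rate=config.LEARNING_RATE,
            epochs=40,
            layers='heads')
```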
@TimNagle-McNaughton, thank you for your reply. So if my validation score is not improving, does the training model learn that and adjust its weights? That would mean it learns from both the train and val datasets, and if that is so, the resultant model should not perform that poorly on the val dataset. Am I understanding this correctly? My validation score doesn't improve after 40 epochs, and the trained model is unable to segment most of the objects in the validation/test datasets. Any ideas on how to improve training?
I tried something. I wanted to retrain all layers of the backbone network on my custom dataset, for which I set TRAIN_BN = True in config.py. Am I correct here? Will this mean no layer is frozen while training?
So if my validation score is not improving, does the training model learn that and adjust its weights?
Broadly, yes.
the resultant model should not perform that poorly on val dataset
Correct.
For which I set TRAIN_BN = True
I'm not familiar with that flag, sorry.
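For what it's worth, in the Matterport implementation `TRAIN_BN` only controls whether the batch-normalization layers are trainable (it is usually left `False` because the effective batch size is tiny); it does not unfreeze the backbone. Which layers are frozen is chosen by the `layers` argument of `model.train()`, so retraining everything looks roughly like this sketch, continuing the same placeholder setup as above:

```python
# Fine-tune all layers instead of only the heads; 'heads', '3+', '4+', '5+'
# or a layer-name regex are the other accepted values for `layers`.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE / 10,  # often lowered when unfreezing everything
            epochs=40,
            layers='all')
```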
I guess my trained model is not efficient then, because it is in fact performing poorly on the val set. Thanks anyway @TimNagle-McNaughton.
- It seems pretty obvious to me that your model is overfitting immediately: your validation loss is almost double your training loss right away. I suspect the learning rate is too high and would try reducing it (see the sketch after this list).
I recommend this blog.
- mAP will vary based on your confidence threshold and IoU. Try reducing the threshold and visualizing some results to see if that's better.
- Your validation loss is varying wildly because your validation set is likely not representative of the whole dataset. I would recommend shuffling/resampling the validation set, or using a larger validation fraction.
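A minimal sketch of the learning-rate and validation-split suggestions, assuming the Matterport `Config` class; `all_image_ids` and the class name are placeholders:

```python
import random
from mrcnn.config import Config

class TunedConfig(Config):
    NAME = "my_dataset"                 # placeholder
    LEARNING_RATE = 0.0001              # lower than the 0.001 default
    DETECTION_MIN_CONFIDENCE = 0.5      # looser threshold when visualizing / computing mAP

# Re-draw the validation split as a random sample of the whole dataset,
# and make it a bit larger so val_loss is less noisy.
random.seed(42)
image_ids = list(all_image_ids)         # placeholder: ids of every annotated image
random.shuffle(image_ids)
n_val = int(0.2 * len(image_ids))
val_ids, train_ids = image_ids[:n_val], image_ids[n_val:]
```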
Thank you @TimNagle-McNaughton for your answer.
Hi MahBadran93,
- It shows some sort of overfitting, because if you draw a line of best fit through the val loss it goes down and then back up, while your train loss keeps going down.
- It also shows signs that the training dataset may not be representative enough, so the model didn't learn enough to perform the task. Make sure that you feed the right images to your model (see the sketch after this list for a quick check).
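A quick way to do that sanity check, assuming a loaded `mrcnn` `Dataset` object named `dataset_train` (the name is a placeholder):

```python
import random
from mrcnn import visualize

# Spot-check that images and their masks actually line up before training.
image_id = random.choice(list(dataset_train.image_ids))
image = dataset_train.load_image(image_id)
mask, class_ids = dataset_train.load_mask(image_id)
visualize.display_top_masks(image, mask, class_ids, dataset_train.class_names)
```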
You are right, the dataset was not representative enough and that was the main issue.
Hello, I am facing this same problem. Based on the previous answers I have adjusted my data split: I used 80-20 (the original split) and also tried 90-10 and 70-30, but I get the same result: epoch_loss looks great, but validation_loss keeps fluctuating. I am only training heads, and no matter the number of epochs, it fluctuates.
Reading elsewhere suggested that a possible cause could be that my model is too complex, but I don't think that argument fits here.
This is the dataset I am using: https://github.com/dsmlr/Car-Parts-Segmentation/
I'd appreciate any advice on where to continue looking. My configuration is below.
| Configuration | Value |
| --- | --- |
| BACKBONE | resnet101 |
| BACKBONE_STRIDES | [4, 8, 16, 32, 64] |
| BATCH_SIZE | 1 |
| BBOX_STD_DEV | [0.1 0.1 0.2 0.2] |
| COMPUTE_BACKBONE_SHAPE | None |
| DETECTION_MAX_INSTANCES | 35 |
| DETECTION_MIN_CONFIDENCE | 0.7 |
| DETECTION_NMS_THRESHOLD | 0.3 |
| FPN_CLASSIF_FC_LAYERS_SIZE | 1024 |
| GPU_COUNT | 1 |
| GRADIENT_CLIP_NORM | 5.0 |
| IMAGES_PER_GPU | 1 |
| IMAGE_CHANNEL_COUNT | 3 |
| IMAGE_MAX_DIM | 512 |
| IMAGE_META_SIZE | 32 |
| IMAGE_MIN_DIM | 512 |
| IMAGE_MIN_SCALE | 0 |
| IMAGE_RESIZE_MODE | square |
| IMAGE_SHAPE | [512 512 3] |
| LEARNING_MOMENTUM | 0.9 |
| LEARNING_RATE | 0.001 |
| LOSS_WEIGHTS | {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0} |
| MASK_POOL_SIZE | 14 |
| MASK_SHAPE | [28, 28] |
| MAX_GT_INSTANCES | 100 |
| MEAN_PIXEL | [123.7 116.8 103.9] |
| MINI_MASK_SHAPE | (56, 56) |
| NAME | car_parts |
| NUM_CLASSES | 20 |
| POOL_SIZE | 7 |
| POST_NMS_ROIS_INFERENCE | 1000 |
| POST_NMS_ROIS_TRAINING | 2000 |
| PRE_NMS_LIMIT | 6000 |
| ROI_POSITIVE_RATIO | 0.33 |
| RPN_ANCHOR_RATIOS | [0.5, 1, 2] |
| RPN_ANCHOR_SCALES | (32, 64, 128, 256, 512) |
| RPN_ANCHOR_STRIDE | 1 |
| RPN_BBOX_STD_DEV | [0.1 0.1 0.2 0.2] |
| RPN_NMS_THRESHOLD | 0.7 |
| RPN_TRAIN_ANCHORS_PER_IMAGE | 256 |
| STEPS_PER_EPOCH | 500 |
| TOP_DOWN_PYRAMID_SIZE | 256 |
| TRAIN_BN | False |
| TRAIN_ROIS_PER_IMAGE | 200 |
| USE_MINI_MASK | False |
| USE_RPN_ROIS | True |
| VALIDATION_STEPS | 100 |
| WEIGHT_DECAY | 0.0001 |
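One detail worth checking in this configuration: with BATCH_SIZE = 1, STEPS_PER_EPOCH = 500 and VALIDATION_STEPS = 100 fix how many samples each training epoch and each validation pass see, independent of how many images the splits actually contain, and a small validation pass makes val_loss noisier. A common way to tie them to the real split sizes is, for example:

```python
# Sketch: dataset_train / dataset_val are loaded mrcnn Dataset objects (placeholder names).
config.STEPS_PER_EPOCH = max(1, len(dataset_train.image_ids) // config.BATCH_SIZE)
config.VALIDATION_STEPS = max(1, len(dataset_val.image_ids) // config.BATCH_SIZE)
```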
UPDATE: It was fluctuating because my dataset already has a background annotation. When creating my custom Dataset, this created two background classes, which caused problems during training. Now my training is not fluctuating any more.
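In case it helps anyone hitting the same thing: in the Matterport `utils.Dataset`, class 0 (`BG`) is already registered by the base class, so a "background" category coming from the annotation file should be skipped rather than added again. A rough sketch (the dataset class, source name, and category format are placeholders):

```python
from mrcnn import utils

class CarPartsDataset(utils.Dataset):
    def load_classes(self, categories):
        # utils.Dataset already starts with {"id": 0, "name": "BG"},
        # so only register the real foreground categories here.
        for cat in categories:
            if cat["name"].lower() in ("background", "bg"):
                continue                      # avoid creating a second background class
            self.add_class("car_parts", cat["id"], cat["name"])
```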
I got these results [images attached for Network 1 and Network 2]. My dataset has an imbalance problem, but is that the only reason, or is there something else going on?
Hello, I'm facing this problem too. Can you tell me how to solve it? Thanks!
I couldn't come to any conclusion.
You need to solve the data imbalance problem. It can be the main reason for the bad results. You want to make sure that you have an equal distribution for each class across train, val and test. You can try augmentation.
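A tiny sketch of what checking the distribution per split can look like; `annotations_by_split` is a placeholder for however your annotations are stored:

```python
from collections import Counter

def class_distribution(annotations_by_split):
    # annotations_by_split: {"train": [...], "val": [...], "test": [...]},
    # where each annotation dict carries a "class_id".
    for split, anns in annotations_by_split.items():
        counts = Counter(a["class_id"] for a in anns)
        print(split, dict(sorted(counts.items())))
```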
I tried data augmentation, but a pretrained AlexNet still skipped some classes in the classification report and the accuracy is very low. For the MNIST dataset it gave 98%, but for my ECG dataset it was 48%, and the classification report shows precision/recall of 0 for a few classes.
Hi guys, I'm facing the same issue. Here is my advice.
I hope it was helpful.
Hi all,
I am using this maskrcnn library to do detection and segmentation. I have this class distribution (key: class_id, value: number of occurrences; the first class, with key 0, is the background): Class_Occurrences = { 0: 189, 1: 22, 2: 1, 3: 40, 4: 28, 5: 85, 6: 40, 7: 63, 8: 42, 9: 5 }
The dataset contains 189 training images and 53 validation images.
augmentation = iaa.SomeOf((0, 3), [ iaa.Fliplr(0.5), iaa.Flipud(0.5), iaa.OneOf([iaa.Affine(rotate=90), iaa.Affine(rotate=180), iaa.Affine(rotate=270)]), iaa.Multiply((0.8, 1.5)), iaa.GaussianBlur(sigma=(0.0, 5.0)) ])
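The same imgaug pipeline, spelled out with its import for readability (behaviour unchanged):

```python
import imgaug.augmenters as iaa

# Apply between 0 and 3 of these augmenters to each image.
augmentation = iaa.SomeOf((0, 3), [
    iaa.Fliplr(0.5),
    iaa.Flipud(0.5),
    iaa.OneOf([iaa.Affine(rotate=90),
               iaa.Affine(rotate=180),
               iaa.Affine(rotate=270)]),
    iaa.Multiply((0.8, 1.5)),
    iaa.GaussianBlur(sigma=(0.0, 5.0)),
])
```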
My questions are: why is the mAP so low? What can I do to increase the performance? And why is the training loss decreasing while the validation loss is not (it keeps fluctuating)? I tried to add class_weight to work around the data imbalance, but I always get this error: `Unknown entries in class_weight dictionary: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]. Only expected following keys: []`
Model Configuration: