Hello @BernhardGlueck, thanks for using the repo and giving your feedback!
> Secondly, the code assumes that category_ids are continuous from 0 to x ... Our category ids come from a licensed SaaS tool for image labeling, and as such we have to remap them after exporting from there, as we cannot change the SaaS provider's code.
I did not get this part. This repo automatically creates a category mapping by reading the categories from the coco dataset file (.json). Are you telling me that your coco file does not contain the categories correctly? If that's the case, I can share my coco category editing script.
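For reference, a minimal sketch of what reading the categories from a COCO json and building such a mapping might look like (the path is illustrative, and this is not the repo's actual code):

```python
import json

# Illustration only -- not the repo's actual implementation. It sketches the
# idea of building a category mapping directly from the COCO json, assuming
# the standard COCO layout with a top-level "categories" list.
with open("datasets/field/index2.json") as f:
    coco = json.load(f)

# Map each category id found in the file to a contiguous label index.
category_mapping = {
    cat["id"]: idx for idx, cat in enumerate(coco["categories"], start=1)
}
print(category_mapping)
```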
I removed the small polygons for now, which seems to work fine. However, now during evaluation I encounter this error:
```
Traceback (most recent call last):
  File "/home/bglueck/Work/Python/augmented-maskrcnn/train.py", line 238, in <module>
    train(config)
  File "/home/bglueck/Work/Python/augmented-maskrcnn/train.py", line 196, in train
    writer=writer,
  File "/home/bglueck/Work/Python/augmented-maskrcnn/core/engine.py", line 159, in train_one_epoch
    mode="train",
  File "/home/bglueck/Work/Python/augmented-maskrcnn/core/engine.py", line 301, in _calculate_coco_ap
    _log_coco_results(writer, mode, category, coco_evaluator, iter_num)
  File "/home/bglueck/Work/Python/augmented-maskrcnn/core/engine.py", line 218, in _log_coco_results
    coco_evaluator.accumulate()
  File "/home/bglueck/Work/Python/augmented-maskrcnn/core/coco_eval.py", line 55, in accumulate
    coco_eval.accumulate()
  File "/home/bglueck/anaconda3/envs/augmented-maskrcnn/lib/python3.6/site-packages/pycocotools/cocoeval.py", line 358, in accumulate
    E = [self.evalImgs[Nk + Na + i] for i in i_list]
  File "/home/bglueck/anaconda3/envs/augmented-maskrcnn/lib/python3.6/site-packages/pycocotools/cocoeval.py", line 358, in <listcomp>
    E = [self.evalImgs[Nk + Na + i] for i in i_list]
IndexError: list index out of range
```
That error seems to be related to the pycocotools package. Can you give details on your working environment? Is it Linux/Python 3.6 with a conda environment? How did you install your environment? Currently there are two ways of installing dependencies in this repo.
Thank you for your help.
I installed via the environment.yml file. Since it fails with an index out of range error within pycocotools, I assume pycocotools is installed correctly (it is, I checked).
On the categories front:
In our source data we have category ids like "33720", "33712", etc. They are correctly referenced.
If I try to train with those, torch crashes with an index out of range error. If I remap the categories in the source file (I wrote a preprocessing script now) to the range 1-x, so that the above become "1", "2", then that crash does not happen anymore.
Yes, this repo assumes that image, category and annotation ids start from 1 in the coco dataset json. If that is not the case, a proper coco file should be created from your dataset.
So after your preprocessing script, are there any remaining errors, or is everything fixed?
Yes, that's what I've done now (as well as getting rid of too-small annotations, which seems to be related to rasterization of those polygons; e.g. if I increase the resolution used to evaluate the segmentation polygons, the number of zero-area bounding boxes decreases).
Back to the topic: yes, this error persists after all those changes:

```
Traceback (most recent call last):
  File "/home/bglueck/Work/Python/augmented-maskrcnn/train.py", line 238, in <module>
    train(config)
  File "/home/bglueck/Work/Python/augmented-maskrcnn/train.py", line 196, in train
    writer=writer,
  File "/home/bglueck/Work/Python/augmented-maskrcnn/core/engine.py", line 159, in train_one_epoch
    mode="train",
  File "/home/bglueck/Work/Python/augmented-maskrcnn/core/engine.py", line 301, in _calculate_coco_ap
    _log_coco_results(writer, mode, category, coco_evaluator, iter_num)
  File "/home/bglueck/Work/Python/augmented-maskrcnn/core/engine.py", line 218, in _log_coco_results
    coco_evaluator.accumulate()
  File "/home/bglueck/Work/Python/augmented-maskrcnn/core/coco_eval.py", line 55, in accumulate
    coco_eval.accumulate()
  File "/home/bglueck/anaconda3/envs/augmented-maskrcnn/lib/python3.6/site-packages/pycocotools/cocoeval.py", line 358, in accumulate
    E = [self.evalImgs[Nk + Na + i] for i in i_list]
  File "/home/bglueck/anaconda3/envs/augmented-maskrcnn/lib/python3.6/site-packages/pycocotools/cocoeval.py", line 358, in <listcomp>
    E = [self.evalImgs[Nk + Na + i] for i in i_list]
IndexError: list index out of range
```
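For reference, a minimal sketch of the kind of small-annotation filter mentioned above (the file names and the minimum box size are illustrative, not values from the repo or the reporter's script):

```python
import json

# Drop annotations whose bounding box has collapsed to (near) zero size,
# e.g. due to polygon rasterization imprecision. Paths and threshold are
# illustrative.
MIN_SIDE = 1.0

with open("index.json") as f:
    coco = json.load(f)

kept = []
for ann in coco["annotations"]:
    x, y, w, h = ann["bbox"]  # COCO bbox format: [x, y, width, height]
    if w >= MIN_SIDE and h >= MIN_SIDE:
        kept.append(ann)

coco["annotations"] = kept
with open("index_filtered.json", "w") as f:
    json.dump(coco, f)
```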
Are you providing separate paths for COCO_PATH and COCO_PATH_VAL in the config file, or are you giving an empty string for the COCO_PATH_VAL variable and letting the training script handle the train-val split?
The same, I let the training script handle the split.
I have just tried training with a minimal dataset of 8 images and different config combinations on Linux and Python 3.6 with a conda environment, and did not receive any errors in the evaluation step. Are you sure that all image, annotation and category ids are properly arranged in your coco dataset file?
The error message you are sending is related to an inconsistency between the evalImgs and imgIds properties of the COCOeval object; you can see the related line here.
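For reference, a quick consistency check along these lines could be sketched as follows (the path is illustrative and the standard COCO json layout is assumed):

```python
import json

# Sanity-check the coco file: every annotation must reference an existing
# image and category, and the ids the repo expects should start at 1.
with open("datasets/field/index2.json") as f:
    coco = json.load(f)

image_ids = {img["id"] for img in coco["images"]}
category_ids = {cat["id"] for cat in coco["categories"]}

for ann in coco["annotations"]:
    assert ann["image_id"] in image_ids, f"annotation {ann['id']} has unknown image_id"
    assert ann["category_id"] in category_ids, f"annotation {ann['id']} has unknown category_id"

print("min image id:", min(image_ids), "min category id:", min(category_ids))
print("categories with no annotations:",
      category_ids - {ann["category_id"] for ann in coco["annotations"]})
```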
By the way, you should keep COCO_PATH_VAL as an empty string ("") if you want the script to handle the split. If you put the same path for both COCO_PATH and COCO_PATH_VAL, the full dataset would be used for both training and evaluation, which would be wrong.
That's my config.yml:
```yaml
SEED: 999
DATA_ROOT: "datasets/field"
COCO_PATH: "datasets/field/index2.json"
DATA_ROOT_VAL: ""
COCO_PATH_VAL: ""
EXPERIMENT_NAME: "exp1"
MODEL_PATH: "experiments/exp1/maskrcnn-best.pt"
OPTIMIZER_NAME: "sgd"
OPTIMIZER_WEIGHT_DECAY: 0.0005
OPTIMIZER_MOMENTUM: 0.9
OPTIMIZER_BETAS: [0.9, 0.999]
OPTIMIZER_EPS: 0.00000001
OPTIMIZER_AMSGRAD: False
OPTIMIZER_ADABOUND_GAMMA: 0.001
OPTIMIZER_ADABOUND_FINAL_LR: 0.1
LEARNING_RATE: 0.0001
LEARNING_RATE_STEP_SIZE: 3
LEARNING_RATE_GAMMA: 0.1
TRAINABLE_BACKBONE_LAYERS: 3
RPN_ANCHOR_SIZES: [32, 64, 128, 256, 512]
RPN_ANCHOR_ASPECT_RATIOS: [0.5, 1.0, 2.0]
LOG_FREQ: 40
TRAIN_SPLIT_RATE: 0.8
BATCH_SIZE: 1
NUM_EPOCH: 10
DEVICE: "cuda:0"
NUM_WORKERS: 0
```
All seems to be okay with this config. Are you sure about the validity of the image, annotation and category ids present in the "datasets/field/index2.json" file?
I can give you the dataset (it's just a toy dataset, I don't expect any good results from it), and it's just 26 MB... So far I think everything should be in order with the dataset, but I am pretty new to Mask R-CNN.
index.json = the original output from our labeling tool
index2.json = the preprocessed output from my script (categories remapped, small polygons removed)
Thanks a lot for providing a sample dataset. I can now recreate the error, but I haven't found the reason for it yet.
I will update when I find the reason for the error.
I have spotted the issue: since your dataset doesn't contain annotations for all the categories, it fails when doing the category-based COCO AP calculation. I will add a fix so that when not all categories are present, it will only calculate and log the overall COCO AP instead of the category-based ones. That should fix your error.
Oh! Thank you, in the meantime I will add that to my preprocessor. So removing any categories from the categories array in the coco file that are NOT referenced by any annotations should fix it, am I correct?
Yes, but you also need to remap the remaining categories. For instance, suppose you have categories 1, 2, 3, 4 in the coco file but none of the annotations belong to 2 or 3. Then you need to remove those two categories and remap all annotations that belong to category 4 to category 2, so that only category ids 1 and 2 are present in the coco file.
I will try my best to push the fixed version as soon as possible. Thank you for providing the minimal sample to recreate the bug!
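For reference, a minimal sketch of the remapping described above (file names are illustrative; this is not the repo's code or the reporter's actual preprocessor):

```python
import json

with open("index.json") as f:
    coco = json.load(f)

# Category ids that are actually referenced by at least one annotation.
used = {ann["category_id"] for ann in coco["annotations"]}

# Keep only referenced categories and assign them contiguous ids starting at 1,
# e.g. categories 1, 2, 3, 4 with 2 and 3 unused become 1 -> 1 and 4 -> 2.
kept = sorted(
    (cat for cat in coco["categories"] if cat["id"] in used),
    key=lambda cat: cat["id"],
)
id_map = {cat["id"]: new_id for new_id, cat in enumerate(kept, start=1)}

for cat in kept:
    cat["id"] = id_map[cat["id"]]
for ann in coco["annotations"]:
    ann["category_id"] = id_map[ann["category_id"]]

coco["categories"] = kept
with open("index2.json", "w") as f:
    json.dump(coco, f)
```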
I have implemented it in my preprocessor already, and it works now :-) Now it's just about getting the data/quality up for me.
Glad to hear that! With the latest commits all should be fixed now. I have also added an option in config.yml to select between category-based and overall COCO AP calculation.
Closing this issue. If you face any other bug, feel free to open a new issue :)
Hi! First, great implementation, much better than most out there when it comes to code clarity.
I am running into some small issues: when dataset entries are handed off to albumentations, albumentations throws a ValueError because the bbox of an annotation is too small. After checking my dataset, it seems to me that this comes from numerical imprecision when rasterizing the mask polygons and calculating the bboxes based on them... Any advice?
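One related knob worth noting (a sketch, not the repo's actual pipeline): albumentations can drop boxes that become too small via BboxParams. This filters boxes during augmentation and does not replace cleaning genuinely invalid boxes in the source data; the transforms and thresholds below are illustrative.

```python
import albumentations as A

# Illustrative augmentation pipeline: BboxParams drops degenerate boxes,
# which helps with near-zero-area boxes produced by polygon rasterization.
transform = A.Compose(
    [
        A.HorizontalFlip(p=0.5),
        A.RandomBrightnessContrast(p=0.3),
    ],
    bbox_params=A.BboxParams(
        format="coco",        # [x, y, width, height]
        min_area=1.0,         # drop boxes whose area falls below 1 px^2
        min_visibility=0.1,   # drop boxes mostly removed by the transform
        label_fields=["category_ids"],
    ),
)
```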
Secondly, the code assumes that category_ids are continuous from 0 to x... I can work around that by remapping them, but there are a lot of places in the codebase where this is used, and so far I have not managed to find one central place to do the remapping so that it gets picked up by all other parts of the codebase. Our category ids come from a licensed SaaS tool for image labeling, and as such we have to remap them after exporting from there, as we cannot change the SaaS provider's code.