Open xfby2016 opened 1 year ago
same question
I guess you could first organize your data in COCO format, then register your COCO JSON in this file: https://github.com/mlzxy/devit/blob/main/detectron2/data/datasets/builtin.py. Regarding the COCO JSON format, you could take https://github.com/alirezazareian/ovr-cnn/blob/master/ipynb/003.ipynb as a reference.
Then, prepare class prototypes for your classes, like https://github.com/mlzxy/devit/blob/main/demo/build_prototypes.ipynb. The background prototypes can probably be reused because background classes are similar. Regarding the RPN, you could consider reusing the LVIS RPN from RegionCLIP like I did, or retraining a COCO RPN from scratch using all 80 classes. This script https://github.com/mlzxy/devit/blob/main/scripts/train_rpn.sh can be the entry point.
Or you could use an existing class-agnostic proposal extractor, slower but far more accurate, like this paper https://arxiv.org/abs/2111.11430
Hello :wave: Great paper and impressive results! I am trying to test the model's few-shot ability on a small dataset and am trying to wrap my head around what I need to change in order to do this.
What I have done so far:
- placed my annotation JSON files in datasets/coco/annotations/ (one for train and one for val) and my images under datasets/coco/
- registered the new mini_coco splits in builtin.py as:

```python
_PREDEFINED_SPLITS_COCO["coco"] = {
    "coco_2014_train": ("coco/train2014", "coco/annotations/instances_train2014.json"),
    "coco_2014_val": ("coco/val2014", "coco/annotations/instances_val2014.json"),
    "coco_2014_minival": ("coco/val2014", "coco/annotations/instances_minival2014.json"),
    "coco_2014_minival_100": ("coco/val2014", "coco/annotations/instances_minival2014_100.json"),
    "coco_2014_valminusminival": (
        "coco/val2014",
        "coco/annotations/instances_valminusminival2014.json",
    ),
    "coco_2017_train": ("coco/train2017", "coco/annotations/instances_train2017.json"),
    "coco_2017_val": ("coco/val2017", "coco/annotations/instances_val2017.json"),
    "coco_2017_test": ("coco/test2017", "coco/annotations/image_info_test2017.json"),
    "coco_2017_test-dev": ("coco/test2017", "coco/annotations/image_info_test-dev2017.json"),
    "coco_2017_val_100": ("coco/val2017", "coco/annotations/instances_val2017_100.json"),
    "coco_mini_train": ("coco/mini_coco", "coco/annotations/mini_coco_train.json"),
    "coco_mini_val": ("coco/mini_coco", "coco/annotations/mini_coco_val.json"),
}
```
Now I am trying to use this script to generate prototypes from the box annotations in my training set: https://github.com/mlzxy/devit/blob/main/tools/extract_instance_prototypes.py, but I am not sure how to fetch the name of the dataset. Is it the same as the keys of the datasets in builtin.py? Did I miss anything?
Cheers
Hi @YELKHATTABI
After modifying builtin.py, you could set a breakpoint at https://github.com/mlzxy/devit/blob/main/detectron2/data/datasets/builtin.py#L327 and use the detectron2 dataset interface (https://detectron2.readthedocs.io/en/stable/tutorials/datasets.html#metadata-for-datasets) to inspect the metadata of your dataset, e.g., its classes. If you are not sure about the name of your dataset, the function DatasetCatalog.list() may help you.
Note that I hardcode the class names of COCO/LVIS in https://github.com/mlzxy/devit/blob/main/lib/categories.py and use them in https://github.com/mlzxy/devit/blob/main/detectron2/modeling/meta_arch/devit.py#L792 and https://github.com/mlzxy/devit/blob/main/tools/train_net.py#L95C44-L95C44 to separate base / novel classes. You may also need to add your class split to categories.py.
Regarding the prototype extraction, if extract_instance_prototypes.py doesn't work (I wrote it a long time ago), you can always just extract and crop some features and compute their mean (as in the demo Jupyter notebook).
Note that with bounding-box annotations you need a few more shots, for example at least 5-shot (which I guess is fine); at 1-shot, bounding-box performance is much lower than with instance masks.
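The crop-and-average fallback mentioned above can be sketched without any detectron2 dependency. Here the backbone feature map is faked with random numbers, and a class prototype is the mean patch feature inside each ground-truth box; the function name, shapes, and stride are illustrative assumptions, not the repo's actual code.

```python
import numpy as np

def box_prototype(feat, box_xyxy, stride=14):
    """Mean-pool the patch features inside one box.

    feat: (H, W, C) patch-feature map from the backbone.
    box_xyxy: (x1, y1, x2, y2) box in image pixels.
    stride: backbone patch size (14 for a ViT-L/14), used to map
            pixel coordinates onto the feature grid.
    """
    x1, y1, x2, y2 = [int(round(v / stride)) for v in box_xyxy]
    x2, y2 = max(x2, x1 + 1), max(y2, y1 + 1)   # keep at least one cell
    region = feat[y1:y2, x1:x2]                  # (h, w, C) crop
    return region.reshape(-1, feat.shape[-1]).mean(axis=0)

# Toy example: a fake 32x32x256 feature map and two annotated boxes of
# the same class; the class prototype is the mean of the per-box vectors.
rng = np.random.default_rng(0)
feat = rng.standard_normal((32, 32, 256))
boxes = [(28, 28, 140, 140), (70, 70, 210, 210)]
per_box = [box_prototype(feat, b) for b in boxes]
prototype = np.mean(per_box, axis=0)             # (256,) class prototype
print(prototype.shape)  # → (256,)
```

With k-shot annotations you would average k such per-box vectors per class, exactly as in the toy example above.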
@mlzxy Thank you ! I'll go through the notebook it is probably simpler this way :)
This method of directly modifying the COCO registration seems a bit complicated. Is there another way to register a new dataset, just like COCO/LVIS? @mlzxy
Great idea! I put my own dataset in COCO format without changing the config file, and renamed my training JSON file to fs_coco14_base_train, but an error is reported. Do you know what is going on? @mlzxy Thanks.
The following is the error message:
```
[12/20 07:35:24 detectron2]: Full config saved to output/train/few-shot/shot-10/vitl/config.yaml
[12/20 07:35:24 d2.utils.env]: Using a generated random seed 24627623
('fs_coco_test_all',)
Traceback (most recent call last):
  File "/content/devit/tools/train_net.py", line 202, in
```
Hi, sorry for my late reply. I have been kind of busy recently. I would suggest not modifying COCO directly, but packing your dataset in the COCO format.
Regarding the dataset repacking, I suggest two approaches:
1. Take this notebook https://github.com/alirezazareian/ovr-cnn/blob/master/ipynb/003.ipynb as a reference to reorganize your dataset. The output format will be correct as long as the procedure is followed. I took this approach, but it is a bit cumbersome.
2. Use this tool: https://github.com/waspinator/pycococreator. It is much easier, but I haven't used it in a long time, so it may not work.
@Kay545 It appears that your annotation objects lack the image_id field. Try the pycococreator tool; it should generate data in the right format.
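For reference, a minimal COCO-format file needs each annotation to carry an image_id that matches an entry in images (the field flagged as missing above). This is just a skeleton with dummy ids, file names, and boxes:

```python
import json

# Minimal COCO-format skeleton; all values are dummy placeholders.
coco = {
    "images": [
        {"id": 1, "file_name": "img_0001.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,                 # must reference an images[].id
            "category_id": 1,
            "bbox": [100, 120, 50, 80],    # [x, y, width, height]
            "area": 50 * 80,
            "iscrowd": 0,
        },
    ],
    "categories": [
        {"id": 1, "name": "widget", "supercategory": "object"},
    ],
}

with open("mini_coco_train.json", "w") as f:
    json.dump(coco, f)
```

Tools like pycococreator emit the same three top-level sections; checking your generated file against this shape is a quick way to spot a missing image_id.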
@mlzxy Thank you for your thoughtful suggestions and the provided resources. I appreciate your insights into repacking the dataset without directly modifying COCO. I will explore both approaches you mentioned, taking into consideration the notebook you shared and the pycococreator tool.
Best regards.
Excellent job! I want to use it for object detection on my own specific datasets. How should I prepare my dataset, and what steps should I follow? Thank you again!