JoyHuYY1412 / LST_LVIS

MIT License

# Learning to Segment the Tail


In this repository, we release the code for Learning to Segment the Tail (LST). The code is directly modified from the project [maskrcnn_benchmark](), which is an excellent codebase! If you run into a problem that prevents you from running the project, check the issues under [maskrcnn_benchmark]() first.

## Installation

Please follow the installation instructions for maskrcnn_benchmark. For experiments on the [LVIS_v0.5]() dataset, you also need [lvis-api]().

## LVIS Dataset

After downloading the LVIS_v0.5 dataset (the images are the same as in COCO 2017), we recommend symlinking the dataset paths into `datasets/` as follows:

```bash
# symlink the lvis dataset
cd ~/github/LST_LVIS
mkdir -p datasets/lvis
ln -s /path_to_lvis_dataset/annotations datasets/lvis/annotations
ln -s /path_to_coco_dataset/images datasets/lvis/images
```

A detailed visualization demo for LVIS is in [LVIS_visualization](). You'll find it is the most useful thing you can get from this repo :P

## Dataset Pre-processing and Indices Generation

[dataset_preprocess.ipynb](): The LVIS dataset is split into the base set and the sets for the incremental phases.

[balanced_replay.ipynb](): We generate the indices used to load the LVIS dataset offline, following the balanced replay scheme discussed in our paper.

## Training

Our pre-trained model is [model](). You can trim the model and load it for LVIS training as in [trim_model](). The modifications to the backbone follow [MaskX R-CNN](). You can also check our paper for details.

### [training for base]()

The base training is the same as conventional training. For example, to train a model with 8 GPUs you can run:

```bash
python -m torch.distributed.launch --nproc_per_node=8 /path_to_maskrcnn_benchmark/tools/ --use-tensorboard --config-file "/path/to/config/train_file.yaml" MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 1000
```

The details of `MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN` are discussed in [maskrcnn-benchmark](). Edit [this line]() to initialize the dataloader with the corresponding sorted category ids.
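
To make the "sorted category ids" and the base/incremental split more concrete, here is a minimal sketch. It is not the code from the notebooks: it assumes categories are sorted by descending instance count, and the helper names `sort_category_ids` and `split_into_phases` as well as the `base_size` and `phase_size` values are hypothetical, chosen only for illustration.

```python
from collections import Counter


def sort_category_ids(annotations):
    """Return category ids ordered by descending instance count (ties broken by id).

    Assumption: "sorted" means sorted by how many annotated instances each
    category has; the repository's notebooks may use a different criterion.
    """
    counts = Counter(ann["category_id"] for ann in annotations)
    return sorted(counts, key=lambda cid: (-counts[cid], cid))


def split_into_phases(sorted_ids, base_size, phase_size):
    """Split sorted ids into a base set followed by fixed-size incremental phases."""
    base = sorted_ids[:base_size]
    rest = sorted_ids[base_size:]
    phases = [rest[i:i + phase_size] for i in range(0, len(rest), phase_size)]
    return base, phases


# Toy annotations in the COCO/LVIS style; only the field used here is present.
toy_annotations = [{"category_id": c} for c in [3, 3, 3, 3, 1, 1, 1, 7, 7, 5, 9]]

sorted_ids = sort_category_ids(toy_annotations)
base, phases = split_into_phases(sorted_ids, base_size=2, phase_size=2)
print("sorted ids:", sorted_ids)
print("base:", base, "phases:", phases)
```

With real data, `annotations` would be the annotation list from the LVIS json file, and the resulting id lists would be what the dataloader is initialized with for each phase.
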
### [training for incremental steps]()

The training for each incremental phase uses our data balanced replay. It needs to be initialized properly [here]() by providing the corresponding external img-id/cls-id pairs for data loading.

### [get distillation]()

We use the ground truth bounding boxes to get prediction logits from the model trained in the previous step. Change [this]() to decide which classes are distilled. Here is an example command:

```bash
python ./tools/ --use-tensorboard --config-file "/path/to/config/get_distillation_file.yaml" MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 1000
```

The output distillation logits are saved in json format.

## Evaluation

The evaluation for LVIS is slightly different from COCO because LVIS is not exhaustively annotated, which is discussed in detail in [Gupta et al.'s work](). We also report the AP for each phase and each class, which allows a more detailed analysis. You can run:

```bash
export NGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS /path_to_maskrcnn_benchmark/tools/ --config-file "/path/to/config/train_file.yaml"
```

We also provide periodic testing so that results can be checked during training, as discussed in this [issue]().

Thanks to all the previous work and to the authors for sharing their code. Sorry for my ugly code; I appreciate your advice.
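
As a note on the per-phase AP mentioned under Evaluation, the aggregation can be sketched as below. This is only an illustration under assumptions: the helper `per_phase_ap`, the phase names, and the convention that a class without a usable AP is stored as `None` are all hypothetical and not taken from the repository's evaluation code.

```python
import math


def per_phase_ap(per_class_ap, phases):
    """Average per-class AP within each phase.

    Classes whose AP is missing or None are skipped (assumption: such classes
    have no valid evaluation result). A phase with no valid class yields NaN.
    """
    result = {}
    for name, class_ids in phases.items():
        values = [per_class_ap[c] for c in class_ids
                  if per_class_ap.get(c) is not None]
        result[name] = sum(values) / len(values) if values else math.nan
    return result


# Toy per-class AP values keyed by category id (made-up numbers, not results).
toy_ap = {3: 0.40, 1: 0.30, 7: 0.20, 5: None, 9: 0.10}
toy_phases = {"base": [3, 1], "phase_1": [7, 5], "phase_2": [9]}

summary = per_phase_ap(toy_ap, toy_phases)
for phase_name, ap in summary.items():
    print(f"{phase_name}: {ap:.3f}")
```
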