Closed · lucasjinreal closed this issue 4 years ago
First of all, I do not think you understand what a BN head means in this discussion: https://github.com/aim-uofa/AdelaiDet/issues/43.
The part that needs modification is the HEAD, not the backbone.
For the training problem you have met, I suggest you first try training a BN-head FCOS. It will help you figure out what's wrong.
AP will not be zero unless you haven't correctly loaded the checkpoint. But in any case, you need either a larger batch size or a longer training schedule to reach the performance we reported.
P.S. BN with batch size 2 per GPU can be unstable; I am not sure, because we never tried that or claimed it would work. I suggest using SyncBN, which is the same as BN for export and migration.
Then how do I alter the head to BN? Converting the model to ONNX requires eliminating all GN, including in the backbone. And isn't batch size 2 per GPU different from batch size 16 across 8 GPUs?
I think the AP of 0 here comes from a wrong combination of these configurations. Could you suggest a working BlendMask RT model config (with 3 GPUs, perhaps)?
Use the config I linked in the last comment, the BN-head FCOS. GN is not in the backbone but in the FCOS head.
Use SyncBN instead of BN.
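As a quick sanity check before export, one can list any GroupNorm layers left in the model. This is a hedged sketch: `find_groupnorm_modules` is a hypothetical helper, not part of AdelaiDet, and the toy tower merely stands in for the real FCOS head.

```python
import torch.nn as nn

# Hypothetical helper (not in AdelaiDet): list every GroupNorm submodule so you
# can verify that none remain after switching the head's NORM to BN/SyncBN.
def find_groupnorm_modules(model: nn.Module) -> list:
    return [name for name, m in model.named_modules()
            if isinstance(m, nn.GroupNorm)]

# Toy conv tower with one GN layer, standing in for the FCOS head.
tower = nn.Sequential(nn.Conv2d(8, 8, 3, padding=1),
                      nn.GroupNorm(4, 8),
                      nn.ReLU())
print(find_groupnorm_modules(tower))  # ['1']
```

If the returned list is non-empty, some GN layers survived the config change and will still block a GN-free export.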
```yaml
_BASE_: "Base-BlendMask.yaml"
MODEL:
  FCOS:
    TOP_LEVELS: 1
    IN_FEATURES: ["p3", "p4", "p5", "p6"]
    FPN_STRIDES: [8, 16, 32, 64]
    SIZES_OF_INTEREST: [64, 128, 256]
    NUM_SHARE_CONVS: 3
    NUM_CLS_CONVS: 0
    NUM_BOX_CONVS: 0
    NORM: "SyncBN"
  BASIS_MODULE:
    NUM_CONVS: 2
INPUT:
  MIN_SIZE_TRAIN: (440, 462, 484, 506, 528, 550)
  MAX_SIZE_TRAIN: 916
  MIN_SIZE_TEST: 550
  MAX_SIZE_TEST: 916
```
I am sorry, do you mean this one?
Yes. For information related to ONNX export, please refer to this page: https://github.com/aim-uofa/AdelaiDet/tree/master/onnx
@jinfagang did you happen to solve this?
Here are the train command and eval command I am using:
I am using 3 GPUs to train, and I have changed the lr along with the batch size:
The model I changed:
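Adjusting the lr along with the batch size usually follows the linear scaling rule. A small sketch of the arithmetic, assuming the base config trains with lr 0.01 at a total batch size of 16 (check the actual values in Base-BlendMask.yaml before relying on these numbers):

```python
# Linear scaling rule: scale the base lr by the ratio of total batch sizes.
base_lr, base_batch = 0.01, 16     # assumed base-config values
num_gpus, ims_per_gpu = 3, 2       # the 3-GPU setup described above
total_batch = num_gpus * ims_per_gpu   # 6 images per iteration
scaled_lr = base_lr * total_batch / base_batch
print(scaled_lr)  # 0.00375
```

With the smaller batch, the iteration count typically also needs to grow by the inverse ratio to see the same number of images, which matches the earlier advice about a longer schedule.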
Does BlendMask RT really work?