Closed: @sno6 closed this issue 4 years ago.
@sno6 what's your batch size and step size?
Here's my config; note that I'm training on two K80 GPUs.
```yaml
MODEL:
  META_ARCHITECTURE: "GeneralizedRCNN"
  WEIGHT: "./outputs/pretrain/model_pretrain.pth"
  BACKBONE:
    CONV_BODY: "R-50-FPN"
    OUT_CHANNELS: 256
  RPN:
    USE_FPN: True
    ANCHOR_STRIDE: (4, 8, 16, 32, 64)
    PRE_NMS_TOP_N_TRAIN: 2000
    PRE_NMS_TOP_N_TEST: 1000
    POST_NMS_TOP_N_TEST: 1000
    FPN_POST_NMS_TOP_N_TEST: 1000
  ROI_HEADS:
    USE_FPN: True
    BATCH_SIZE_PER_IMAGE: 512
  ROI_BOX_HEAD:
    POOLER_RESOLUTION: 7
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    POOLER_SAMPLING_RATIO: 2
    FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
    PREDICTOR: "FPNPredictor"
    NUM_CLASSES: 2
  ROI_MASK_HEAD:
    POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125)
    FEATURE_EXTRACTOR: "MaskRCNNFPNFeatureExtractor"
    PREDICTOR: "SeqCharMaskRCNNC4Predictor"
    POOLER_RESOLUTION_H: 16
    POOLER_RESOLUTION_W: 64
    POOLER_SAMPLING_RATIO: 2
    RESOLUTION: 28
    RESOLUTION_H: 32
    RESOLUTION_W: 128
    SHARE_BOX_FEATURE_EXTRACTOR: False
    CHAR_NUM_CLASSES: 37
    USE_WEIGHTED_CHAR_MASK: True
    MASK_BATCH_SIZE_PER_IM: 64
  MASK_ON: True
  CHAR_MASK_ON: True
SEQUENCE:
  SEQ_ON: True
  NUM_CHAR: 38
  BOS_TOKEN: 0
  MAX_LENGTH: 32
  TEACHER_FORCE_RATIO: 1.0
  TWO_CONV: True
DATASETS:
  TRAIN: ("total_text_train",)
  TEST: ("total_text_test",)
DATALOADER:
  SIZE_DIVISIBILITY: 32
  NUM_WORKERS: 2
  ASPECT_RATIO_GROUPING: False
SOLVER:
  BASE_LR: 0.0025  # 0.02
  WARMUP_FACTOR: 0.1
  WEIGHT_DECAY: 0.0001
  STEPS: (480000, 640000)
  MAX_ITER: 720000
  IMS_PER_BATCH: 2
OUTPUT_DIR: "./outputs/pretrain"
TEST:
  VIS: False
  CHAR_THRESH: 192
  IMS_PER_BATCH: 1
INPUT:
  MIN_SIZE_TRAIN: (600, 800)
  MAX_SIZE_TRAIN: 2333
  MIN_SIZE_TEST: 800
  MAX_SIZE_TEST: 1333
```
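For context, here is a minimal sketch of the learning-rate schedule the SOLVER section above implies, assuming maskrcnn-benchmark's usual WarmupMultiStepLR semantics (linear warmup, then decay by a gamma of 0.1 at each entry in STEPS). The 500-iteration warmup length and gamma are library defaults that don't appear in my config, so treat them as assumptions:

```python
def lr_at(it, base_lr=0.0025, warmup_factor=0.1, warmup_iters=500,
          steps=(480000, 640000), gamma=0.1):
    """Learning rate at iteration `it` under linear warmup + multi-step decay."""
    if it < warmup_iters:
        # Linear warmup from warmup_factor * base_lr up to base_lr.
        alpha = it / warmup_iters
        factor = warmup_factor * (1 - alpha) + alpha
    else:
        factor = 1.0
    # Multiply by gamma once for every milestone already passed.
    factor *= gamma ** sum(1 for s in steps if it >= s)
    return base_lr * factor

print(lr_at(0))        # warmup start: base_lr * warmup_factor
print(lr_at(500000))   # after the first STEPS milestone: one gamma decay
```

With MAX_ITER at 720000 and a batch size of 2, the first decay at 480000 is a long way out, which is worth keeping in mind when judging early loss curves.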
Update: now training on 4 GPUs with IMS_PER_BATCH: 4.
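Worth noting: a common heuristic (the linear scaling rule, not something this repo enforces) is to scale BASE_LR in proportion to the total batch size, so going from IMS_PER_BATCH: 2 to 4 would suggest doubling the LR as well. A quick sketch:

```python
def scale_lr(base_lr, base_batch, new_batch):
    # Linear scaling rule: learning rate proportional to total batch size.
    return base_lr * new_batch / base_batch

print(scale_lr(0.0025, base_batch=2, new_batch=4))  # 0.005
```

The commented-out 0.02 in the config is consistent with this: it corresponds to the default 16-image batch, and 0.0025 is that value scaled down for a batch of 2.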
Oh, those were steps in the first photo. Yeah, you wouldn't see much yet, especially with your batch size, but it's looking better now.
Maybe @MhLiao has some thoughts on parameter optimisation?
After running for ~1 hour I see the following loss fluctuations. Is this normal? How long until the loss starts to stabilize?
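Some fluctuation is expected with a per-step batch of only 2–4 images; the raw loss curve is very noisy. One generic way (not part of this repo) to judge whether training is actually trending down is to smooth the logged losses with an exponential moving average, as training dashboards typically do:

```python
def ema_smooth(values, beta=0.98):
    """Exponential moving average of a sequence of logged loss values."""
    smoothed, avg = [], None
    for v in values:
        avg = v if avg is None else beta * avg + (1 - beta) * v
        smoothed.append(avg)
    return smoothed

# Noisy but decreasing losses show a clearer trend once smoothed.
losses = [2.0, 3.5, 1.0, 2.8, 0.9, 2.2, 0.7]
print(ema_smooth(losses, beta=0.5))
```

If the smoothed curve is flat or rising after tens of thousands of iterations, that is a better signal of a problem than step-to-step spikes after one hour.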