Open fat-921 opened 1 year ago
This is my config:
data_root = '/labelme2coco_unlabeled_20230704/'
num_classes = 4
batch_size = 2
img_scale = (2048, 920)
METAINFO = {
    # class names in English: stain, black cotton, damage, wrinkle
    'classes': ('污点', '黑棉', '破损', '褶皱'),
    'palette': [(220, 20, 60), (119, 11, 32), (0, 0, 142), (0, 0, 230)]
}
backend_args = None
train_pipeline = [
    dict(type='LoadImageFromFile', backend_args=None),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', scale=img_scale, keep_ratio=True),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PackDetInputs')
]
test_pipeline = [
    dict(
        type='LoadImageFromFile',
        backend_args=None,
        imdecode_backend='pillow'),
    dict(
        type='FixScaleResize',
        scale=img_scale,
        keep_ratio=True,
        backend='pillow'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor', 'text', 'custom_entities'))
]
train_dataloader = dict(
    batch_size=batch_size,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    batch_sampler=dict(type='AspectRatioBatchSampler'),
    dataset=dict(
        type='CocoDataset',
        data_root=data_root,
        metainfo=METAINFO,
        ann_file='train.json',
        data_prefix=dict(img=''),
        filter_cfg=dict(filter_empty_gt=True, min_size=32),
        pipeline=[
            dict(type='LoadImageFromFile', backend_args=None),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(type='Resize', scale=img_scale, keep_ratio=True),
            dict(type='RandomFlip', prob=0.5),
            dict(type='PackDetInputs')
        ],
        backend_args=None))
val_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type='CocoDataset',
        data_root=data_root,
        metainfo=METAINFO,
        ann_file='val.json',
        data_prefix=dict(img=''),
        test_mode=True,
        pipeline=[
            dict(
                type='LoadImageFromFile',
                backend_args=None,
                imdecode_backend='pillow'),
            dict(
                type='FixScaleResize',
                scale=img_scale,
                keep_ratio=True,
                backend='pillow'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(
                type='PackDetInputs',
                meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                           'scale_factor', 'text', 'custom_entities'))
        ],
        backend_args=None,
        return_classes=True))
test_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type='CocoDataset',
        data_root=data_root,
        metainfo=METAINFO,
        ann_file='val.json',
        data_prefix=dict(img=''),
        test_mode=True,
        pipeline=[
            dict(
                type='LoadImageFromFile',
                backend_args=None,
                imdecode_backend='pillow'),
            dict(
                type='FixScaleResize',
                scale=img_scale,
                keep_ratio=True,
                backend='pillow'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(
                type='PackDetInputs',
                meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                           'scale_factor', 'text', 'custom_entities'))
        ],
        backend_args=None,
        return_classes=True))
val_evaluator = dict(
    type='CocoMetric',
    ann_file=data_root + '/val.json',
    metric='bbox',
    format_only=False,
    backend_args=None)
test_evaluator = dict(
    type='CocoMetric',
    ann_file=data_root + '/val.json',
    metric='bbox',
    format_only=False,
    backend_args=None)
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=12, val_interval=1)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
param_scheduler = [
    dict(
        type='LinearLR', start_factor=0.001, by_epoch=False, begin=0,
        end=500),
    dict(
        type='MultiStepLR',
        begin=0,
        end=12,
        by_epoch=True,
        milestones=[8, 11],
        gamma=0.1)
]
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001))
auto_scale_lr = dict(enable=False, base_batch_size=16)
default_scope = 'mmdet'
default_hooks = dict(
    timer=dict(type='IterTimerHook'),
    logger=dict(type='LoggerHook', interval=10),
    param_scheduler=dict(type='ParamSchedulerHook'),
    checkpoint=dict(
        type='CheckpointHook', interval=5, max_keep_ckpts=3,
        save_best='auto'),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    visualization=dict(type='DetVisualizationHook'))
env_cfg = dict(
    cudnn_benchmark=False,
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),
    dist_cfg=dict(backend='nccl'))
vis_backends = [dict(type='LocalVisBackend')]
visualizer = dict(
    type='DetLocalVisualizer',
    vis_backends=[dict(type='LocalVisBackend')],
    name='visualizer')
log_processor = dict(type='LogProcessor', window_size=50, by_epoch=True)
log_level = 'INFO'
load_from = None
resume = False
lang_model_name = 'bert-base-uncased'
model = dict(
    type='GLIP',
    data_preprocessor=dict(
        type='DetDataPreprocessor',
        mean=[103.53, 116.28, 123.675],
        std=[57.375, 57.12, 58.395],
        bgr_to_rgb=False,
        pad_size_divisor=32),
    backbone=dict(
        type='SwinTransformer',
        embed_dims=96,
        depths=[2, 2, 6, 2],
        num_heads=[3, 6, 12, 24],
        window_size=7,
        mlp_ratio=4,
        qkv_bias=True,
        qk_scale=None,
        drop_rate=0.0,
        attn_drop_rate=0.0,
        drop_path_rate=0.2,
        patch_norm=True,
        out_indices=(1, 2, 3),
        with_cp=False,
        convert_weights=False),
    neck=dict(
        type='FPN',
        in_channels=[192, 384, 768],
        out_channels=256,
        start_level=0,
        relu_before_extra_convs=True,
        add_extra_convs='on_output',
        num_outs=5),
    bbox_head=dict(
        type='ATSSVLFusionHead',
        lang_model_name='../checkpoints/bert-base-uncased/',
        num_classes=num_classes,
        in_channels=256,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            ratios=[1.0],
            octave_base_scale=8,
            scales_per_octave=1,
            strides=[8, 16, 32, 64, 128],
            center_offset=0.5),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoderForGLIP',
            target_means=[0.0, 0.0, 0.0, 0.0],
            target_stds=[0.1, 0.1, 0.2, 0.2])),
    language_model=dict(
        type='BertModel',
        # name='bert-base-uncased',
        name='../checkpoints/bert-base-uncased/'
    ),
    train_cfg=dict(
        assigner=dict(type='ATSSAssigner', topk=9),
        allowed_border=-1,
        pos_weight=-1,
        debug=False),
    test_cfg=dict(
        nms_pre=1000,
        min_bbox_size=0,
        score_thr=0.05,
        nms=dict(type='nms', iou_threshold=0.6),
        max_per_img=100))
launcher = 'none'
work_dir = './work_dirs/glip_atss_swin-t_a_fpn_dyhead_pretrain_my_datasets'
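A side note on the optimizer settings in this config, not necessarily related to the reported problem: lr=0.02 is paired with auto_scale_lr = dict(enable=False, base_batch_size=16), while batch_size is 2. The sketch below only does the arithmetic of the usual linear scaling rule (learning rate proportional to total batch size). The helper name scaled_lr is hypothetical, and num_gpus=1 is an assumption based on launcher = 'none'.

```python
def scaled_lr(base_lr: float, batch_size_per_gpu: int, num_gpus: int,
              base_batch_size: int) -> float:
    """Linear scaling rule: lr grows in proportion to the total batch size."""
    if base_batch_size <= 0:
        raise ValueError("base_batch_size must be positive")
    real_batch_size = batch_size_per_gpu * num_gpus
    return base_lr * real_batch_size / base_batch_size


# base_lr, batch_size_per_gpu and base_batch_size are copied from the config.
# num_gpus=1 is an assumption (launcher = 'none' suggests a single process).
lr = scaled_lr(base_lr=0.02, batch_size_per_gpu=2, num_gpus=1,
               base_batch_size=16)
print(f"lr suggested by the linear scaling rule: {lr:.4f}")
```

Under these assumptions the rule suggests a learning rate of about 0.0025 instead of 0.02. Whether that is the right value for this dataset is a separate question.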
The bert-base-uncased directory includes: config.json, pytorch_model.bin, tokenizer.json, tokenizer_config.json, vocab.txt
Did I miss any other files?
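A minimal sketch for checking a local model directory against the file list above. It only tests that the files exist. Whether these five files are everything the loader needs is an assumption, and missing_files is a hypothetical helper name. Because the config uses a relative path, the script also prints the resolved absolute path, since a relative path resolves against the current working directory.

```python
from pathlib import Path

# File names copied from the list in the post. Treating this as the complete
# required set is an assumption, not something confirmed in this thread.
EXPECTED_FILES = (
    "config.json",
    "pytorch_model.bin",
    "tokenizer.json",
    "tokenizer_config.json",
    "vocab.txt",
)


def missing_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Same relative path as in the config; run this from the directory
    # where training is launched so it resolves the same way.
    model_dir = "../checkpoints/bert-base-uncased/"
    absent = missing_files(model_dir)
    print("resolved path:", Path(model_dir).resolve())
    print("missing:", absent if absent else "none")
```

If the resolved path is not where the files were downloaded, using an absolute path in the config is a simple way to rule out a path problem.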
Same issue here. Have you solved the problem, or is this just an incomplete code release?
I haven't solved this problem. GLIP fine-tuning is not supported yet; training GLIP may be supported in the next release. @xiaomoguhzz
Thanks for your error report and we appreciate it a lot.
Checklist
Describe the bug
I ran into this problem when using GLIP. I downloaded bert-base-uncased separately from Hugging Face. The dataset is one I made myself.
Reproduction
Did you make any modifications on the code or config? Did you understand what you have modified?
I modified data_root, num_classes, img_scale, metainfo, lang_model_name, and language_model.
What dataset did you use?
My dataset uses CocoDataset.
Environment
sys.platform: linux
Python: 3.9.16 (main, Mar 8 2023, 14:00:05) [GCC 11.2.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0,1: NVIDIA GeForce RTX 3090 Ti
CUDA_HOME: :/usr/local/cuda-11.3
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
PyTorch: 1.12.1+cu113
PyTorch compiling details: PyTorch built with:
TorchVision: 0.13.1+cu113
OpenCV: 4.7.0
MMEngine: 0.7.1
MMDetection: 3.1.0+
Error traceback
If applicable, paste the error traceback here.
Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!