Maelic / SGG-Benchmark

A New Benchmark for Scene Graph Generation, targeting real-world applications
MIT License

SGDet on Custom Images #3

Open · liw1st opened this issue 7 months ago

liw1st commented 7 months ago

When I tested sgdet on custom images, the results seemed to be wrong. I used this command to test:

```bash
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --master_port 10027 --nproc_per_node=1 \
    tools/relation_test_net.py \
    --config-file "/media/how/data2/CODES/SGG-Benchmark-main/configs/VG150/baseline/e2e_relation_X_101_32_8_FPN_1x.yaml" \
    MODEL.ROI_RELATION_HEAD.USE_GT_BOX False \
    MODEL.ROI_RELATION_HEAD.USE_GT_OBJECT_LABEL False \
    MODEL.ROI_RELATION_HEAD.PREDICTOR CausalAnalysisPredictor \
    MODEL.ROI_RELATION_HEAD.CAUSAL.EFFECT_TYPE TDE \
    MODEL.ROI_RELATION_HEAD.CAUSAL.FUSION_TYPE sum \
    MODEL.ROI_RELATION_HEAD.CAUSAL.CONTEXT_LAYER motifs \
    TEST.IMS_PER_BATCH 1 \
    DTYPE "float16" \
    GLOVE_DIR /media/how/data2/CODES/SGG-Benchmark-main/glove \
    MODEL.PRETRAINED_DETECTOR_CKPT /media/how/data2/CODES/SGG-Benchmark-main/checkpoints/pretrained_faster_rcnn/model_final.pth \
    OUTPUT_DIR /media/how/data2/CODES/SGG-Benchmark-main/checkpoints/upload_causal_motif_sgdet \
    TEST.CUSTUM_EVAL True \
    TEST.CUSTUM_PATH /media/how/data2/CODES/SGG-Benchmark-main/custom_test/custom_images_coco \
    DETECTED_SGG_DIR /media/how/data2/CODES/SGG-Benchmark-main/custom_test/detected_sgg
```

The following is the cfg information:

```
DATALOADER: ASPECT_RATIO_GROUPING: True NUM_WORKERS: 0 SIZE_DIVISIBILITY: 32
DATASETS: NAME: TEST: ('VG_stanford_filtered_with_attribute_test',) TO_TEST: TRAIN: ('VG_stanford_filtered_with_attribute_train',) VAL: ('VG_stanford_filtered_with_attribute_val',)
DETECTED_SGG_DIR: /media/how/data2/CODES/SGG-Benchmark-main/custom_test/detected_sgg
DTYPE: float16
GLOVE_DIR: /media/how/data2/CODES/SGG-Benchmark-main/glove
INPUT: BRIGHTNESS: 0.0 CONTRAST: 0.0 FLIP_PROB_TRAIN: 0.5 HUE: 0.0 MAX_SIZE_TEST: 1000 MAX_SIZE_TRAIN: 1000 MIN_SIZE_TEST: 600 MIN_SIZE_TRAIN: 600 PADDING: False PIXEL_MEAN: [102.9801, 115.9465, 122.7717] PIXEL_STD: [1.0, 1.0, 1.0] SATURATION: 0.0 TO_BGR255: True VERTICAL_FLIP_PROB_TRAIN: 0.0
METRIC_TO_TRACK: mR
MODEL:
  ATTRIBUTE_ON: False
  BACKBONE: EXTRA_CONFIG: FREEZE: False FREEZE_CONV_BODY_AT: 2 NMS_THRESH: 0.7 TYPE: R-101-FPN
  BOX_HEAD: True
  CLS_AGNOSTIC_BBOX_REG: False
  DEVICE: cuda
  FBNET: ARCH: default ARCH_DEF: BN_TYPE: bn DET_HEAD_BLOCKS: [] DET_HEAD_LAST_SCALE: 1.0 DET_HEAD_STRIDE: 0 DW_CONV_SKIP_BN: True DW_CONV_SKIP_RELU: True KPTS_HEAD_BLOCKS: [] KPTS_HEAD_LAST_SCALE: 0.0 KPTS_HEAD_STRIDE: 0 MASK_HEAD_BLOCKS: [] MASK_HEAD_LAST_SCALE: 0.0 MASK_HEAD_STRIDE: 0 RPN_BN_TYPE: RPN_HEAD_BLOCKS: 0 SCALE_FACTOR: 1.0 WIDTH_DIVISOR: 1
  FLIP_AUG: False
  FPN: USE_GN: False USE_RELU: False
  GROUP_NORM: DIM_PER_GP: -1 EPSILON: 1e-05 NUM_GROUPS: 32
  MASK_ON: False
  META_ARCHITECTURE: GeneralizedRCNN
  PRETRAINED_DETECTOR_CKPT: /media/how/data2/CODES/SGG-Benchmark-main/checkpoints/pretrained_faster_rcnn/model_final.pth
  RELATION_ON: True
  RESNETS: BACKBONE_OUT_CHANNELS: 256 DEFORMABLE_GROUPS: 1 NUM_GROUPS: 32 RES2_OUT_CHANNELS: 256 RES5_DILATION: 1 STAGE_WITH_DCN: (False, False, False, False) STEM_FUNC: StemWithFixedBatchNorm STEM_OUT_CHANNELS: 64 STRIDE_IN_1X1: False TRANS_FUNC: BottleneckWithFixedBatchNorm WIDTH_PER_GROUP: 8 WITH_MODULATED_DCN: False
  RETINANET: ANCHOR_SIZES: (32, 64, 128, 256, 512) ANCHOR_STRIDES: (8, 16, 32, 64, 128) ASPECT_RATIOS: (0.5, 1.0, 2.0) BBOX_REG_BETA: 0.11 BBOX_REG_WEIGHT: 4.0 BG_IOU_THRESHOLD: 0.4 FG_IOU_THRESHOLD: 0.5 INFERENCE_TH: 0.05 LOSS_ALPHA: 0.25 LOSS_GAMMA: 2.0 NMS_TH: 0.4 NUM_CLASSES: 81 NUM_CONVS: 4 OCTAVE: 2.0 PRE_NMS_TOP_N: 1000 PRIOR_PROB: 0.01 SCALES_PER_OCTAVE: 3 STRADDLE_THRESH: 0 USE_C5: True
  RETINANET_ON: False
  ROI_ATTRIBUTE_HEAD: ATTRIBUTE_BGFG_RATIO: 3 ATTRIBUTE_BGFG_SAMPLE: True ATTRIBUTE_LOSS_WEIGHT: 1.0 FEATURE_EXTRACTOR: FPN2MLPFeatureExtractor MAX_ATTRIBUTES: 10 NUM_ATTRIBUTES: 201 POS_WEIGHT: 50.0 PREDICTOR: FPNPredictor SHARE_BOX_FEATURE_EXTRACTOR: True USE_BINARY_LOSS: True
  ROI_BOX_HEAD: CONV_HEAD_DIM: 256 DILATION: 1 FEATURE_EXTRACTOR: FPN2MLPFeatureExtractor MLP_HEAD_DIM: 4096 NUM_CLASSES: 151 NUM_STACKED_CONVS: 4 POOLER_RESOLUTION: 7 POOLER_SAMPLING_RATIO: 2 POOLER_SCALES: (0.25, 0.125, 0.0625, 0.03125) PREDICTOR: FPNPredictor USE_GN: False
  ROI_HEADS: BATCH_SIZE_PER_IMAGE: 256 BBOX_REG_WEIGHTS: (10.0, 10.0, 5.0, 5.0) BG_IOU_THRESHOLD: 0.3 DETECTIONS_PER_IMG: 40 FG_IOU_THRESHOLD: 0.5 NMS: 0.3 NMS_FILTER_DUPLICATES: True POSITIVE_FRACTION: 0.5 POST_NMS_PER_CLS_TOPN: 300 SCORE_THRESH: 0.01 USE_FPN: True
  ROI_MASK_HEAD: CONV_LAYERS: (256, 256, 256, 256) DILATION: 1 FEATURE_EXTRACTOR: ResNet50Conv5ROIFeatureExtractor MLP_HEAD_DIM: 1024 POOLER_RESOLUTION: 14 POOLER_SAMPLING_RATIO: 0 POOLER_SCALES: (0.0625,) POSTPROCESS_MASKS: False POSTPROCESS_MASKS_THRESHOLD: 0.5 PREDICTOR: MaskRCNNC4Predictor RESOLUTION: 14 SHARE_BOX_FEATURE_EXTRACTOR: True USE_GN: False
  ROI_RELATION_HEAD: ADD_GTBOX_TO_PROPOSAL_IN_TRAIN: True BATCH_SIZE_PER_IMAGE: 1024 CAUSAL: CONTEXT_LAYER: motifs EFFECT_ANALYSIS: True EFFECT_TYPE: TDE FUSION_TYPE: sum SEPARATE_SPATIAL: False SPATIAL_FOR_VISION: True CLASSIFIER: linear CONTEXT_DROPOUT_RATE: 0.2 CONTEXT_HIDDEN_DIM: 512 CONTEXT_OBJ_LAYER: 1 CONTEXT_POOLING_DIM: 4096 CONTEXT_REL_LAYER: 1 EMBED_DIM: 200 FEATURE_EXTRACTOR: RelationFeatureExtractor LABEL_SMOOTHING_LOSS: False NUM_CLASSES: 51 NUM_SAMPLE_PER_GT_REL: 4 POOLING_ALL_LEVELS: True POSITIVE_FRACTION: 0.25 PREDICTOR: CausalAnalysisPredictor PREDICT_USE_VISION: True REL_PROP: [0.01858, 0.00057, 0.00051, 0.00109, 0.0015, 0.00489, 0.00432, 0.02913, 0.00245, 0.00121, 0.00404, 0.0011, 0.00132, 0.00172, 5e-05, 0.00242, 0.0005, 0.00048, 0.00208, 0.15608, 0.0265, 0.06091, 0.009, 0.00183, 0.00225, 0.0009, 0.00028, 0.00077, 0.04844, 0.08645, 0.31621, 0.00088, 0.00301, 0.00042, 0.00186, 0.001, 0.00027, 0.01012, 0.0001, 0.01286, 0.00647, 0.00084, 0.01077, 0.00132, 0.00069, 0.00376, 0.00214, 0.11424, 0.01205, 0.02958] REQUIRE_BOX_OVERLAP: False TRANSFORMER: DROPOUT_RATE: 0.1 INNER_DIM: 2048 KEY_DIM: 64 NUM_HEAD: 8 OBJ_LAYER: 4 REL_LAYER: 2 VAL_DIM: 64 USE_FREQUENCY_BIAS: True USE_GT_BOX: False USE_GT_OBJECT_LABEL: False
  RPN: ANCHOR_SIZES: (32, 64, 128, 256, 512) ANCHOR_STRIDE: (4, 8, 16, 32, 64) ASPECT_RATIOS: (0.23232838, 0.63365731, 1.28478321, 3.15089189) BATCH_SIZE_PER_IMAGE: 256 BG_IOU_THRESHOLD: 0.3 FG_IOU_THRESHOLD: 0.7 FPN_POST_NMS_PER_BATCH: False FPN_POST_NMS_TOP_N_TEST: 1000 FPN_POST_NMS_TOP_N_TRAIN: 1000 MIN_SIZE: 0 POSITIVE_FRACTION: 0.5 POST_NMS_TOP_N_TEST: 1000 POST_NMS_TOP_N_TRAIN: 1000 PRE_NMS_TOP_N_TEST: 6000 PRE_NMS_TOP_N_TRAIN: 6000 RPN_HEAD: SingleConvRPNHead RPN_MID_CHANNEL: 256 STRADDLE_THRESH: 0 USE_FPN: True
  RPN_ONLY: False
  VGG: VGG16_OUT_CHANNELS: 512
  WEIGHT: catalog://ImageNetPretrained/FAIR/20171220/X-101-32x8d
  YOLO: IMG_SIZE: 640 OUT_CHANNELS: 256 SIZE: yolov8l WEIGHTS:
OUTPUT_DIR: /media/how/data2/CODES/SGG-Benchmark-main/checkpoints/upload_causal_motif_sgdet
PATHS_CATALOG: /media/how/data2/CODES/SGG-Benchmark-main/sgg_benchmark/config/paths_catalog.py
PATHS_DATA: /media/how/data2/CODES/SGG-Benchmark-main/datasets/vg/
SOLVER: BASE_LR: 0.01 BIAS_LR_FACTOR: 1 CHECKPOINT_PERIOD: 2000 CLIP_NORM: 5.0 GAMMA: 0.1 GRAD_NORM_CLIP: 5.0 IMS_PER_BATCH: 16 MAX_EPOCH: 100 MAX_ITER: 40000 MOMENTUM: 0.9 PRE_VAL: True PRINT_GRAD_FREQ: 4000 SCHEDULE: COOLDOWN: 0 FACTOR: 0.1 MAX_DECAY_STEP: 3 PATIENCE: 2 THRESHOLD: 0.001 TYPE: WarmupReduceLROnPlateau STEPS: (10000, 16000) TO_VAL: True UPDATE_SCHEDULE_DURING_LOAD: False VAL_PERIOD: 2000 WARMUP_FACTOR: 0.1 WARMUP_ITERS: 500 WARMUP_METHOD: linear WEIGHT_DECAY: 0.0001 WEIGHT_DECAY_BIAS: 0.0
TEST: ALLOW_LOAD_FROM_CACHE: False BBOX_AUG: ENABLED: False H_FLIP: False MAX_SIZE: 4000 SCALES: () SCALE_H_FLIP: False CUSTUM_EVAL: True CUSTUM_PATH: /media/how/data2/CODES/SGG-Benchmark-main/custom_test/custom_images_coco DETECTIONS_PER_IMG: 100 EXPECTED_RESULTS: [] EXPECTED_RESULTS_SIGMA_TOL: 4 IMS_PER_BATCH: 1 INFORMATIVE: False RELATION: IOU_THRESHOLD: 0.5 LATER_NMS_PREDICTION_THRES: 0.5 MULTIPLE_PREDS: False REQUIRE_OVERLAP: False SYNC_GATHER: True SAVE_PROPOSALS: False
```

Here are the visualizations: [screenshot from 2024-04-18 09-16-40] What's the problem?

Maelic commented 7 months ago

It looks like you have the wrong config file; there are no fork or bowl classes in the VG150 dataset. Try checking all the paths in paths_catalog.py and in your config file.
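For reference, the entry to verify looks roughly like this (a minimal sketch; the dictionary keys and layout below follow the usual Scene-Graph-Benchmark paths_catalog.py convention and may differ slightly in this repo):

```python
# Hypothetical illustration of a dataset entry in paths_catalog.py; the exact
# key names and structure in SGG-Benchmark may differ.
DATA_DIR = "/media/how/data2/CODES/SGG-Benchmark-main/datasets"

DATASETS = {
    "VG_stanford_filtered_with_attribute_train": {
        "img_dir": "vg/VG_100K",                        # folder with the VG images
        "roidb_file": "vg/VG-SGG-with-attri.h5",        # boxes + relationships
        "dict_file": "vg/VG-SGG-dicts-with-attri.json", # object/predicate label maps
        "image_file": "vg/image_data.json",             # per-image metadata
    },
}
```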

I wrote a new script for a quick webcam demo here; can you try it out and tell me if it works?
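A minimal sketch of the kind of loop such a demo runs is below; the SGG_Model wrapper, its constructor arguments, and the file paths are hypothetical placeholders, and only the predict(frame, visu=True) call is taken from this thread:

```python
# Hedged sketch of a webcam loop around the demo model; the import, the
# SGG_Model class, its constructor, and the paths are hypothetical placeholders.
import cv2

from demo.demo_model import SGG_Model  # hypothetical import; adjust to the demo's actual module

config_path = "configs/VG150/baseline/e2e_relation_X_101_32_8_FPN_1x.yaml"  # hypothetical
dict_file = "datasets/vg/VG-SGG-dicts-with-attri.json"                      # hypothetical
weights = "checkpoints/upload_causal_motif_sgdet/model_final.pth"           # hypothetical

model = SGG_Model(config_path, dict_file, weights)  # hypothetical wrapper around the SGG pipeline

cap = cv2.VideoCapture(0)                      # open the default webcam
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    img, graph = model.predict(frame, visu=True)  # annotated frame + predicted scene graph
    cv2.imshow("SGG webcam demo", img)
    if cv2.waitKey(1) & 0xFF == ord("q"):         # press q to quit
        break

cap.release()
cv2.destroyAllWindows()
```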

liw1st commented 7 months ago

[screenshots from 2024-04-18 19-26-02 and 2024-04-18 19-26-05]

Thank you for your reply. I checked paths_catalog.py and the config file. The VG-SGG-with-attri.h5, VG-SGG-dicts-with-attri.json, and image_data.json files are from KaihuaTang. Are they wrong? Also, what are the zeroshot_file and informative_file used for? If they are important, where can I download them? I will try the webcam demo later.

liw1st commented 7 months ago

When I tested the webcam demo, the following error occurred:

```
Traceback (most recent call last):
  File "/media/how/data2/CODES/SGG-Benchmark-main/demo/webcam_demo.py", line 59, in <module>
    main(config_path, dict_file, weights)
  File "/media/how/data2/CODES/SGG-Benchmark-main/demo/webcam_demo.py", line 23, in main
    img, graph = model.predict(frame, visu=True)
  File "/media/how/data2/CODES/SGG-Benchmark-main/demo/demo_model.py", line 84, in predict
    predictions = self.model(img_list, targets)
  File "/home/how/anaconda3/envs/sgbm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/media/how/data2/CODES/SGG-Benchmark-main/sgg_benchmark/modeling/detector/generalized_rcnn.py", line 53, in forward
    x, result, detector_losses = self.roi_heads(features, proposals, targets, logger)
  File "/home/how/anaconda3/envs/sgbm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/media/how/data2/CODES/SGG-Benchmark-main/sgg_benchmark/modeling/roi_heads/roi_heads.py", line 25, in forward
    x, detections, loss_box = self.box(features, proposals, targets)
  File "/home/how/anaconda3/envs/sgbm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/media/how/data2/CODES/SGG-Benchmark-main/sgg_benchmark/modeling/roi_heads/box_head/box_head.py", line 53, in forward
    proposals = [target.copy_with_fields(["labels"]) for target in targets] #, "attributes"
TypeError: 'NoneType' object is not iterable
```
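The failing pattern in the last frame is iterating over targets when it is None (a webcam frame has no ground-truth annotations). A tiny standalone illustration of that failure mode:

```python
# Standalone illustration of the failing pattern: iterating over ground-truth
# targets that are None because the input frame has no annotations.
targets = None
try:
    proposals = [t for t in targets]  # same kind of iteration as box_head.py line 53
except TypeError as err:
    print(err)                        # 'NoneType' object is not iterable
```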

Maelic commented 7 months ago

If you are testing in sgdet mode, you have to set the arguments MODEL.ROI_RELATION_HEAD.USE_GT_BOX and USE_GT_OBJECT_LABEL to False in your .yaml config file, or it won't work.
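For reference, in YAML those two overrides look roughly like this (a minimal sketch using only the key names from the cfg dump above; the rest of the config file is omitted):

```yaml
# Minimal sketch: only the two sgdet-related keys discussed here are shown.
MODEL:
  ROI_RELATION_HEAD:
    USE_GT_BOX: False
    USE_GT_OBJECT_LABEL: False
```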

You don't need the zeroshot_file or informative_file; those are for another research project of mine, so don't bother with them.

Maelic commented 7 months ago

I just pushed a small fix for this issue and a few others in the code. You can pull and reinstall with `pip install .` to run the latest version.
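For reference, updating looks roughly like this (a sketch assuming the code lives in the directory used above and was cloned with git; a zip download would need to be re-downloaded instead of pulled):

```bash
# Fetch the latest commit and reinstall the package into the active environment
cd /media/how/data2/CODES/SGG-Benchmark-main
git pull
pip install .
```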

liw1st commented 7 months ago

I'm testing in sgdet mode, and the arguments MODEL.ROI_RELATION_HEAD.USE_GT_BOX and USE_GT_OBJECT_LABEL are already set to False. But the result is still wrong (just like the visualizations above). I will try the latest version.

NilBiescas commented 6 months ago

@liw1st Did you manage to get good results?

Young-Loser commented 1 month ago

Hello! From your replies I guess you might be Chinese. I am a first-year master's student at Shandong University. I have some questions about reproducing this project that I would like to ask you; I am a beginner in SGG and would like to exchange ideas with you. My email is 1339241893@qq.com and my WeChat is XC-992997. I look forward to your reply and hope we can learn from each other!