czczup / ViT-Adapter

[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
https://arxiv.org/abs/2205.08534
Apache License 2.0

Why does BEiT_Adapter only have a large size? #133

Open cljhwt opened 1 year ago

cljhwt commented 1 year ago

Hi! The research in this paper is excellent work.

By the way, I would like to ask why there are only large-scale BEiT_Adapter config files, without base-scale results. I manually wrote a config file for BEiT_Adapter_base on ADE20K, but its mIoU is only 53.54, while the original BEiT_mask2former_base reaches 54.58. Why are the base-size results not shown in your paper? Is the performance of the Adapter with BEiT at base size not ideal?

czczup commented 1 year ago

Apologies for the delay. I haven't tested BEiT_Adapter_base on ADE20K before, but I've just set up the experiment. I'll inform you as soon as the results are available.

Thank you for your patience.

czczup commented 1 year ago

This is my result of BEiT-Adapter-Base + Mask2Former. Note that the dimension of Mask2Former used in this experiment is 256. Could you provide your config of BEiT_mask2former_base for further alignment?

2023-09-30 11:56:04,746 - mmseg - INFO - Iter(val) [250]        aAcc: 0.8501, mIoU: 0.5465, mAcc: 0.6819, IoU.wall: 0.7994, IoU.building: 0.8281, IoU.sky: 0.9472, IoU.floor: 0.8395, IoU.tree: 0.7659, IoU.ceiling: 0.8623, IoU.road: 0.8715, IoU.bed : 0.9187, IoU.windowpane: 0.6375, IoU.grass: 0.6844, IoU.cabinet: 0.6505, IoU.sidewalk: 0.6945, IoU.person: 0.8505, IoU.earth: 0.4393, IoU.door: 0.5317, IoU.table: 0.6926, IoU.mountain: 0.6409, IoU.plant: 0.5695, IoU.curtain: 0.7611, IoU.chair: 0.6512, IoU.car: 0.8682, IoU.water: 0.5808, IoU.painting: 0.7907, IoU.sofa: 0.7652, IoU.shelf: 0.4546, IoU.house: 0.4702, IoU.sea: 0.6439, IoU.mirror: 0.7254, IoU.rug: 0.6517, IoU.field: 0.3357, IoU.armchair: 0.5011, IoU.seat: 0.6643, IoU.fence: 0.4827, IoU.desk: 0.5617, IoU.rock: 0.5415, IoU.wardrobe: 0.5447, IoU.lamp: 0.7120, IoU.bathtub: 0.8853, IoU.railing: 0.4040, IoU.cushion: 0.6541, IoU.base: 0.3901, IoU.box: 0.3002, IoU.column: 0.5051, IoU.signboard: 0.4119, IoU.chest of drawers: 0.4841, IoU.counter: 0.3723, IoU.sand: 0.5475, IoU.sink: 0.7777, IoU.skyscraper: 0.4806, IoU.fireplace: 0.7571, IoU.refrigerator: 0.7965, IoU.grandstand: 0.5039, IoU.path: 0.3421, IoU.stairs: 0.3439, IoU.runway: 0.7621, IoU.case: 0.6224, IoU.pool table: 0.9437, IoU.pillow: 0.6268, IoU.screen door: 0.6726, IoU.stairway: 0.3060, IoU.river: 0.1128, IoU.bridge: 0.7292, IoU.bookcase: 0.3616, IoU.blind: 0.3939, IoU.coffee table: 0.6777, IoU.toilet: 0.8649, IoU.flower: 0.5162, IoU.book: 0.5185, IoU.hill: 0.1149, IoU.bench: 0.4603, IoU.countertop: 0.6444, IoU.stove: 0.7822, IoU.palm: 0.5601, IoU.kitchen island: 0.5189, IoU.computer: 0.6937, IoU.swivel chair: 0.4552, IoU.boat: 0.3621, IoU.bar: 0.5614, IoU.arcade machine: 0.5609, IoU.hovel: 0.4454, IoU.bus: 0.9196, IoU.towel: 0.7695, IoU.light: 0.6364, IoU.truck: 0.3794, IoU.tower: 0.2620, IoU.chandelier: 0.7236, IoU.awning: 0.3449, IoU.streetlight: 0.3721, IoU.booth: 0.6130, IoU.television receiver: 0.7139, IoU.airplane: 0.7217, IoU.dirt track: 0.2041, IoU.apparel: 0.3526, IoU.pole: 0.3204, IoU.land: 0.0352, IoU.bannister: 0.2305, IoU.escalator: 0.5250, IoU.ottoman: 0.4508, IoU.bottle: 0.4364, IoU.buffet: 0.4752, IoU.poster: 0.3954, IoU.stage: 0.2534, IoU.van: 0.4732, IoU.ship: 0.0261, IoU.fountain: 0.2786, IoU.conveyer belt: 0.8157, IoU.canopy: 0.4949, IoU.washer: 0.8106, IoU.plaything: 0.3310, IoU.swimming pool: 0.6059, IoU.stool: 0.4643, IoU.barrel: 0.6915, IoU.basket: 0.4306, IoU.waterfall: 0.6832, IoU.tent: 0.9524, IoU.bag: 0.1576, IoU.minibike: 0.7236, IoU.cradle: 0.8883, IoU.oven: 0.6189, IoU.ball: 0.4737, IoU.food: 0.6376, IoU.step: 0.1886, IoU.tank: 0.5566, IoU.trade name: 0.2809, IoU.microwave: 0.8365, IoU.pot: 0.4993, IoU.animal: 0.5931, IoU.bicycle: 0.5864, IoU.lake: 0.5583, IoU.dishwasher: 0.6504, IoU.screen: 0.5844, IoU.blanket: 0.2480, IoU.sculpture: 0.5268, IoU.hood: 0.7523, IoU.sconce: 0.5534, IoU.vase: 0.4784, IoU.traffic light: 0.4014, IoU.tray: 0.2017, IoU.ashcan: 0.4153, IoU.fan: 0.6863, IoU.pier: 0.5308, IoU.crt screen: 0.0162, IoU.plate: 0.5838, IoU.monitor: 0.1086, IoU.bulletin board: 0.3701, IoU.shower: 0.0269, IoU.radiator: 0.6873, IoU.glass: 0.2389, IoU.clock: 0.4590, IoU.flag: 0.5068, Acc.wall: 0.8778, Acc.building: 0.9112, Acc.sky: 0.9744, Acc.floor: 0.9155, Acc.tree: 0.8762, Acc.ceiling: 0.9326, Acc.road: 0.9177, Acc.bed : 0.9644, Acc.windowpane: 0.7864, Acc.grass: 0.8285, Acc.cabinet: 0.7794, Acc.sidewalk: 0.8323, Acc.person: 0.9319, Acc.earth: 0.5723, Acc.door: 0.6914, Acc.table: 0.8214, Acc.mountain: 0.8145, Acc.plant: 0.6980, Acc.curtain: 0.8805, Acc.chair: 0.7807, Acc.car: 0.9346, Acc.water: 0.7036, Acc.painting: 0.9076, Acc.sofa: 0.8962, Acc.shelf: 0.6070, Acc.house: 0.7301, Acc.sea: 0.8402, Acc.mirror: 0.8304, Acc.rug: 0.7756, Acc.field: 0.5505, Acc.armchair: 0.6909, Acc.seat: 0.8507, Acc.fence: 0.6493, Acc.desk: 0.7741, Acc.rock: 0.7382, Acc.wardrobe: 0.7427, Acc.lamp: 0.8249, Acc.bathtub: 0.9267, Acc.railing: 0.5550, Acc.cushion: 0.7764, Acc.base: 0.6289, Acc.box: 0.4274, Acc.column: 0.6184, Acc.signboard: 0.5792, Acc.chest of drawers: 0.6791, Acc.counter: 0.4979, Acc.sand: 0.7746, Acc.sink: 0.8375, Acc.skyscraper: 0.5909, Acc.fireplace: 0.9410, Acc.refrigerator: 0.8982, Acc.grandstand: 0.7866, Acc.path: 0.5065, Acc.stairs: 0.4440, Acc.runway: 0.9675, Acc.case: 0.8061, Acc.pool table: 0.9735, Acc.pillow: 0.7687, Acc.screen door: 0.7145, Acc.stairway: 0.4642, Acc.river: 0.2700, Acc.bridge: 0.8449, Acc.bookcase: 0.5272, Acc.blind: 0.4623, Acc.coffee table: 0.8327, Acc.toilet: 0.9121, Acc.flower: 0.6664, Acc.book: 0.7746, Acc.hill: 0.1933, Acc.bench: 0.5640, Acc.countertop: 0.7720, Acc.stove: 0.8853, Acc.palm: 0.8074, Acc.kitchen island: 0.8554, Acc.computer: 0.7741, Acc.swivel chair: 0.6416, Acc.boat: 0.5236, Acc.bar: 0.6841, Acc.arcade machine: 0.5941, Acc.hovel: 0.4681, Acc.bus: 0.9659, Acc.towel: 0.8593, Acc.light: 0.7791, Acc.truck: 0.5500, Acc.tower: 0.5328, Acc.chandelier: 0.8545, Acc.awning: 0.4605, Acc.streetlight: 0.5291, Acc.booth: 0.7207, Acc.television receiver: 0.8858, Acc.airplane: 0.8224, Acc.dirt track: 0.3484, Acc.apparel: 0.5060, Acc.pole: 0.4599, Acc.land: 0.0676, Acc.bannister: 0.3652, Acc.escalator: 0.7185, Acc.ottoman: 0.6483, Acc.bottle: 0.5931, Acc.buffet: 0.5879, Acc.poster: 0.5286, Acc.stage: 0.3521, Acc.van: 0.6482, Acc.ship: 0.0377, Acc.fountain: 0.2830, Acc.conveyer belt: 0.9226, Acc.canopy: 0.6828, Acc.washer: 0.8371, Acc.plaything: 0.4871, Acc.swimming pool: 0.7623, Acc.stool: 0.7517, Acc.barrel: 0.7413, Acc.basket: 0.6218, Acc.waterfall: 0.9466, Acc.tent: 0.9803, Acc.bag: 0.2122, Acc.minibike: 0.8847, Acc.cradle: 0.9720, Acc.oven: 0.6964, Acc.ball: 0.5264, Acc.food: 0.8282, Acc.step: 0.3076, Acc.tank: 0.6555, Acc.trade name: 0.3414, Acc.microwave: 0.9140, Acc.pot: 0.6053, Acc.animal: 0.6218, Acc.bicycle: 0.8026, Acc.lake: 0.6362, Acc.dishwasher: 0.7833, Acc.screen: 0.9112, Acc.blanket: 0.3279, Acc.sculpture: 0.8812, Acc.hood: 0.7995, Acc.sconce: 0.7021, Acc.vase: 0.7047, Acc.traffic light: 0.6322, Acc.tray: 0.3209, Acc.ashcan: 0.6108, Acc.fan: 0.8137, Acc.pier: 0.8272, Acc.crt screen: 0.0552, Acc.plate: 0.7243, Acc.monitor: 0.1317, Acc.bulletin board: 0.6315, Acc.shower: 0.2370, Acc.radiator: 0.8311, Acc.glass: 0.2797, Acc.clock: 0.6285, Acc.flag: 0.5645

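
For anyone comparing runs, an evaluation line like this can be turned into a dictionary of metrics. The following is only a minimal sketch: `parse_mmseg_val_line` is a hypothetical helper (not part of mmseg), and the sample string is a shortened copy of the log line, so it assumes the whole line has been joined into one string first.

```python
def parse_mmseg_val_line(line):
    """Map metric names (e.g. 'mIoU', 'IoU.wall') to floats.

    Hypothetical helper, not an mmseg API. It assumes the metrics follow
    the last ']' on the line and are separated by ', ' as 'name: value'.
    """
    # Keep only the text after the "[250]" iteration marker, if there is one.
    _, _, body = line.rpartition("]")
    metrics = {}
    for item in body.split(","):
        name, sep, value = item.rpartition(":")
        if not sep:
            continue  # no "name: value" pair in this fragment
        try:
            metrics[name.strip()] = float(value)
        except ValueError:
            continue  # skip anything that is not a number
    return metrics


# Shortened copy of the validation line, for illustration only.
sample_line = (
    "2023-09-30 11:56:04,746 - mmseg - INFO - Iter(val) [250]        "
    "aAcc: 0.8501, mIoU: 0.5465, mAcc: 0.6819, IoU.wall: 0.7994, "
    "IoU.bed : 0.9187, Acc.wall: 0.8778"
)
metrics = parse_mmseg_val_line(sample_line)
print(f"mIoU = {metrics['mIoU'] * 100:.2f}")  # prints "mIoU = 54.65"
```

On the percentage scale used earlier in the thread, this run's mIoU is 54.65, against the 53.54 and 54.58 reported in the original question.
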
cljhwt commented 1 year ago

Thanks for your reply! I uploaded my training results to https://github.com/cljhwt/Adapter/tree/4e36401b75c3a97bfe3016e978d6ffc4af02fc06, which contains the BEiT_mask2former_base configs and logs for two mmseg versions, v1.0.0 and the older 0.20.2. Due to limited experimental conditions, I trained these two models with batch_size=1 on two RTX 3090 GPUs. Could you provide the batch size and training conditions of your previous base result?
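
Batch size matters for this comparison because the two runs may see very different numbers of images per optimizer step. Below is a rough sketch of the arithmetic, assuming batch_size=1 means one sample per GPU. The reference setup (2 samples per GPU on 8 GPUs) is a hypothetical placeholder, since the maintainer's actual settings are not stated in this thread.

```python
def effective_batch_size(samples_per_gpu, num_gpus, accumulation_steps=1):
    """Images contributing to one optimizer step."""
    return samples_per_gpu * num_gpus * accumulation_steps


# Setup described in the comment above (assuming batch_size=1 is per GPU).
mine = effective_batch_size(samples_per_gpu=1, num_gpus=2)

# Hypothetical reference setup, used only to illustrate the comparison.
reference = effective_batch_size(samples_per_gpu=2, num_gpus=8)

# Gradient-accumulation steps that would be needed to match the reference.
steps_to_match = reference // mine
print(mine, reference, steps_to_match)  # prints "2 16 8"
```

A gap of this size in effective batch size could plausibly account for a difference of about one mIoU point, although only a matched rerun would confirm that.
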

cljhwt commented 11 months ago

Thanks for your reply! Could you please provide your BEiT-Adapter-Base + Mask2Former config file so that I can align my setup further? Thanks.
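
Once both config files are available, one way to align them is to diff them key by key. This sketch assumes each config has already been loaded into a plain nested dict. `flatten` and `diff_configs` are hypothetical helpers, and the keys and values in the example are illustrative, not taken from either real config.

```python
def flatten(cfg, prefix=""):
    """Flatten nested dicts into {'a.b.c': value} form."""
    flat = {}
    for key, value in cfg.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_configs(cfg_a, cfg_b):
    """Return {path: (value_in_a, value_in_b)} for every key that differs."""
    flat_a, flat_b = flatten(cfg_a), flatten(cfg_b)
    missing = "<missing>"
    return {
        path: (flat_a.get(path, missing), flat_b.get(path, missing))
        for path in sorted(set(flat_a) | set(flat_b))
        if flat_a.get(path, missing) != flat_b.get(path, missing)
    }


# Illustrative values only; the real configs will have many more fields.
my_cfg = {"data": {"samples_per_gpu": 1}, "model": {"decode_head": {"feat_channels": 256}}}
ref_cfg = {"data": {"samples_per_gpu": 2}, "model": {"decode_head": {"feat_channels": 256}}}

for path, (mine, theirs) in diff_configs(my_cfg, ref_cfg).items():
    print(f"{path}: {mine} -> {theirs}")
```

With these example dicts, only `data.samples_per_gpu` is reported, so identical fields stay out of the way.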