Open fanghaook opened 1 year ago
Thanks for your attention.
I tried retraining to reproduce the author's results, but the results were not satisfactory. The YTVIS2019_ResNet50 model only reached 51-52 AP, well short of 55 AP. Let me briefly describe the training process:
When running python train_ctvis.py, an error occurred:
ImportError: /home/.local/lib/python3.10/site-packages/MultiScaleDeformableAttention-1.0-py3.10-linux-x86_64.egg/MultiScaleDeformableAttention.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops4view4callERKNS_6TensorEN3c108ArrayRefIlEE
I guess there are still some incompatibilities in the environment version provided by the author. The author can try creating a new environment to see if it is compatible.
mask2former/modeling/pixel_decoder/ops/build
directory. This command works well for me.
Hi @fanghaook,
When we prepared our submission, we only used SOLVER.IMS_PER_BATCH = 16, a value chosen empirically. We also found locally that small batches lead to extremely unstable results and may even cost 2~3 AP. We are looking for the best config for small batch sizes, or will implement gradient accumulation to simulate the same batch size on a limited number of GPUs.
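For anyone who wants to see what gradient accumulation would look like, here is a minimal sketch in plain PyTorch. It is not the CTVIS training loop: the toy `nn.Linear` model, the MSE loss, and the helper names `full_batch_grads` and `accumulated_grads` are all hypothetical stand-ins. It only checks that four micro-batches of 4 give the same gradient as one batch of 16 when the loss is a per-sample mean.

```python
import torch
from torch import nn

torch.manual_seed(0)


def full_batch_grads(model, loss_fn, x, y):
    """Gradient from a single backward pass over the whole batch."""
    model.zero_grad()
    loss_fn(model(x), y).backward()
    return [p.grad.clone() for p in model.parameters()]


def accumulated_grads(model, loss_fn, x, y, micro_batch_size):
    """Gradient accumulated over several smaller backward passes."""
    model.zero_grad()
    num_steps = x.shape[0] // micro_batch_size
    for xb, yb in zip(x.split(micro_batch_size), y.split(micro_batch_size)):
        # Divide by the number of micro-batches so the summed gradients
        # equal the gradient of the mean loss over the full batch.
        loss = loss_fn(model(xb), yb) / num_steps
        loss.backward()  # .grad is summed across calls until zero_grad()
    return [p.grad.clone() for p in model.parameters()]


# Toy stand-ins (hypothetical): 16 samples, as with IMS_PER_BATCH = 16.
model = nn.Linear(8, 1)
loss_fn = nn.MSELoss()
x = torch.randn(16, 8)
y = torch.randn(16, 1)

ref = full_batch_grads(model, loss_fn, x, y)
acc = accumulated_grads(model, loss_fn, x, y, micro_batch_size=4)
grads_match = all(torch.allclose(a, b, atol=1e-5) for a, b in zip(ref, acc))
```

In a real loop, `optimizer.step()` would be called once after the last micro-batch. Note that this equivalence only holds for layers and losses that treat samples independently; anything that depends on the composition of the batch (batch statistics, losses computed across samples) would still see the smaller batch, so accumulation may not fully recover the 16-batch results.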
CTVIS/mask2former/modeling/matcher.py", line 111, in memory_efficient_forward
cost_class = -out_prob[:, tgt_ids]
IndexError: tensors used as indices must be long, int, byte or bool tensors
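A possible workaround for this kind of error, as a sketch only: cast the index tensor to an integer dtype before indexing. The shapes and values below are made up, and whether `tgt_ids` is float in your run for the reason shown here is an assumption. One common way it happens is an empty label tensor created without an explicit dtype, since `torch.tensor([])` defaults to float32.

```python
import torch

# Made-up stand-ins: 5 queries, 3 classes.
out_prob = torch.rand(5, 3).softmax(-1)

# Float-typed class ids (assumed cause of the error above).
tgt_ids = torch.tensor([0.0, 2.0])

try:
    cost_class = -out_prob[:, tgt_ids]
    raised = False
except IndexError:
    # Float tensors are rejected as indices.
    raised = True

# Casting to long makes the same indexing valid.
cost_class = -out_prob[:, tgt_ids.long()]

# An empty tensor built without a dtype is float32, so it needs the cast too.
empty_ids = torch.tensor([])
empty_cost = -out_prob[:, empty_ids.long()]
```

If this is the cause, the cleaner fix is probably to make sure the labels are created as `torch.int64` where the targets are built, rather than casting inside the matcher.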