lovelyqian / CDFSOD-benchmark

A benchmark for cross-domain few-shot object detection (ECCV24 paper: Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector)
Apache License 2.0

Fine-tune DeViT using voc_2007_trainval_all3_10shot. #30

Open liniceyo opened 1 week ago

liniceyo commented 1 week ago

Hello, thank you very much for your outstanding work. Your discussion about fine-tuning DeViT and achieving significant performance improvements is truly ingenious. I attempted to fine-tune DeViT myself, but encountered the following issue:

File "/media/A/code/lyl/devit-main/tools/../detectron2/modeling/metaarch/devit.py", line 1182, in forward
    = torch.gather(feats, 2, indexes[cmask].view(bs, spatial_size, num_classes - 1)) # N x spatial x classes-1
RuntimeError: shape '[1024, 49, 5]' is invalid for input of size 251713

My train_class_weight is as follows:

tensor([[ 0.0237, -0.0245, -0.0242, ..., -0.0186, -0.0096, 0.0192],
        [-0.0787, -0.0277, 0.0089, ..., -0.0130, -0.0020, 0.0158],
        [ 0.0051, 0.0063, -0.0180, ..., -0.0065, -0.0335, 0.0469],
        [ 0.0195, 0.0194, -0.0173, ..., 0.0026, -0.0435, 0.0465],
        [ 0.0136, -0.0001, -0.0225, ..., 0.0150, -0.0459, 0.0157],
        [ 0.0495, -0.0077, -0.0189, ..., 0.0401, -0.0211, 0.0141]], device='cuda:1')

The dataset has 6 base classes and 3 novel classes. Do you have any suggestions for resolving this issue? I greatly appreciate your response!
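The numbers in the error message can be checked by hand. The sketch below is plain arithmetic on the values quoted in this thread (the shape `[1024, 49, 5]` and the two reported input sizes); the helper name `view_mismatch` is made up for illustration and is not part of DeViT or PyTorch. It shows only how far the element count is from what `.view()` needs, not why.

```python
from math import prod

def view_mismatch(target_shape, input_size):
    """Return (elements the target shape needs, surplus elements in the input)."""
    expected = prod(target_shape)
    return expected, input_size - expected

spatial_size = 49  # middle dimension of the reported shape [1024, 49, 5]

# Input sizes quoted in the two error messages in this thread.
for reported in (251713, 251468):
    expected, surplus = view_mismatch((1024, spatial_size, 5), reported)
    # If the surplus is a multiple of spatial_size, it corresponds to a whole
    # number of extra (row, class) entries selected before the view.
    extra_entries, remainder = divmod(surplus, spatial_size)
    print(reported, expected, surplus, extra_entries, remainder)
```

In both cases the surplus is an exact multiple of 49 (17 and 12 extra entries respectively), which suggests the masked selection returned a few more class entries than `num_classes - 1` per row. That interpretation is a guess from the numbers, not something confirmed by the maintainers.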

AImind commented 1 week ago

Could you provide more details about this error, such as the shapes of the tensors involved?

liniceyo commented 1 week ago

> Could you provide more details about this error, such as the shapes of the tensors involved?

Thank you very much for your response! I have provided the specific error details, as shown in the image.

This class_weight has always been [6, 1024], but shouldn't it be [9, 1024] since my base classes plus the novel classes total 9?

[Image: 微信图片_20241117202844]

The final error message is:

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/home/com/anaconda3/envs/devit/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/media/A/code/lyl/devit-main/tools/../detectron2/engine/launch.py", line 125, in _distributed_worker
    main_func(*args)
  File "/media/A/code/lyl/devit-main/tools/train_net.py", line 200, in main
    return trainer.train()
  File "/media/A/code/lyl/devit-main/tools/../detectron2/engine/defaults.py", line 496, in train
    super().train(self.start_iter, self.max_iter)
  File "/media/A/code/lyl/devit-main/tools/../detectron2/engine/train_loop.py", line 149, in train
    self.run_step()
  File "/media/A/code/lyl/devit-main/tools/../detectron2/engine/defaults.py", line 506, in run_step
    self._trainer.run_step()
  File "/media/A/code/lyl/devit-main/tools/../detectron2/engine/train_loop.py", line 273, in run_step
    loss_dict = self.model(data)
  File "/home/com/anaconda3/envs/devit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/com/anaconda3/envs/devit/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 1040, in forward
    output = self._run_ddp_forward(*inputs, **kwargs)
  File "/home/com/anaconda3/envs/devit/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 1000, in _run_ddp_forward
    return module_to_run(*inputs[0], **kwargs[0])
  File "/home/com/anaconda3/envs/devit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/media/A/code/lyl/devit-main/tools/../detectron2/modeling/metaarch/devit.py", line 1182, in forward
    = torch.gather(feats, 2, indexes[cmask].view(bs, spatial_size, num_classes - 1)) # N x spatial x classes-1
RuntimeError: shape '[1024, 49, 5]' is invalid for input of size 251468

Thank you once again for your excellent work and your response.
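One way a `[6, 1024]` class_weight could produce this kind of error is sketched below. The definition of `cmask` in devit.py is not shown in this thread, so the sketch assumes (hypothetically) that the mask keeps every class index except the ROI's own label. Under that assumption, a label outside the 6 known classes removes nothing, so that row keeps 6 entries instead of 5 and the later `.view(..., num_classes - 1)` no longer fits. The function name and the label values are invented for illustration.

```python
def kept_per_row(labels, num_classes):
    """Count class indexes kept per row by a hypothetical 'class != label' mask."""
    counts = []
    for label in labels:
        mask = [c != label for c in range(num_classes)]
        counts.append(sum(mask))
    return counts

num_classes = 6  # number of rows in the reported class_weight

# Hypothetical ROI labels: 7 and 8 fall outside range(6), as labels of
# classes missing from class_weight would.
labels = [0, 3, 5, 7, 8]

counts = kept_per_row(labels, num_classes)
print(counts)  # [5, 5, 5, 6, 6]

# Entries actually selected vs. entries the view expects.
print(sum(counts), len(labels) * (num_classes - 1))  # 27 25
```

If this assumption holds, checking that the class prototypes cover every label present in the fine-tuning annotations would be a reasonable first step.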

AImind commented 1 week ago

Since our model is pre-trained on COCO, we assume that the base classes correspond to COCO's classes, while the novel classes include all the classes from the new dataset. In the N-way k-shot setting, we fine-tune the model using k-shot instances per class and evaluate it on the test set of the new dataset.
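To illustrate what "k-shot instances" means in this setup, here is a minimal sketch that draws up to k annotated instances for every class of a target dataset, treating all of its classes as novel. It is not the benchmark's actual data pipeline; the function name, the `(image_id, class_name)` annotation format, and the toy data are assumptions made for the example.

```python
import random
from collections import defaultdict

def sample_k_shot(annotations, k, seed=0):
    """Pick up to k instances per class from (image_id, class_name) pairs."""
    by_class = defaultdict(list)
    for image_id, class_name in annotations:
        by_class[class_name].append(image_id)
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return {
        name: sorted(rng.sample(ids, min(k, len(ids))))
        for name, ids in sorted(by_class.items())
    }

# Toy annotations (hypothetical): every class in the new dataset is novel.
toy = [(i, name) for i, name in enumerate(["cat", "dog", "bird"] * 4)]
subset = sample_k_shot(toy, k=2)
for name, ids in subset.items():
    print(name, ids)
```

With a 9-class dataset under this convention, the support set would cover all 9 classes, so the class prototypes would be expected to have 9 rows rather than 6.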