Open liniceyo opened 1 week ago
Could you provide more details about this error, such as the shapes of the tensors involved?
Thank you very much for your response! I have provided the specific error details, as shown in the image.
This class_weight has always been [6, 1024], but shouldn't it be [9, 1024], since my base classes plus the novel classes total 9? The final error message is:

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/home/com/anaconda3/envs/devit/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, args)
  File "/media/A/code/lyl/devit-main/tools/../detectron2/engine/launch.py", line 125, in _distributed_worker
    main_func(args)
  File "/media/A/code/lyl/devit-main/tools/train_net.py", line 200, in main
    return trainer.train()
  File "/media/A/code/lyl/devit-main/tools/../detectron2/engine/defaults.py", line 496, in train
    super().train(self.start_iter, self.max_iter)
  File "/media/A/code/lyl/devit-main/tools/../detectron2/engine/train_loop.py", line 149, in train
    self.run_step()
  File "/media/A/code/lyl/devit-main/tools/../detectron2/engine/defaults.py", line 506, in run_step
    self._trainer.run_step()
  File "/media/A/code/lyl/devit-main/tools/../detectron2/engine/train_loop.py", line 273, in run_step
    loss_dict = self.model(data)
  File "/home/com/anaconda3/envs/devit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, kwargs)
  File "/home/com/anaconda3/envs/devit/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 1040, in forward
    output = self._run_ddp_forward(*inputs, *kwargs)
  File "/home/com/anaconda3/envs/devit/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 1000, in _run_ddp_forward
    return module_to_run(inputs[0], kwargs[0])
  File "/home/com/anaconda3/envs/devit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/media/A/code/lyl/devit-main/tools/../detectron2/modeling/metaarch/devit.py", line 1182, in forward
    = torch.gather(feats, 2, indexes[cmask].view(bs, spatial_size, num_classes - 1)) # N x spatial x classes-1
RuntimeError: shape '[1024, 49, 5]' is invalid for input of size 251468
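The numbers in this error message can be checked by hand. The following is only a minimal sketch under an assumption the thread does not confirm: that `cmask` is meant to drop exactly one class entry per sample and spatial location, so a sample whose label is not among the 6 rows of class_weight would keep all 6 entries instead of 5. The variable names mirror those in the traceback; the values are read off the error message.

```python
# Values taken from the error message above.
bs, spatial_size, num_classes = 1024, 49, 6   # view target is [1024, 49, num_classes - 1]
actual_size = 251468                          # "input of size 251468"

# Number of elements that .view(bs, spatial_size, num_classes - 1) requires.
expected_size = bs * spatial_size * (num_classes - 1)
surplus = actual_size - expected_size

# Assumption: each sample with an out-of-range label keeps one extra entry
# per spatial location, i.e. contributes spatial_size extra elements.
extra_samples, remainder = divmod(surplus, spatial_size)

print(expected_size)             # 250880
print(surplus)                   # 588
print(extra_samples, remainder)  # 12 0
```

The surplus is an exact multiple of 49, which is at least consistent with 12 samples in that batch carrying labels that the 6-row class_weight does not cover. It does not prove that this is the cause.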
Thank you once again for your excellent work and your response.
Since our model is pre-trained on COCO, we assume that the base classes correspond to COCO's classes, while the novel classes include all the classes from the new datasets. In the k-way k-shot setting, we train the model using k-shot instances and evaluate it on the test datasets.
Hello, thank you very much for your outstanding work. Your discussion about fine-tuning DeViT and achieving significant performance improvements is truly ingenious. I attempted to fine-tune DeViT myself, but encountered the following issue:
  File "/media/A/code/lyl/devit-main/tools/../detectron2/modeling/metaarch/devit.py", line 1182, in forward
    = torch.gather(feats, 2, indexes[cmask].view(bs, spatial_size, num_classes - 1)) # N x spatial x classes-1
RuntimeError: shape '[1024, 49, 5]' is invalid for input of size 251713
My train_class_weight is as follows:
tensor([[ 0.0237, -0.0245, -0.0242, ..., -0.0186, -0.0096, 0.0192],
        [-0.0787, -0.0277, 0.0089, ..., -0.0130, -0.0020, 0.0158],
        [ 0.0051, 0.0063, -0.0180, ..., -0.0065, -0.0335, 0.0469],
        [ 0.0195, 0.0194, -0.0173, ..., 0.0026, -0.0435, 0.0465],
        [ 0.0136, -0.0001, -0.0225, ..., 0.0150, -0.0459, 0.0157],
        [ 0.0495, -0.0077, -0.0189, ..., 0.0401, -0.0211, 0.0141]], device='cuda:1')
The dataset has 6 base classes and 3 novel classes. Do you have any suggestions for resolving this issue? I greatly appreciate your response!
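With 6 rows in train_class_weight and 9 classes in total, one quick thing to rule out is whether some training annotations carry class ids that have no matching row. The sketch below is plain Python and not part of DeViT; `out_of_range_labels` and the sample label list are hypothetical, and it assumes class ids are contiguous integers starting at 0.

```python
from collections import Counter

def out_of_range_labels(labels, num_prototypes):
    """Count label ids that have no matching row in the class-weight matrix."""
    return Counter(c for c in labels if not 0 <= c < num_prototypes)

# Hypothetical label ids for a dataset with 6 base + 3 novel classes (ids 0..8).
labels = [0, 3, 5, 6, 8, 2, 7, 6]

# train_class_weight has 6 rows, so only ids 0..5 have a prototype.
bad = out_of_range_labels(labels, num_prototypes=6)
print(dict(bad))  # {6: 2, 8: 1, 7: 1}
```

If a check like this reports any ids on the real training annotations, either the prototype file or the dataset's class mapping would need to be adjusted so the two agree.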