ld-xy closed 1 year ago
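Hello, try running export PYTHONPATH=<path to the current project> and see whether the script then runs successfully.
You can also try moving train.py out of the tools folder into the repository root and then running the training command, i.e. python -m torch.distributed.launch --nproc_per_node=1 train.py -f configs/damoyolo_tinynasL20_T.py. That should work.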
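Both suggestions above work for the same reason: when a script is launched as python tools/some_script.py, Python puts the script's own directory (tools/) at the front of sys.path, not the repository root, so the top-level damo package is not found. As a rough alternative to exporting PYTHONPATH, the same effect can be sketched inside the script. The helper name add_project_root is hypothetical and not part of DAMO-YOLO; it assumes the script sits exactly one directory below the repository root.

```python
import sys
from pathlib import Path


def add_project_root(script_file, levels_up=1):
    """Put the directory `levels_up` levels above the script's folder on sys.path.

    Hypothetical helper for illustration; assumes a layout such as
    <root>/tools/<script>.py, where <root> contains the `damo` package.
    """
    root = Path(script_file).resolve().parents[levels_up]
    root_str = str(root)
    if root_str not in sys.path:
        # Insert at the front so the local checkout wins over installed copies.
        sys.path.insert(0, root_str)
    return root


# Intended use at the very top of a script under tools/, before importing damo:
#     add_project_root(__file__)
#     from damo.base_models.core.ops import RepConv
```

Setting PYTHONPATH in the shell achieves the same thing without touching the code, which is probably preferable when the scripts should stay identical to upstream.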
OMG how well the errors are reported in the DAMO-YOLO <3
[~/damo-yolo]
✘ DAMO-YOLO devansh typo-damo - python tools/demo.py video -f /home/devansh/damo-yolo/configs/damoyolo_tinynasL25_S.py --engine ./configs/damo_yolo_s_700+.pth --conf 0.6 --infer_size 640 640 --device cuda --path test.mp4
Inference with torch engine!
2024-05-26 21:05:33.044 | ERROR | __main__:<module>:356 - An error has been caught in function '<module>', process 'MainProcess' (36428), thread 'MainThread' (127212373328960):
Traceback (most recent call last):
> File "tools/demo.py", line 356, in <module>
main()
└ <function main at 0x73b26958fa70>
File "tools/demo.py", line 316, in main
output_dir=args.output_dir, ckpt=args.engine, end2end=args.end2end)
│ │ │ │ │ └ False
│ │ │ │ └ Namespace(camid=0, conf=0.6, config_file='/home/devansh/damo-yolo/configs/damoyolo_tinynasL25_S.py', device='cuda', end2end=F...
│ │ │ └ './configs/damo_yolo_s_700+.pth'
│ │ └ Namespace(camid=0, conf=0.6, config_file='/home/devansh/damo-yolo/configs/damoyolo_tinynasL25_S.py', device='cuda', end2end=F...
│ └ './demo'
└ Namespace(camid=0, conf=0.6, config_file='/home/devansh/damo-yolo/configs/damoyolo_tinynasL25_S.py', device='cuda', end2end=F...
File "tools/demo.py", line 54, in __init__
self.model = self._build_engine(self.config, self.engine_type)
│ │ │ │ │ │ └ 'torch'
│ │ │ │ │ └ <__main__.Infer object at 0x73b2dbddffd0>
│ │ │ │ └ ╒═════════╤══════════════════════════════════════════════════════════════════════════════════╕
│ │ │ │ │ keys │ values ...
│ │ │ └ <__main__.Infer object at 0x73b2dbddffd0>
│ │ └ <function Infer._build_engine at 0x73b26958f5f0>
│ └ <__main__.Infer object at 0x73b2dbddffd0>
└ <__main__.Infer object at 0x73b2dbddffd0>
File "tools/demo.py", line 76, in _build_engine
model.load_state_dict(ckpt['model'], strict=True)
│ │ └ {'epoch': 6, 'model': OrderedDict([('backbone.block_list.0.conv.conv.weight', tensor([[[[ 0.0348, -0.0272, -0.0311],
│ │ ...
│ └ <function Module.load_state_dict at 0x73b2d7391290>
└ Detector(
(backbone): TinyNAS(
(block_list): ModuleList(
(0): Focus(
(conv): ConvBNAct(
(conv):...
File "/home/devansh/anaconda3/envs/DAMO-YOLO/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1052, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
│ │ │ └ ['size mismatch for head.gfl_cls.0.weight: copying a param with shape torch.Size([702, 128, 3, 3]) from checkpoint, the shape...
│ │ └ <member '__name__' of 'getset_descriptor' objects>
│ └ <attribute '__class__' of 'object' objects>
└ Detector(
(backbone): TinyNAS(
(block_list): ModuleList(
(0): Focus(
(conv): ConvBNAct(
(conv):...
RuntimeError: Error(s) in loading state_dict for Detector:
size mismatch for head.gfl_cls.0.weight: copying a param with shape torch.Size([702, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([80, 128, 3, 3]).
size mismatch for head.gfl_cls.0.bias: copying a param with shape torch.Size([702]) from checkpoint, the shape in current model is torch.Size([80]).
size mismatch for head.gfl_cls.1.weight: copying a param with shape torch.Size([702, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([80, 256, 3, 3]).
size mismatch for head.gfl_cls.1.bias: copying a param with shape torch.Size([702]) from checkpoint, the shape in current model is torch.Size([80]).
size mismatch for head.gfl_cls.2.weight: copying a param with shape torch.Size([702, 512, 3, 3]) from checkpoint, the shape in current model is torch.Size([80, 512, 3, 3]).
size mismatch for head.gfl_cls.2.bias: copying a param with shape torch.Size([702]) from checkpoint, the shape in current model is torch.Size([80]).
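The traceback above suggests that the checkpoint and the config disagree on the size of the classification head: every head.gfl_cls tensor has a leading dimension of 702 in the checkpoint but 80 in the model built from the config. A quick way to confirm this before calling load_state_dict with strict=True is to compare shapes key by key. The following is a minimal sketch with hypothetical helper names; it only relies on each entry exposing a .shape attribute, so a small stand-in type is used here in place of real tensors.

```python
from collections import namedtuple


def find_shape_mismatches(ckpt_state, model_state):
    """Return {param_name: (checkpoint_shape, model_shape)} for differing shapes."""
    mismatches = {}
    for name, param in ckpt_state.items():
        if name in model_state:
            ckpt_shape = tuple(param.shape)
            model_shape = tuple(model_state[name].shape)
            if ckpt_shape != model_shape:
                mismatches[name] = (ckpt_shape, model_shape)
    return mismatches


# Stand-in for a tensor: anything with a .shape attribute works.
FakeTensor = namedtuple("FakeTensor", "shape")

# Shapes copied from the error message above (first head level only).
ckpt_state = {
    "head.gfl_cls.0.weight": FakeTensor((702, 128, 3, 3)),
    "head.gfl_cls.0.bias": FakeTensor((702,)),
}
model_state = {
    "head.gfl_cls.0.weight": FakeTensor((80, 128, 3, 3)),
    "head.gfl_cls.0.bias": FakeTensor((80,)),
}

for name, (in_ckpt, in_model) in find_shape_mismatches(ckpt_state, model_state).items():
    print(f"{name}: checkpoint {in_ckpt} vs model {in_model}")
```

With real objects, ckpt_state would be ckpt['model'] as loaded in tools/demo.py and model_state would be model.state_dict(). If the mismatch is confined to the head, the likely cause is that the config used for the demo describes a different number of classes than the one the checkpoint was trained with.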
Before Asking
[X] I have read the README carefully.
[ ] I want to train my custom dataset, and I have read the tutorials for finetuning on my own data carefully and organized my dataset correctly.
[X] I have pulled the latest code of the main branch and run it again, and the problem still exists.
Search before asking
Question
Traceback (most recent call last):
File "tools/torch_inference.py", line 13, in <module>
from damo.base_models.core.ops import RepConv
ModuleNotFoundError: No module named 'damo'
Additional
Why does it say this module does not exist? The path is clearly correct. I would appreciate an answer.
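One way to check whether the path really is visible to the interpreter is to ask Python directly where it would find the package. The sketch below uses only the standard library; the function name describe_import is made up for illustration. Run from the same directory and with the same command style as the failing script, it shows whether damo can be resolved and which directory comes first on sys.path.

```python
import importlib.util
import sys


def describe_import(name):
    """Report whether a top-level module can be resolved and from where."""
    spec = importlib.util.find_spec(name)
    return {
        "module": name,
        "importable": spec is not None,
        # origin is the file the module would be loaded from, if any
        "location": getattr(spec, "origin", None),
        "first_sys_path_entry": sys.path[0] if sys.path else None,
    }


# 'damo' is the package named in the error message above.
print(describe_import("damo"))
```

If importable comes back False while the damo folder exists in the repository, the repository root is most likely missing from sys.path, which is consistent with the PYTHONPATH advice given in the replies.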