Open XieKaiwen opened 6 months ago
Hi @XieKaiwen, have you fixed the bug about "albumentations"?
@wondervictor Previously I pip installed it and it worked; the bug disappeared. However, I then moved the folder and tried to replicate the virtual environment, this time with albumentations installed (as can be seen in the pip list).
The files were still the same as in the previous issue I posted. However, the error came back, which is quite confusing.
@wondervictor Actually, never mind. I totally forgot that we need a specific version of albumentations for this.
But afterwards I ran into another problem:
[rank0]: File "/home/jupyter/til-24-base/vlm/YOLO-World/YOLOvenv/lib/python3.10/site-packages/mmdet/models/detectors/base.py", line 92, in forward
[rank0]: return self.loss(inputs, data_samples)
[rank0]: File "/home/jupyter/til-24-base/vlm/YOLO-World/yolo_world/models/detectors/yolo_world.py", line 30, in loss
[rank0]: img_feats, txt_feats = self.extract_feat(batch_inputs,
[rank0]: File "/home/jupyter/til-24-base/vlm/YOLO-World/yolo_world/models/detectors/yolo_world.py", line 100, in extract_feat
[rank0]: img_feats = self.neck(img_feats, txt_feats)
[rank0]: File "/home/jupyter/til-24-base/vlm/YOLO-World/YOLOvenv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: File "/home/jupyter/til-24-base/vlm/YOLO-World/YOLOvenv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: File "/home/jupyter/til-24-base/vlm/YOLO-World/yolo_world/models/necks/yolo_world_pafpn.py", line 213, in forward
[rank0]: top_down_layer_inputs = torch.cat([upsample_feat, feat_low], 1)
[rank0]: RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 56 but got size 55 for tensor number 1 in the list.
I don't know whether this is caused by a problem in my dataset or by something else.
Hi @XieKaiwen, what is your input shape?
@wondervictor The images in my images folder all have width 1520 and height 870.
Also, unrelated: for the MixedGroundingDataset, the bounding boxes are in xyxy format, right?
The input shape should be a multiple of 32 in each dimension, so you should pad the images to (1536, 896).
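To see where the 56 vs 55 in the error comes from and how the padded shape is derived, here is a minimal sketch. It assumes each backbone stage halves the resolution and rounds up (as a stride-2, kernel-3, padding-1 convolution does); the function names `padded_shape` and `stage_sizes` are hypothetical helpers, not part of YOLO-World or mmdet.

```python
import math


def padded_shape(width, height, divisor=32):
    """Round width and height up to the nearest multiple of `divisor`."""
    new_w = math.ceil(width / divisor) * divisor
    new_h = math.ceil(height / divisor) * divisor
    return new_w, new_h


def stage_sizes(size, num_stages=5):
    """Sizes after repeated stride-2 downsampling, assuming each stage rounds up."""
    sizes = []
    for _ in range(num_stages):
        size = math.ceil(size / 2)
        sizes.append(size)
    return sizes


# Height 870 is not a multiple of 32, so the feature maps do not line up:
# the stride-32 map (28) upsampled by 2 gives 56, but the stride-16 map is 55.
print(stage_sizes(870))         # [435, 218, 109, 55, 28]
print(padded_shape(1520, 870))  # (1536, 896)
print(stage_sizes(896))         # [448, 224, 112, 56, 28]
```

With the padded height of 896, the stride-16 map is 56 and matches the upsampled stride-32 map, so the concatenation in the neck would no longer fail on this dimension.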
@wondervictor Just to confirm: for the MixedGrounding format, the bboxes should be in xyxy format, like in pascal_voc, right?
It should be xywh, and xy is the left-top corner.
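To make the two box conventions in this thread concrete, here is a minimal sketch of converting between xywh (top-left corner plus width and height, as in COCO) and xyxy (top-left and bottom-right corners, as in pascal_voc). The helper names are hypothetical and are not functions from YOLO-World or albumentations.

```python
def xywh_to_xyxy(box):
    """Convert [x, y, w, h] (x, y = top-left corner) to [x1, y1, x2, y2]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def xyxy_to_xywh(box):
    """Convert [x1, y1, x2, y2] back to [x, y, w, h]."""
    x1, y1, x2, y2 = box
    return [x1, y1, x2 - x1, y2 - y1]


# A box whose top-left corner is (10, 20), 30 wide and 40 high.
print(xywh_to_xyxy([10, 20, 30, 40]))  # [10, 20, 40, 60]
print(xyxy_to_xywh([10, 20, 40, 60]))  # [10, 20, 30, 40]
```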
@wondervictor So for training, the bboxes given to the model should be xywh (COCO format), but when the model predicts on data, it will output xyxy?
@wondervictor Sorry, but I have to confirm this, because in the config files I see a lot of "xyxy" and "pascal_voc" being used as the bbox format, for example for albumentations and the box loss, yet the bbox format in the annotations for training with MixedGroundingDataset is COCO format?
https://github.com/AILab-CVC/YOLO-World/issues/343#issue-2311998061 - Link to the previous issue I posted about this problem.
The solution recommended to me was to pip install albumentations. However, I installed albumentations and the issue appeared again after I recreated my virtual environment, ran
pip install -e .
and installed the requirements in the basic requirements file. Here is the updated pip list: