ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Hard coded 'val' key in val.py #4635

Closed robin-maillot closed 3 years ago

robin-maillot commented 3 years ago

šŸ› Bug

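When running `val.py:run()`, the `is_coco` variable is computed from the hard-coded `val` key rather than from the key selected by `--task`.

This means that running `python val.py --data data/coco128.yaml --task test` fails with a `KeyError` when the dataset YAML has no `val` key, even though only the `test` split is requested.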

## To Reproduce (REQUIRED)

Input:

    # Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
    path: ../datasets/coco128  # dataset root dir
    train: images/train2017  # train images (relative to 'path') 128 images
    val: images/train2017  # val images (relative to 'path') 128 images
    test: images/train2017  # test images (optional)

    # Classes
    nc: 80  # number of classes
    names: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']  # class names

    # Download script/URL (optional)
    download: https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip


- Run: `python val.py --data data/coco128.yaml --task test`

Output:

    Traceback (most recent call last):
      File "val.py", line 354, in <module>
        main(opt)
      File "val.py", line 329, in main
        run(**vars(opt))
      File "D:\Nanovare\dev.yolov5-venv\lib\site-packages\torch\autograd\grad_mode.py", line 28, in decorate_context
        return func(*args, **kwargs)
      File "val.py", line 137, in run
        is_coco = type(data['val']) is str and data['val'].endswith('coco/val2017.txt')  # COCO dataset
    KeyError: 'val'



## Expected behavior

`is_coco` should be calculated based on `task` key and not hard coded `val` key

## Environment

- OS: Windows10
- GPU: NVIDIA GeForce RTX 2060

glenn-jocher commented 3 years ago

@robin-maillot good news 😃! Your original issue may now be fixed ✅ in PR #4642. This PR uses the safer `.get()` method to retrieve the `val` key, and will return `None` by default if the `val` key is missing, setting `is_coco=False`.

To receive this update:

Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!
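
A minimal sketch of the difference between the two lookups, using a plain dict in place of the parsed dataset YAML. The `data` dict and the helper names `is_coco_strict` and `is_coco_safe` are illustrative assumptions, not code copied from `val.py` or PR #4642.

```python
def is_coco_strict(data):
    # Original form from the traceback: indexing raises KeyError when 'val' is absent
    return type(data['val']) is str and data['val'].endswith('coco/val2017.txt')


def is_coco_safe(data):
    # .get() returns None for a missing key, so the check evaluates to False
    val = data.get('val')
    return isinstance(val, str) and val.endswith('coco/val2017.txt')


# Hypothetical dataset dict that defines only a test split
data = {'path': '../datasets/coco128', 'test': 'images/train2017'}

try:
    is_coco_strict(data)
    strict_result = 'no error'
except KeyError as err:
    strict_result = f'KeyError: {err}'

safe_result = is_coco_safe(data)
print(strict_result)
print(safe_result)
```

With the `.get()` form, a dataset that has no `val` key is simply treated as "not COCO" instead of aborting the run.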

robin-maillot commented 3 years ago

@glenn-jocher seems good, but when we are not training, shouldn't it be the `task` key instead of `val`?

Because later on we use `data[task]` to create the dataloader:

    # Dataloader
    if not training:
        if device.type != 'cpu':
            model(torch.zeros(1, 3, imgsz, imgsz).to(device).type_as(next(model.parameters())))  # run once
        task = task if task in ('train', 'val', 'test') else 'val'  # path to train/val/test images
        dataloader = create_dataloader(data[task], imgsz, batch_size, gs, single_cls, pad=0.5, rect=True,
                                       prefix=colorstr(f'{task}: '))[0]

I feel like in this case it makes sense to use something like:

    is_coco = isinstance(data.get(task), str) and data[task].endswith('coco/val2017.txt')  # COCO dataset

Since I do not use coco anyways the fix you provided solves my issues, thanks :)

I can submit a PR directly if you prefer to review it that way?
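
One way to read this suggestion is to look up the split selected by `--task` rather than the fixed `val` key. The sketch below assumes a plain dict for the dataset config and a hypothetical helper `split_is_coco`. It keeps the `coco/val2017.txt` suffix from the original line, and it is not the change that was merged upstream.

```python
def split_is_coco(data, task, suffix='coco/val2017.txt'):
    """Check the split named by `task` instead of the fixed 'val' key (hypothetical helper)."""
    split = data.get(task)  # None when the split is not defined
    return isinstance(split, str) and split.endswith(suffix)


# Hypothetical configs, for illustration only
custom = {'train': 'images/train', 'test': 'images/test'}
coco_like = {'val': '../datasets/coco/val2017.txt'}

print(split_is_coco(custom, 'test'))
print(split_is_coco(coco_like, 'val'))
print(split_is_coco(coco_like, 'test'))
```

Note that a check keyed on `task` reports "not COCO" for any split whose path does not end in the given suffix, so the suffix would have to vary with the split for this to be useful.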

glenn-jocher commented 3 years ago

@robin-maillot well the data dict for coco.yaml will always have a val key regardless of task = 'train|val|test', so in this line we are only trying to establish if the dataset is the official COCO dataset. The user should still be able to run python val.py --task test and everything will work correctly I think.

glenn-jocher commented 3 years ago

@robin-maillot also remember sometimes task can also be set to speed or study to reproduce README plots and tables. https://github.com/ultralytics/yolov5/blob/fad57c29cd27c0fcbc0038b7b7312b9b6ef922a8/val.py#L303
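
The fallback in the dataloader snippet quoted earlier in the thread is what lets values such as `speed` or `study` reuse the `val` split. A minimal sketch of that expression in isolation, with `resolve_split` as a hypothetical wrapper name:

```python
def resolve_split(task):
    # Same expression as in the quoted val.py snippet: any value that is not
    # a real split name falls back to 'val'
    return task if task in ('train', 'val', 'test') else 'val'


resolved = {task: resolve_split(task) for task in ('train', 'val', 'test', 'speed', 'study')}
for task, split in resolved.items():
    print(task, '->', split)
```

Because `speed` and `study` resolve to `val`, indexing the data dict directly with the raw `task` value would fail for those modes, which may be part of why the COCO check stays tied to the `val` key.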

robin-maillot commented 3 years ago

@glenn-jocher makes sense. I was thinking of transfer learning cases where one might want to train/validate on some other dataset but test on COCO. This might be an edge case that is better solved using the `speed` or `study` options, as you mention.

Thanks for taking the time to solve the bug :)