LiamWellacott / CDT2019-ERL


Computer Vision #51

Open kini5gowda opened 3 years ago

kini5gowda commented 3 years ago

Summary of high-level computer vision tasks

Functional benchmarks:

Breakdown of tasks for computer vision

I've broken the tasks down into 2 main groups: object detection using YOLO (hopefully), and person detection and following. For object detection, working directly with 2D images is a lot easier. But since we are not given images of the objects we need to look for, I'm using public datasets such as PASCAL VOC and MS COCO.

Using PASCAL VOC, YOLO would be able to detect the following: person, animals (bird, cat, cow, dog, horse, sheep), vehicles (aeroplane, bicycle, boat, bus, car, motorbike, train) and common objects (bottle, chair, dining table, potted plant, sofa, tv/monitor).
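To make the class list above easier to work with, here is a minimal sketch of how detections could be filtered by group. The group names, the label strings, and the detection format (a dict with `label` and `confidence` keys) are assumptions for illustration only; the actual label strings depend on the class-names file shipped with whichever YOLO weights we end up using.

```python
from typing import Dict, List, Optional

# The 20 PASCAL VOC classes, grouped as in the list above.
# Label spellings are assumed; check them against the real class-names file.
VOC_GROUPS: Dict[str, List[str]] = {
    "person": ["person"],
    "animal": ["bird", "cat", "cow", "dog", "horse", "sheep"],
    "vehicle": ["aeroplane", "bicycle", "boat", "bus", "car", "motorbike", "train"],
    "common": ["bottle", "chair", "dining table", "potted plant", "sofa", "tv/monitor"],
}


def group_of(label: str) -> Optional[str]:
    """Return the group a label belongs to, or None if it is not a VOC class."""
    for group, labels in VOC_GROUPS.items():
        if label in labels:
            return group
    return None


def filter_by_group(detections: List[dict], wanted: List[str]) -> List[dict]:
    """Keep only detections whose label falls in one of the wanted groups."""
    return [d for d in detections if group_of(d["label"]) in wanted]


if __name__ == "__main__":
    # Hypothetical detector output, just to show the call.
    sample = [
        {"label": "person", "confidence": 0.91},
        {"label": "chair", "confidence": 0.75},
        {"label": "dog", "confidence": 0.60},
    ]
    print(filter_by_group(sample, ["person", "common"]))
```

Keeping the grouping in one dict means switching to the MS COCO label set later would only require swapping the table, not the filtering logic.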

Using MS COCO, YOLO would be able to detect 80 different objects. See this for more details.

We could then use YOLO directly to detect a person for functional benchmarks 2 and 3. Once the person is detected, we follow them.
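As a rough idea of what the detect-then-follow step could look like, here is a sketch under several assumptions: detections arrive as `(label, confidence, box)` tuples with the box as `(x_min, y_min, width, height)` in pixels, the person to follow is the one with the largest box (assumed to be the closest), and following is driven by how far that box sits from the image centre. None of these names or choices come from YOLO itself or from our robot stack; they are placeholders until we decide on the actual approach.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, width, height), pixels
Detection = Tuple[str, float, Box]       # (label, confidence, box)


def pick_person(detections: List[Detection],
                min_confidence: float = 0.5) -> Optional[Detection]:
    """Pick the person to follow: the confident 'person' detection with the largest box."""
    people = [d for d in detections
              if d[0] == "person" and d[1] >= min_confidence]
    if not people:
        return None
    # Largest box area is used as a crude stand-in for "closest person".
    return max(people, key=lambda d: d[2][2] * d[2][3])


def horizontal_offset(box: Box, image_width: float) -> float:
    """Offset of the box centre from the image centre, scaled to [-1, 1].

    Negative means the person is left of centre, positive means right.
    """
    x_min, _, width, _ = box
    box_centre = x_min + width / 2.0
    half_width = image_width / 2.0
    return (box_centre - half_width) / half_width


if __name__ == "__main__":
    # Hypothetical detections for a 640-pixel-wide frame.
    frame = [
        ("person", 0.90, (100.0, 50.0, 40.0, 80.0)),
        ("person", 0.80, (300.0, 40.0, 100.0, 200.0)),
        ("chair", 0.95, (0.0, 0.0, 500.0, 500.0)),
    ]
    target = pick_person(frame)
    if target is not None:
        print(horizontal_offset(target[2], image_width=640.0))
```

The offset could feed a simple turn command (turn right when positive, left when negative), but how that maps to the robot's controller is still open.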