convert the format of the caltech pedestrian dataset to the format that yolo uses

This repo is adapted from

dependencies

opencv
numpy
scipy

how to

Convert the .seq video files to .png frames by running $ python generate-images.py. The frames are written to the images folder.
Square images work better, so you can optionally convert the 640x480 frames to 640x640 frames by running $ python squarify-images.py.
Convert the .vbb annotation files to .txt files by running $ python generate-annotation.py. This creates the labels folder, which contains the .txt label files (named after their frames), as well as train.txt and test.txt, which list the paths to the images.
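Adjust the YOLO .data file so that its classes value matches the .cfg file and its train and valid entries point to the generated train.txt and test.txt.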
Adjust the YOLO .cfg file: take e.g. yolo-voc.2.0.cfg and set height = 640, width = 640, classes = 2, and, in the final convolutional layer, filters = 35 (= (classes + 5) * 5, where the last 5 is the number of anchor boxes).
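
The annotation conversion and the filter count from the steps above can be sketched as follows. This is a minimal illustration, not the code in generate-annotation.py: the function names are hypothetical, and it assumes that a Caltech box is given as (left, top, width, height) in pixels and that a YOLO label line holds a class id followed by the box center and size, each divided by the image width or height.

```python
def to_yolo_box(left, top, box_w, box_h, img_w, img_h):
    """Convert a pixel box (left, top, width, height) into YOLO's
    normalized (x_center, y_center, width, height)."""
    x_center = (left + box_w / 2.0) / img_w
    y_center = (top + box_h / 2.0) / img_h
    return x_center, y_center, box_w / float(img_w), box_h / float(img_h)


def yolo_label_line(class_id, box, img_w, img_h):
    """Format one line of a label .txt file (hypothetical helper)."""
    x, y, w, h = to_yolo_box(*box, img_w, img_h)
    return "%d %.6f %.6f %.6f %.6f" % (class_id, x, y, w, h)


def final_layer_filters(classes, num_anchors=5):
    """filters = (classes + 5) * number of anchor boxes."""
    return (classes + 5) * num_anchors


# A 320x240 box centered in a 640x480 frame, class 0
print(yolo_label_line(0, (160, 120, 320, 240), 640, 480))
# Filter count for classes = 2 with 5 anchors
print(final_layer_filters(2))
```

If the frames are squarified by plain stretching, the normalized values stay the same. If they are padded instead, the boxes would have to be shifted before normalizing. Check squarify-images.py to see which applies.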

folder structure

|- caltech
|-- annotations
|-- test06
|--- V000.seq
|--- ...
|-- ...
|-- train00
|-- ...
|- caltech-for-yolo (this repo, cd)
|-- generate-images.py
|-- squarify-images.py
|-- generate-annotation.py
|-- images
|-- labels
|-- test.txt
|-- train.txt
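
Given this layout, a quick sanity check after the conversion is to list the frames in images that have no matching .txt file in labels. The sketch below is not part of the repo: the function name is hypothetical, and it assumes that frames are .png files and that each label file shares its frame's base name.

```python
import os


def frames_without_labels(images_dir, labels_dir, image_ext=".png"):
    """Return the sorted frame file names in images_dir that have no
    .txt file with the same base name in labels_dir."""
    label_stems = {
        os.path.splitext(name)[0]
        for name in os.listdir(labels_dir)
        if name.endswith(".txt")
    }
    missing = []
    for name in sorted(os.listdir(images_dir)):
        stem, ext = os.path.splitext(name)
        # Skip anything that is not a frame image
        if ext.lower() == image_ext and stem not in label_stems:
            missing.append(name)
    return missing
```

Run from inside the repo folder, frames_without_labels("images", "labels") returns the unmatched frames. A missing label is not necessarily an error, since a frame may simply contain no annotated pedestrian.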

simonzachau / caltech-pedestrian-dataset-to-yolo-format-converter
