Open phanikumarmalladi opened 1 year ago
First, make a directory containing all frames as images, organized however you want. I would suggest something like this:
```
my_dataset/
├── VID_0001/
│   ├── IMG_0001.png
│   ├── IMG_0002.png
│   └── ...
├── VID_0002/
│   └── ...
└── ...
```
I would suggest using a symlink.
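For the symlink, a minimal sketch (the source and destination paths here are placeholders; point `src` at your real frame directory and `dst` at wherever the repo expects its image root):

```python
import os

# Placeholder paths: src is your real frame directory, dst is the image
# root that the repo's data-loading code will look under.
src = os.path.abspath("my_dataset")
dst = os.path.join("data", "vid", "Data", "my_dataset")

os.makedirs(os.path.dirname(dst), exist_ok=True)
if not os.path.lexists(dst):
    os.symlink(src, dst)  # dst now points at the frames without copying them
```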
Then, in the folder `data/vid/annotations/`, make a `.json` file and call it whatever you want. The file should be structured like [MS COCO](https://www.immersivelimit.com/tutorials/create-coco-annotations-from-scratch). For bounding-box detection, this structure worked for me. There are some fields, like `iscrowd`, that aren't used, but I kept them in just in case.
```json
{
    "categories": {
        "type": "array",
        "items": {
            "type": "object",
            "properties": {
                "id": {"type": "integer"},
                "name": {"type": "string"},
                "encoded_name": {"type": "string"}
            }
        }
    },
    "videos": {
        "type": "array",
        "items": {
            "type": "object",
            "properties": {
                "id": {"type": "integer"},
                "name": {"type": "string"},
                "vid_train_frames": {"const": []}
            }
        }
    },
    "images": {
        "type": "array",
        "items": {
            "type": "object",
            "properties": {
                "file_name": {"type": "string"},
                "id": {"type": "integer"},
                "height": {"type": "integer"},
                "width": {"type": "integer"},
                "frame_id": {"type": "integer"},
                "video_id": {"type": "integer"},
                "is_vid_train_frame": {"const": false}
            }
        }
    },
    "annotations": {
        "type": "array",
        "items": {
            "type": "object",
            "properties": {
                "id": {"type": "integer"},
                "image_id": {"type": "integer"},
                "video_id": {"type": "integer"},
                "instance_id": {"type": "integer"},
                "area": {"type": "integer"},
                "iscrowd": {"const": false},
                "occluded": {"const": -1},
                "generated": {"const": -1}
            }
        }
    }
}
```
You will have to ensure unique IDs for the images, videos, and annotations. Additionally, each image's `file_name` should point to the location where you put the frames in the first step.
To make the model actually read the `.json` file for the single-frame baseline, add keys to the `PATHS` dictionary at the bottom of `datasets/vid_single.py`:

```python
PATHS = {
    ...
    "custom_dataset": [(root / "Data" / "DSTG", root / "annotations/zoomed" / 'custom_dataset.json')],
}
```
Add a transformation to `make_coco_transform(image_set)`:

```python
if image_set == 'custom_dataset':
    return T.Compose([
        T.RandomHorizontalFlip(),
        T.RandomResize([600], max_size=1000),
        normalize,
    ])
```
And in `main.py`, you can change `image_set` to `"custom_dataset"` for `dataset_train`:

```python
dataset_train = build_dataset(image_set="custom_dataset", args=args)
```
I would suggest doing this twice, once for your training set and once for your validation set. In the above step, you could equally do it for `dataset_vid`.

Also, if you are working on this project long-term, I would suggest looking into the command-line args so you aren't hardcoding values in `main.py`.
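For example, the hardcoded dataset name could be lifted into a flag along these lines (a sketch; `--dataset_file` is a hypothetical flag name here, so check `main.py`'s existing argument parser for the project's real argument names first):

```python
import argparse

# Hypothetical flag: inspect main.py's existing parser before adding this,
# since the project may already expose a similar option.
parser = argparse.ArgumentParser()
parser.add_argument("--dataset_file", default="vid_single",
                    help="which PATHS key / build_dataset image_set to use")

args = parser.parse_args(["--dataset_file", "custom_dataset"])
# dataset_train = build_dataset(image_set=args.dataset_file, args=args)
```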
I want to know where to put `my_dataset`, thank you!
Hi @itbergl, could you please share the script you used to generate the JSON annotation file?
Please let me know how to train it on a custom dataset and the necessary structure of the dataset.