microsoft / VoTT

Visual Object Tagging Tool: An electron app for building end to end Object Detection Models from Images and Videos.
MIT License

Person Tracking #1026

Open mgarbade opened 3 years ago

mgarbade commented 3 years ago

I'd like to label a 10-minute, 30 FPS video with human bounding boxes that are tracked over time. Doing this by hand is obviously a lot of work, and human trackers are already publicly available, so I wanted to use my own human tracker and export JSON annotations that can later be read by VoTT.

Apparently VoTT creates a series of some-hash-asset.json files that contain the bounding box information, for example:

{
    "asset": {
        "id": "4fc0089db2ceb83fd1ab24a0bf042a31",
        "format": "12",
        "state": 2,
        "type": 3,
        "name": "myvideo.mp4#t=7.12",
        "path": "file:/path/to/myvideo.mp4#t=7.12",
        "size": {
            "width": 1920,
            "height": 1080
        },
        "parent": {
            "format": "mp4",
            "id": "fb2ead680aa141ec77b4ab48effbdd32",
            "name": "myvideo.mp4",
            "path": "file:/path/to/myvideo.mp4",
            "size": {
                "width": 1920,
                "height": 1080
            },
            "state": 1,
            "type": 2
        },
        "timestamp": 7.12
    },
    "regions": [
        {
            "id": "MosU9KAH8",
            "type": "RECTANGLE",
            "tags": [
                "Luke"
            ],
            "boundingBox": {
                "height": 1076.0907335907336,
                "width": 536.2662807525326,
                "left": 951.6642547033284,
                "top": 0
            },
            "points": [
                {
                    "x": 951.6642547033284,
                    "y": 0
                },
                {
                    "x": 1487.930535455861,
                    "y": 0
                },
                {
                    "x": 1487.930535455861,
                    "y": 1076.0907335907336
                },
                {
                    "x": 951.6642547033284,
                    "y": 1076.0907335907336
                }
            ]
        },
        {
            "id": "ZTfIUpTZXO",
            "type": "RECTANGLE",
            "tags": [
                "Lea"
            ],
            "boundingBox": {
                "height": 779.7683397683398,
                "width": 320.92619392185236,
                "left": 690.4775687409551,
                "top": 4.430501930501931
            },
            "points": [
                {
                    "x": 690.4775687409551,
                    "y": 4.430501930501931
                },
                {
                    "x": 1011.4037626628075,
                    "y": 4.430501930501931
                },
                {
                    "x": 1011.4037626628075,
                    "y": 784.1988416988418
                },
                {
                    "x": 690.4775687409551,
                    "y": 784.1988416988418
                }
            ]
        }
    ],
    "version": "2.2.0"
}

In addition, myprojectname.vott has a field called 'assets' that lists all of the above asset filenames together with a timestamp.

Is it enough for me to create these two items (the per-frame asset JSON files and the matching 'assets' entries in the .vott file) to import tracking information?

If there is already some mechanism to import tracking information or to track humans and I just couldn't find it, please let me know.
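For illustration, here is a rough sketch of how I would generate one such asset file per frame from my tracker's output. It only mirrors the JSON shown above, and several parts are guesses on my side: that the 32-character ids are an MD5 hash of the asset path, that region ids can be any short random string, that "format" is simply the text after the last dot in the name, and that the "state" and "type" numbers can be copied from the sample. `build_asset` and its parameters are my own hypothetical names, not part of VoTT.

```python
import hashlib
import json
import secrets


def build_asset(video_path, timestamp, width, height, boxes):
    """Build one per-frame asset dict shaped like the sample JSON above.

    boxes: iterable of (tag, left, top, box_width, box_height) in pixels.
    """
    video_name = video_path.rsplit("/", 1)[-1]
    frame_name = f"{video_name}#t={timestamp}"
    frame_path = f"{video_path}#t={timestamp}"

    def asset_id(path):
        # ASSUMPTION: ids are an MD5 hash of the asset path (not confirmed).
        return hashlib.md5(path.encode("utf-8")).hexdigest()

    regions = []
    for tag, left, top, w, h in boxes:
        regions.append({
            # ASSUMPTION: any short unique string works as a region id.
            "id": secrets.token_urlsafe(7),
            "type": "RECTANGLE",
            "tags": [tag],
            "boundingBox": {"height": h, "width": w, "left": left, "top": top},
            # Corners in the same order as the sample: clockwise from top-left.
            "points": [
                {"x": left, "y": top},
                {"x": left + w, "y": top},
                {"x": left + w, "y": top + h},
                {"x": left, "y": top + h},
            ],
        })

    return {
        "asset": {
            "id": asset_id(frame_path),
            # GUESS: the sample has "12" for t=7.12, i.e. text after the last dot.
            "format": frame_name.rsplit(".", 1)[-1],
            "state": 2,  # copied from the sample
            "type": 3,   # copied from the sample
            "name": frame_name,
            "path": frame_path,
            "size": {"width": width, "height": height},
            "parent": {
                "format": video_name.rsplit(".", 1)[-1],
                "id": asset_id(video_path),
                "name": video_name,
                "path": video_path,
                "size": {"width": width, "height": height},
                "state": 1,  # copied from the sample
                "type": 2,   # copied from the sample
            },
            "timestamp": timestamp,
        },
        "regions": regions,
        "version": "2.2.0",
    }


if __name__ == "__main__":
    asset = build_asset(
        "file:/path/to/myvideo.mp4", 7.12, 1920, 1080,
        [("Luke", 951.6642547033284, 0, 536.2662807525326, 1076.0907335907336)],
    )
    # File name follows the some-hash-asset.json pattern described above.
    with open(f"{asset['asset']['id']}-asset.json", "w") as fh:
        json.dump(asset, fh, indent=4)
```

I would run this once per tracked frame and then add the corresponding entries to the 'assets' field of myprojectname.vott. Whether VoTT accepts files produced this way is exactly what I am unsure about.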