
YOWOv3: An Efficient and Generalized Framework for Human Action Detection and Recognition

Temporary Instructions

If you encounter any difficulties or have any questions, feel free to ask in the Issues section. I'm happy to answer everyone's questions. Thank you sincerely :3.

Currently, the YOWOv3 model and paper are still being finalized, and experiments are ongoing, so we cannot provide complete official instructions yet. We are providing a temporary guide below, and the comprehensive official instructions will be completed in the near future :3.

Note: The config folder contains two files, ucf_config.yaml and ava_config.yaml, which are the configuration files for the corresponding datasets. Information from these config files is used to build the model and to specify hyperparameters and related details. To choose which file is used, go to utils/build_config.py and change the default path in the build_config function (see the code below) to the desired file. This could be handled more easily with an argument parser (a sketch is given after the code below), but as mentioned, the code is currently used for research, and everything is set up for convenience during experimentation.

import yaml

def build_config(config_file='config/ucf_config.yaml'):
    # Read the YAML config file and return its contents as a dict.
    with open(config_file, "r") as file:
        config = yaml.load(file, Loader=yaml.SafeLoader)

    # Placeholder for optional config validation.
    if config['active_checker']:
        pass

    return config
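
As mentioned above, the config file could also be selected from the command line instead of editing the default path. The following is a minimal sketch of that idea, assuming the same config layout; the --config flag and the __main__ entry point are illustrative and not part of the repository.

import argparse
import yaml

def build_config(config_file):
    # Same loading logic as above: read the YAML file into a dict.
    with open(config_file, "r") as file:
        return yaml.load(file, Loader=yaml.SafeLoader)

if __name__ == "__main__":
    # Hypothetical entry point: pick ucf_config.yaml or ava_config.yaml
    # via --config instead of editing utils/build_config.py.
    parser = argparse.ArgumentParser(description="Select a YOWOv3 config file")
    parser.add_argument("--config", default="config/ucf_config.yaml",
                        help="path to config/ucf_config.yaml or config/ava_config.yaml")
    args = parser.parse_args()

    config = build_config(args.config)
    print("Loaded config with", len(config), "top-level keys")

With such a script, passing --config config/ava_config.yaml would load the AVA configuration without touching the source.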

Prepare UCF101-24 dataset

Currently, the displayed bounding boxes and labels are not very visually appealing. I will improve them in the near future.
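
If you want cleaner visualizations in the meantime, below is a minimal OpenCV sketch for drawing a bounding box with a filled label background; the draw_labeled_box helper and its arguments are assumptions for illustration, not code from this repository.

import cv2

def draw_labeled_box(frame, box, label, color=(0, 200, 0)):
    # Illustrative helper (not from this repo): draw a bounding box and put
    # the label on a filled background so the text stays readable.
    x1, y1, x2, y2 = map(int, box)
    cv2.rectangle(frame, (x1, y1), (x2, y2), color, 2)

    (tw, th), baseline = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)
    cv2.rectangle(frame, (x1, y1 - th - baseline), (x1 + tw, y1), color, -1)
    cv2.putText(frame, label, (x1, y1 - baseline),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0), 1)
    return frame

# Example usage: frame = draw_labeled_box(frame, (50, 60, 200, 220), "Basketball 0.87")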

Experimental results:

Note: The Medium-1, Medium-2, and Large models correspond to the 3D backbone models shufflenetv2, i3d, and resnext101, respectively.
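
For quick reference, that naming can be summarized as a small lookup table; the dict below only restates the note above and is not an object used in the codebase.

# Illustrative mapping of YOWOv3 model sizes to their 3D backbones
# (summarizes the note above; not taken from the codebase).
MODEL_TO_3D_BACKBONE = {
    "Medium-1": "shufflenetv2",
    "Medium-2": "i3d",
    "Large": "resnext101",
}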

UCF101-24

[image: UCF101-24 results]

AVAv2.2

[image: AVAv2.2 results]

Dataset

Pretrained Model

Instructions

Some notes:

References

I would like to express my sincere gratitude to the following amazing repositories, which were the primary sources I relied on and borrowed code from while developing this project: