Open jaredb1011 opened 1 year ago
Note: we had a previous attempt at clip segmentation for the first prototype: https://github.com/waldo-vision/aimbot-detection-prototype/tree/main/clip_creator
I think a good starting point would be to use some kind of text-detection model to detect and read text in the images. Text that we know occurs in the option menus/endscreens/startscreens can be detected, and those frames tagged as not part of the clip. This way we can also modify the banned text through a config file and add known menu text for games like Overwatch etc.
This is a fairly hard problem and we want to prevent scope creep, so we probably shouldn't over-engineer a solution to this atm.
I'd be happy to work on this.
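A rough sketch of the banned-text idea, assuming the OCR step has already returned a list of strings for each frame (the OCR call itself is left out). The config layout, the phrases in it, and all function names here are hypothetical, just for illustration:

```python
import json
import re

# Hypothetical config: known menu / end-screen phrases, per game.
CONFIG_JSON = """
{
  "banned_text": {
    "default": ["settings", "main menu"],
    "overwatch": ["play of the game", "options", "leave game"]
  }
}
"""


def normalize(text):
    """Lowercase and collapse non-alphanumerics so OCR noise matters less."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def load_banned_phrases(config_text, game):
    """Merge the default phrases with the game-specific ones."""
    config = json.loads(config_text)["banned_text"]
    phrases = config.get("default", []) + config.get(game, [])
    return [normalize(p) for p in phrases]


def is_menu_frame(ocr_strings, banned_phrases):
    """Return True if any banned phrase occurs (as whole words) in the frame's OCR text."""
    padded = " " + normalize(" ".join(ocr_strings)) + " "
    return any(f" {phrase} " in padded for phrase in banned_phrases)


phrases = load_banned_phrases(CONFIG_JSON, "overwatch")
print(is_menu_frame(["PLAY OF THE GAME", "Tracer"], phrases))
print(is_menu_frame(["Eliminated Player123", "87/100"], phrases))
```

Adding a new game would then only mean adding another key to the config, not touching the code.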
This library looks suitable for the OCR part: https://github.com/PaddlePaddle/PaddleOCR
One thing we have to consider is that the submitted videos won't all be in English, so we might have to get UI texts in all supported languages. Another idea is to use OCR to get a rough sample dataset, then train an image classifier on this dataset and hope it generalizes to all languages.
I'm happy to work on this as well.
OCR might be helpful because kills are recorded in the feed, and we could match the player's username against the feed to detect kills. We hand-coded a solution last year, but it was resolution-specific.
I think OpenCV template matching can achieve this. Use template matching to distinguish loading screens from gameplay screens.
I feel like object detection with YOLO, for example, would be a more dynamic option for excluding any menus.
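To make the template matching idea concrete, here is a tiny pure-Python sketch of the zero-mean normalized cross-correlation score that this kind of matching relies on (OpenCV's `cv2.matchTemplate` with `cv2.TM_CCOEFF_NORMED` computes the same kind of score, much faster). Frames are plain 2D lists of grayscale values here, and the function names and the 0.9 threshold are assumptions, not tuned values:

```python
import math


def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-sized 2D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p, mean_t = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = math.sqrt(
        sum((a - mean_p) ** 2 for a in p) * sum((b - mean_t) ** 2 for b in t)
    )
    return num / den if den else 0.0  # flat patches carry no signal


def best_match(frame, template):
    """Slide the template over the frame; return (best score, (row, col))."""
    th, tw = len(template), len(template[0])
    best = (-1.0, (0, 0))
    for y in range(len(frame) - th + 1):
        for x in range(len(frame[0]) - tw + 1):
            patch = [row[x:x + tw] for row in frame[y:y + th]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, (y, x))
    return best


def looks_like_loading_screen(frame, template, threshold=0.9):
    """Hypothetical check: does a known loading-screen logo appear in the frame?"""
    score, _ = best_match(frame, template)
    return score >= threshold


logo = [[0, 255], [255, 0]]
frame = [[10] * 5 for _ in range(4)]
frame[1][2], frame[1][3] = 0, 255
frame[2][2], frame[2][3] = 255, 0
print(best_match(frame, logo))
```

As noted above for the hand-coded kill feed solution, fixed templates are resolution-specific, so frames would likely need to be rescaled to a common size first.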
Yeah, like if there are logos or symbols within menus that are not dependent on a player's region/language, I think that'd be the most reliable way to detect it.
Haitch, idk if you have done any work on this, but I would love to coordinate. I'm gonna start trying some stuff.
Description:
Develop code that takes a gameplay video as input, segments it into smaller clips of no longer than 30 seconds, and trims irrelevant sections such as menus, intros, and outros. The code should be designed in a modular fashion to accommodate game-specific features, allowing it to work with various games.
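As a minimal sketch of the segmentation step, assuming some detector has already labelled every frame as gameplay or not: the code below turns those labels into time intervals and splits each interval into clips of at most 30 seconds. The function names and the `min_len` cutoff for dropping very short leftovers are hypothetical:

```python
MAX_CLIP_SECONDS = 30.0


def gameplay_intervals(flags, fps):
    """Turn per-frame booleans (True = gameplay) into (start, end) times in seconds."""
    intervals = []
    start = None
    for i, is_gameplay in enumerate(flags):
        if is_gameplay and start is None:
            start = i
        elif not is_gameplay and start is not None:
            intervals.append((start / fps, i / fps))
            start = None
    if start is not None:  # video ends during gameplay
        intervals.append((start / fps, len(flags) / fps))
    return intervals


def split_interval(start, end, max_len=MAX_CLIP_SECONDS):
    """Cut one interval into consecutive pieces no longer than max_len."""
    clips = []
    t = start
    while t < end:
        clip_end = min(t + max_len, end)
        clips.append((t, clip_end))
        t = clip_end
    return clips


def segment(flags, fps, max_len=MAX_CLIP_SECONDS, min_len=1.0):
    """Return all gameplay clips, dropping pieces shorter than min_len seconds."""
    clips = []
    for start, end in gameplay_intervals(flags, fps):
        for clip in split_interval(start, end, max_len):
            if clip[1] - clip[0] >= min_len:
                clips.append(clip)
    return clips


# One label per second: 5 s menu, 70 s gameplay, 10 s menu, 20 s gameplay.
labels = [False] * 5 + [True] * 70 + [False] * 10 + [True] * 20
print(segment(labels, fps=1))
```

Keeping the per-frame labelling separate from the splitting is what would let game-specific detectors be swapped in without touching this part.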
Requirements:
Acceptance Criteria:
Notes:
Since game-specific features may vary, it is suggested to create a basic solution first and then incrementally add support for different games as needed. Consider using machine learning techniques, such as computer vision or deep learning, to detect and trim irrelevant sections with higher accuracy. For better compatibility, consider using open-source libraries and tools for video processing, such as OpenCV, FFmpeg, or similar.
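For the FFmpeg route, a small sketch of how a (start, end) clip could be turned into an `ffmpeg` command line. It only builds the argument list and does not run anything; the helper name and the output naming scheme are hypothetical. The flags used (`-ss`, `-i`, `-t`, `-c copy`, `-y`) are standard FFmpeg options:

```python
from pathlib import Path


def ffmpeg_cut_command(src, start, end, out_dir, index):
    """Build an ffmpeg command that copies [start, end) seconds of src into a new file."""
    out_path = Path(out_dir) / f"{Path(src).stem}_clip{index:03d}.mp4"
    return [
        "ffmpeg", "-y",              # overwrite existing output
        "-ss", f"{start:.3f}",       # seek to clip start (seconds)
        "-i", str(src),
        "-t", f"{end - start:.3f}",  # clip duration (seconds)
        "-c", "copy",                # stream copy, no re-encode
        str(out_path),
    ]


cmd = ffmpeg_cut_command("match.mp4", 5.0, 35.0, "clips", 0)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Note that with `-c copy` the cut points snap to keyframes, so clip boundaries may be slightly off; re-encoding instead would give exact cuts at the cost of speed.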