dylandru / BaseballCV

A collection of tools and models designed to aid in the use of Computer Vision in baseball
MIT License

Create Florence 2 model using the baseball_rubber_home_glove dataset. #16

Open camarcano opened 1 month ago

dylandru commented 3 hours ago

Florence2 is a Vision Language Model designed for a variety of tasks, including Region Description (basically object detection... captioning for a given part of an image). As an alternative to the YOLO methods currently implemented in the repository, we will add a Florence2 model fine-tuned on one of our available datasets, along with functions that let users more easily fine-tune the model on similar YOLO-format datasets. To read more about the possible uses of Florence2, please consult the following links:

Blog

Paper
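
To make the YOLO-to-Florence2 idea above a bit more concrete, here is a rough sketch of how a YOLO label line could be turned into the kind of text target Florence2 is usually fine-tuned on for detection. This is not code from the repository. It assumes the standard YOLO label layout (`class cx cy w h`, normalized to the image size) and the `label<loc_x1><loc_y1><loc_x2><loc_y2>` location-token format with 1000 bins; the function names and the class list are hypothetical, and rounding to the nearest bin is just one possible quantization choice.

```python
from typing import List

# Hypothetical class list; the real names and order come from the dataset's YAML file.
CLASS_NAMES = ["baseball", "rubber", "homeplate", "glove"]


def yolo_line_to_florence(line: str, class_names: List[str], bins: int = 1000) -> str:
    """Convert one YOLO label line ("cls cx cy w h", normalized to [0, 1])
    into a "label<loc_x1><loc_y1><loc_x2><loc_y2>" string."""
    cls_id, cx, cy, w, h = line.split()
    cx, cy, w, h = map(float, (cx, cy, w, h))
    # Center/size -> corner coordinates, still normalized.
    corners = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
    # Quantize each coordinate to a bin index and clamp to [0, bins - 1].
    # Assumption: nearest-bin rounding; the model's own processor may floor instead.
    tokens = [min(bins - 1, max(0, round(v * bins))) for v in corners]
    return class_names[int(cls_id)] + "".join(f"<loc_{t}>" for t in tokens)


def yolo_label_to_florence(label_text: str, class_names: List[str]) -> str:
    """Convert the full contents of a YOLO .txt label file into one target string."""
    lines = [ln for ln in label_text.splitlines() if ln.strip()]
    return "".join(yolo_line_to_florence(ln, class_names) for ln in lines)
```

Each image would then be paired with a task prompt (for example `<OD>`) as the input and the string returned by `yolo_label_to_florence` as the target text during fine-tuning.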