waldo-vision / models

Repository for model development and training
https://waldo.vision
Mozilla Public License 2.0

Implement VideoMAE2 #5

Open jaredb1011 opened 1 year ago

Description:

Implement the VideoMAE2 model and apply it to a dataset of video game gameplay clips. See the following papers:

Code for VideoMAE2 was recently released

Requirements:

- Test the video masked autoencoder on a small test dataset.
- Evaluate the model's performance using appropriate metrics (e.g., reconstruction error, SSIM).
- Provide proper attribution.
- Provide clear documentation on how to use the model, including training, evaluation, and any customization options.
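
To make the evaluation requirement concrete, here is a minimal sketch of the two metrics named above, written with NumPy only. It is not the project's evaluation code. It assumes clips are arrays of shape (T, H, W, C) scaled to [0, 1], and the function names (`mse`, `global_ssim`, `clip_ssim`) are hypothetical. Note that `global_ssim` uses whole-frame statistics instead of the sliding window of the standard SSIM formulation, so its values will not match those from library implementations.

```python
import numpy as np


def mse(original: np.ndarray, reconstructed: np.ndarray) -> float:
    """Mean squared reconstruction error over every pixel of the clip."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    return float(np.mean(diff ** 2))


def global_ssim(x: np.ndarray, y: np.ndarray, data_range: float = 1.0) -> float:
    """Simplified SSIM for one frame, using whole-frame statistics.

    All channels are pooled together. The stabilizing constants follow the
    usual choice of K1 = 0.01 and K2 = 0.03.
    """
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)


def clip_ssim(original: np.ndarray, reconstructed: np.ndarray,
              data_range: float = 1.0) -> float:
    """Average the per-frame simplified SSIM over the time axis."""
    scores = [global_ssim(o, r, data_range)
              for o, r in zip(original, reconstructed)]
    return float(np.mean(scores))


if __name__ == "__main__":
    # Synthetic stand-in for a real clip and its reconstruction.
    rng = np.random.default_rng(0)
    clip = rng.random((16, 64, 64, 3))
    noisy = np.clip(clip + rng.normal(0.0, 0.1, clip.shape), 0.0, 1.0)
    print("MSE: ", mse(clip, noisy))
    print("SSIM:", clip_ssim(clip, noisy))
```

For reported numbers, a windowed SSIM from an established library is likely the better choice; the sketch is mainly useful as a dependency-free sanity check.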

Acceptance Criteria:

- Integrate VideoMAE2 into our codebase.
- Test the video masked autoencoder on the prepared dataset and achieve satisfactory performance on the chosen metrics.
- Comply with the license of the released VideoMAE2 code and provide proper attribution.
- Provide clear documentation on how to use the model, including training, evaluation, and any customization options.

Notes:

Be sure to preprocess the dataset appropriately, including resizing, normalization, and data augmentation if necessary.
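
The three preprocessing steps in this note could look roughly like the sketch below. It is an illustration under stated assumptions, not a prescribed pipeline: the function name `preprocess_clip`, the 224x224 target size, the ImageNet channel statistics, and the choice of a horizontal flip as the augmentation are all assumptions that do not come from this issue. Input clips are assumed to be `uint8` arrays of shape (T, H, W, 3).

```python
import numpy as np

# Assumed defaults, not taken from the issue: ImageNet channel statistics.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess_clip(clip: np.ndarray, size: int = 224,
                    flip_prob: float = 0.5, rng=None) -> np.ndarray:
    """Resize, normalize, and optionally flip a uint8 clip of shape (T, H, W, 3)."""
    if rng is None:
        rng = np.random.default_rng()
    _, height, width, _ = clip.shape

    # Nearest-neighbour resize: pick one source row/column per target pixel.
    # This squashes the frame to a square and ignores the aspect ratio.
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    resized = clip[:, rows][:, :, cols]

    # Scale to [0, 1], then standardize each channel.
    normalized = (resized.astype(np.float32) / 255.0 - IMAGENET_MEAN) / IMAGENET_STD

    # Augmentation: flip the whole clip at once so all frames stay consistent.
    if rng.random() < flip_prob:
        normalized = normalized[:, :, ::-1]

    return np.ascontiguousarray(normalized)


if __name__ == "__main__":
    # Synthetic stand-in for 16 frames of 640x360 gameplay footage.
    frames = np.random.default_rng(0).integers(
        0, 256, size=(16, 360, 640, 3), dtype=np.uint8)
    out = preprocess_clip(frames)
    print(out.shape, out.dtype)
```

Whether a horizontal flip is appropriate for gameplay footage is a judgment call, since it also mirrors on-screen interface elements; setting `flip_prob=0.0` disables it. Channel statistics computed from the gameplay dataset itself may also be a better fit than the ImageNet values assumed here.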