kbelenky / open_sorts

Sorting machine for TCGs such as Magic: The Gathering
GNU General Public License v3.0

Documentation for generating Tensorflow model #1

Open MichaelEvans opened 2 years ago

MichaelEvans commented 2 years ago

How is the TensorFlow model for card recognition generated?

kbelenky commented 2 years ago

I'll give you an overview of how it's generated (and I'll clean up the overview and put it in the README for posterity).

However, I don't think I should share the code for generating it. The code requires scraping card images from Scryfall. Their developer page makes it clear they're friendly to being scraped in a responsible manner. However, I'd like to clear it with them before I distribute a tool that does the scraping.

The TensorFlow model is an embedding model trained with a triplet loss function: https://en.wikipedia.org/wiki/Triplet_loss
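
For reference, the basic triplet loss on a single (anchor, positive, negative) triple can be sketched in NumPy like this. The margin value and the use of squared Euclidean distance are illustrative assumptions, not settings taken from this project.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss for one (anchor, positive, negative) triple.

    margin=1.0 is an illustrative default, not a value from this project.
    """
    d_pos = np.sum((anchor - positive) ** 2)  # squared distance to the same card
    d_neg = np.sum((anchor - negative) ** 2)  # squared distance to a different card
    # Zero once the negative is at least `margin` farther away than the positive.
    return max(d_pos - d_neg + margin, 0.0)
```

The loss only goes to zero when images of the same card sit closer together in embedding space than images of different cards, by at least the margin.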

It uses a headless MobileNetV2 (no classification layer), hooked up to a simple 128-dimensional dense layer with no activation, followed by an L2-normalization layer. It's very similar to what they did here: https://www.tensorflow.org/addons/tutorials/losses_triplet
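
A minimal Keras sketch of that architecture might look like the following. The input size, the global average pooling, and the ImageNet starting weights are assumptions made for the example, and `build_embedding_model` is a hypothetical name; input preprocessing and the training setup are left out.

```python
import tensorflow as tf

def build_embedding_model(input_shape=(224, 224, 3), embedding_dim=128,
                          weights="imagenet"):
    """Headless MobileNetV2 -> linear Dense layer -> L2 normalization.

    input_shape, pooling, and weights are assumptions for this sketch.
    """
    backbone = tf.keras.applications.MobileNetV2(
        input_shape=input_shape,
        include_top=False,  # "headless": no classification layer
        weights=weights,
        pooling="avg",      # collapse the feature map to one vector per image
    )
    inputs = tf.keras.Input(shape=input_shape)
    features = backbone(inputs)
    # No activation, so the embedding is a plain linear projection.
    embedding = tf.keras.layers.Dense(embedding_dim, activation=None)(features)
    # Scale every embedding to unit length.
    outputs = tf.keras.layers.Lambda(
        lambda x: tf.math.l2_normalize(x, axis=1)
    )(embedding)
    return tf.keras.Model(inputs, outputs)
```

Because the last layer normalizes to unit length, distances between embeddings depend only on direction, which is the usual setup for triplet-loss training.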

For training, it's a matter of taking the very pristine, high-quality scans, using a witches' brew of random augmentations to make the cards look like low-quality webcam images taken in adverse lighting conditions, and feeding the results into TensorFlow's training loop.
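
To give a feel for what such augmentations can look like, here is a small NumPy sketch. Every transform and range in it is a guess for illustration; the actual set of augmentations used for this model is not listed here, and `degrade` is a hypothetical name.

```python
import numpy as np

def degrade(image, rng):
    """Make a clean scan look a bit more like a poor webcam frame.

    image: float array in [0, 1] with shape (H, W, 3).
    All transforms and ranges are illustrative assumptions.
    """
    out = image.astype(np.float32)
    # Random contrast and brightness, mimicking bad exposure.
    out = out * rng.uniform(0.6, 1.4) + rng.uniform(-0.2, 0.2)
    # Per-channel gain, mimicking a color cast from poor white balance.
    out = out * rng.uniform(0.8, 1.2, size=(1, 1, 3))
    # Lose detail: keep every second pixel, then repeat pixels back up.
    h, w = out.shape[:2]
    out = out[::2, ::2].repeat(2, axis=0).repeat(2, axis=1)[:h, :w]
    # Additive sensor noise.
    out = out + rng.normal(0.0, 0.05, size=out.shape)
    return np.clip(out, 0.0, 1.0)
```

Calling `degrade(scan, np.random.default_rng())` on the same scan repeatedly yields a different degraded copy each time, which is what lets one pristine image stand in for many real-world captures.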

The technical challenges were:

  1. Finding the right set of augmentations that would work.
  2. Extreme batch sizes. The TripletSemiHardLoss and TripletHardLoss functions work best in this scenario with a batch size of at least 2000, with 4000 being even better. I don't know of any consumer-grade hardware that can handle batch sizes that large without extreme software tweaks. I opted instead to use Google Colab's TPU support to deliver what I needed.
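
One way to see why batch size matters: these losses pick their positives and negatives from within the current batch, so a card can only be contrasted with lookalikes that happen to be in that batch. The NumPy sketch below shows batch-hard mining (hardest positive and hardest negative per anchor), assuming integer card labels. It illustrates the idea only and is not the TensorFlow Addons implementation.

```python
import numpy as np

def batch_hard_triplet_loss(embeddings, labels, margin=1.0):
    """Mean batch-hard triplet loss over anchors that have both a positive
    and a negative in the batch. margin=1.0 is an illustrative default."""
    # Pairwise Euclidean distances via the Gram matrix, shape (batch, batch).
    sq_norms = np.sum(embeddings ** 2, axis=1)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * embeddings @ embeddings.T
    dist = np.sqrt(np.maximum(sq_dist, 0.0))

    same = labels[:, None] == labels[None, :]
    pos_mask = same & ~np.eye(len(labels), dtype=bool)
    neg_mask = ~same

    # Hardest positive: the farthest sample of the same card.
    hardest_pos = np.where(pos_mask, dist, -np.inf).max(axis=1)
    # Hardest negative: the closest sample of a different card.
    hardest_neg = np.where(neg_mask, dist, np.inf).min(axis=1)

    valid = pos_mask.any(axis=1) & neg_mask.any(axis=1)
    losses = np.maximum(hardest_pos - hardest_neg + margin, 0.0)
    return losses[valid].mean()
```

With many thousands of distinct cards, a small batch rarely contains two near-identical cards at once, so the "hardest" negative it can offer is usually an easy one.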

The good news is that the model doesn't really need to be retrained ever again. Whenever a new set comes out, I only need to update embedding_dictionary.pickle, which takes only a couple of minutes on cheap hardware (but does require the scraped Scryfall images).
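
As a rough illustration of why no retraining is needed, the sketch below treats the dictionary as a mapping from card identifier to unit-length embedding and recognizes a card by nearest-neighbor lookup. The dictionary layout, the lookup method, and the names `nearest_card`, `add_cards`, and `embed` are all assumptions for the example; the real contents of embedding_dictionary.pickle may differ.

```python
import numpy as np

def nearest_card(query, embedding_dictionary):
    """Return the key whose stored embedding is closest to the query.

    Assumes {card_id: unit-length vector}; the real pickle layout may differ.
    """
    ids = list(embedding_dictionary)
    matrix = np.stack([embedding_dictionary[i] for i in ids])
    # For unit-length vectors, the largest dot product is the smallest distance.
    scores = matrix @ query
    return ids[int(np.argmax(scores))]

def add_cards(embedding_dictionary, embed, new_images):
    """Embed newly released cards with the existing model; no retraining.

    embed is a hypothetical callable mapping an image to a unit-length vector.
    """
    for card_id, image in new_images.items():
        embedding_dictionary[card_id] = embed(image)
    return embedding_dictionary
```

Under these assumptions, supporting a new set means running each new card image through the already-trained model once and storing the result.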