compute resources for bespoke classifier training

nathanielrindlaub commented 1 year ago

I recently stepped through your classifier training workflow in an AWS SageMaker Studio Lab instance and was able to begin fitting an EfficientNet-B3 on my own data, but I quickly exhausted the available memory (15 GB) and later the disk space (25 GB). I think SageMaker Studio Lab is geared towards learning ML and running simple experiments (it's also free), so it's not terribly surprising that I maxed it out right out of the gate. That said, before I start shopping around for a new classifier training environment, do you happen to have benchmarks on how much memory and disk space the classifier training process consumes?

agentmorris commented 1 year ago

Sorry, the best answer is "we don't know". The last time that classifier training pipeline was run, it was likely on an Azure NC6v3 instance, and we probably never tested it on anything smaller, and definitely never tested it without a GPU. Here are some random facts in random order that may be helpful:

- That code is pretty old now, and was fairly tied to some database infrastructure that existed only at Microsoft, so you may find it easier to start from scratch, i.e., run MegaDetector, crop the detections into separate images, and pick your favorite classifier training tutorial from 2023 to train a model. Other stuff in this repo that works with classification results - e.g. our postprocessing/preview scripts - only cares that you store your output in the MegaDetector output format, which supports classification information.
- In fact, you may find it easier to fine-tune MegaDetector to add new classes, by following the YOLOv5 "train custom data" tutorial.
- The only external user that I'm aware of who has run this pipeline (mostly) is the team at TrapTagger, and I believe they also trained on AWS. They have a high-level overview of what they did here. They are active on the AI for Conservation Slack; you may want to ping them for advice.
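
For the "start from scratch" route, the cropping step might look roughly like the sketch below. It is not the code from this repo. It assumes a MegaDetector results file in the standard output format (an `images` list whose `detections` carry a `category`, a `conf`, and a normalized `[x_min, y_min, width, height]` `bbox`), assumes that category `"1"` means "animal" (check `detection_categories` in your own file), and uses Pillow for the actual cropping. The function names, the 0.2 confidence threshold, and the crop file naming are all made up for illustration.

```python
import json
from pathlib import Path


def bbox_to_pixels(bbox, image_width, image_height):
    """Convert a normalized [x_min, y_min, width, height] box into a
    (left, upper, right, lower) pixel box, clamped to the image bounds."""
    x, y, w, h = bbox
    left = max(0, int(round(x * image_width)))
    upper = max(0, int(round(y * image_height)))
    right = min(image_width, int(round((x + w) * image_width)))
    lower = min(image_height, int(round((y + h) * image_height)))
    return left, upper, right, lower


def iter_detections(results, min_conf=0.2, category="1"):
    """Yield (file, detection_index, bbox) for every detection of the given
    category at or above min_conf. Images without detections are skipped."""
    for image in results["images"]:
        # "detections" can be missing or null for images that failed to load.
        for i, det in enumerate(image.get("detections") or []):
            if det["category"] == category and det["conf"] >= min_conf:
                yield image["file"], i, det["bbox"]


def crop_detections(results_path, image_dir, crop_dir, min_conf=0.2):
    """Write one JPEG crop per qualifying detection into crop_dir."""
    from PIL import Image  # Pillow; imported here so the helpers above need no dependencies

    results = json.loads(Path(results_path).read_text())
    crop_dir = Path(crop_dir)
    crop_dir.mkdir(parents=True, exist_ok=True)

    for file, i, bbox in iter_detections(results, min_conf):
        with Image.open(Path(image_dir) / file) as img:
            box = bbox_to_pixels(bbox, img.width, img.height)
            if box[2] <= box[0] or box[3] <= box[1]:
                continue  # degenerate box after rounding and clamping
            # Flatten the relative path so crops from subfolders don't collide.
            stem = Path(file).with_suffix("").as_posix().replace("/", "_")
            img.convert("RGB").crop(box).save(crop_dir / f"{stem}_det{i}.jpg")
```

Something like `crop_detections("md_results.json", "images/", "crops/")` (hypothetical paths) would then give you a folder of crops to sort into per-class subfolders for whichever training tutorial you pick. Writing crops to disk roughly duplicates your dataset, so it is worth budgeting disk space for that.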

Sorry we don't have an easier answer!

nathanielrindlaub commented 1 year ago

Amazing, no this is all super helpful. Thank you @agentmorris!!

microsoft / CameraTraps

compute resources for bespoke classifier training #333