drivendataorg / zamba

A Python package for identifying 42 kinds of animals, training custom models, and estimating distance from camera trap videos
https://zamba.drivendata.org/docs/stable/
MIT License
118 stars, 27 forks

Automated workflow for publishing models #140

Closed (ejm714 closed this 3 years ago)

ejm714 commented 3 years ago

Adds a script and a make command, make publish_models, that takes a directory containing weights and yaml files, creates the official configs from them, copies the yaml and json files to the appropriate folders, and uploads the model to S3.

A hash of the config is included in the public model name so that each published model can be matched to its config.
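To illustrate the idea of tying a model name to its config, here is a minimal sketch. It is not the script from this PR: the function names, the use of SHA-1 over sorted JSON, the 10-character truncation, and the name_hash.ckpt format are all assumptions made for the example.

```python
import hashlib
import json


def config_hash(config: dict, length: int = 10) -> str:
    """Return a short, deterministic hex digest of a config dictionary.

    Hypothetical helper: the hash function and length are assumptions,
    not taken from the zamba codebase.
    """
    # Sorting keys makes the serialization independent of insertion order,
    # so two equal configs always produce the same digest.
    serialized = json.dumps(config, sort_keys=True, default=str)
    return hashlib.sha1(serialized.encode("utf-8")).hexdigest()[:length]


def public_model_name(model_name: str, config: dict) -> str:
    """Combine a model name with its config hash (assumed naming format)."""
    return f"{model_name}_{config_hash(config)}.ckpt"
```

With a scheme like this, any change to the config produces a different public file name, so a stale config cannot silently be paired with newer weights.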

Closes https://github.com/drivendataorg/zamba-algorithms/issues/517

github-actions[bot] commented 3 years ago

🚀 Deployed on https://deploy-preview-140--silly-keller-664934.netlify.app

codecov[bot] commented 3 years ago

Codecov Report

:exclamation: No coverage uploaded for pull request base (v2@c760b55). The diff coverage is n/a.

@@         Coverage Diff          @@
##             v2    #140   +/-   ##
====================================
  Coverage      ?   85.5%           
====================================
  Files         ?      25           
  Lines         ?    1527           
  Branches      ?       0           
====================================
  Hits          ?    1306           
  Misses        ?     221           
  Partials      ?       0

ejm714 commented 3 years ago

@pjbull ready for another look. changes here: https://github.com/drivendataorg/zamba/compare/5511389..b3c0ee4?expand=1

This also adds a fix: if you're training an official model from scratch, only the hparams.yaml file is looked up, rather than loading the full checkpoint and deriving everything from it. This is nice because it means checkpoint in train_configuration.yaml will be None if from_scratch: True. That is not only clearer, it also lets you reuse the train_configuration.yaml file as expected. Since checkpoint supersedes model, a written-out train_configuration.yaml that recorded a checkpoint after training from scratch would have behaved differently when reused.
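The precedence issue can be sketched as follows. This is an illustration only, not zamba's actual configuration code: the function names, the dictionary keys other than checkpoint and from_scratch, and the example file names are hypothetical.

```python
from typing import Optional


def resolve_weights_source(model: Optional[str], checkpoint: Optional[str]) -> str:
    """Mimic the precedence described in the comment: checkpoint supersedes model."""
    if checkpoint is not None:
        return f"checkpoint:{checkpoint}"
    if model is not None:
        return f"model:{model}"
    raise ValueError("either model or checkpoint must be provided")


def written_train_config(model: str, from_scratch: bool, official_checkpoint: str) -> dict:
    """Build the dict that would be written out as train_configuration.yaml (assumed shape)."""
    return {
        "model": model,
        "from_scratch": from_scratch,
        # When training from scratch only the hyperparameters are needed,
        # so no checkpoint is recorded in the written-out config.
        "checkpoint": None if from_scratch else official_checkpoint,
    }
```

Reusing a config written with from_scratch set to True then resolves to the model entry again, instead of unexpectedly loading pretrained weights through a recorded checkpoint.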