ASUS-AICS / LibMultiLabel-Old-Archive

This library has been moved to https://github.com/ntumlgroup/LibMultiLabel
MIT License

Grid Search Improvement: Saving Time (& Money, if Applicable) by Reducing Model Size During Grid Search #370

Open donglihe-hub opened 6 months ago

donglihe-hub commented 6 months ago

What does this PR do?

As model size grows, saving a model to disk takes longer. The problem is significant during grid search, where thousands of large models may be saved, so more time (& money, if applicable) is spent on the search.

For example, let's say it takes 20 seconds to save a model. When training for 50 epochs over 80 trials (a trial means a whole training process), with the model saved once per epoch, saving alone takes about 22 hours (20 / 60 / 60 * 50 * 80 ≈ 22.2) on a single GPU, which is a waste of time (& money, if applicable).
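The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the function name `total_save_hours` is made up for illustration, and it assumes exactly one save per epoch in every trial.

```python
def total_save_hours(seconds_per_save: float, epochs: int, trials: int) -> float:
    """Total hours spent saving models, assuming one save per epoch per trial."""
    total_seconds = seconds_per_save * epochs * trials
    return total_seconds / 3600  # 60 * 60 seconds per hour


# Numbers from the example: 20 s per save, 50 epochs, 80 trials.
hours = total_save_hours(20, 50, 80)
print(f"{hours:.1f} hours")  # 22.2 hours
```

If each save gets faster in proportion to the size reduction (an assumption, since disk throughput is not the only cost), the total shrinks by the same factor.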

This PR reduces the size of the saved model to 1/3 by removing the optimizer states from it during grid search.
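A minimal sketch of the idea, not the code in this PR: it assumes the saved model is a dictionary with `state_dict`, `optimizer_states`, and `lr_schedulers` entries (the layout PyTorch Lightning checkpoints use), and that the optimizer keeps two buffers per parameter, as Adam does. The helper `strip_optimizer_states` is hypothetical, and raw bytes stand in for tensors so the example runs without PyTorch.

```python
import pickle


def strip_optimizer_states(checkpoint, keys=("optimizer_states", "lr_schedulers")):
    """Return a shallow copy of the checkpoint without optimizer-related entries."""
    return {k: v for k, v in checkpoint.items() if k not in keys}


# Toy checkpoint: zero-filled byte strings play the role of tensors.
n_bytes = 1_000_000
checkpoint = {
    "state_dict": {"layer.weight": bytes(n_bytes)},
    # Adam-like optimizer: two extra buffers of the same size per parameter.
    "optimizer_states": [{"exp_avg": bytes(n_bytes), "exp_avg_sq": bytes(n_bytes)}],
    "lr_schedulers": [],
}

full_size = len(pickle.dumps(checkpoint))
slim = strip_optimizer_states(checkpoint)
slim_size = len(pickle.dumps(slim))

ratio = slim_size / full_size
print(f"slim / full = {ratio:.2f}")
```

Under these assumptions the weights are one of three equally sized parts, which is where a ratio of about 1/3 comes from. A checkpoint stripped this way can still be used for evaluation, but training cannot be resumed from it with the original optimizer state.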

Test CLI & API (bash tests/autotest.sh)

Test APIs used by main.py.

Check API Document

If any new APIs are added, please check that their descriptions have been added to the API document.

Test quickstart & API (bash tests/docs/test_changed_document.sh)

If any APIs in quickstarts or tutorials are modified, please run this test to check if the current examples can run correctly after the modified APIs are released.