This pull request outlines how the Zero-Cost proxy for NAS search was integrated into the CHOP workflow. The results found in the accompanying report can be replicated using the zero_cost_report.ipynb notebook found at mase/machop/chop/actions/search/search_space/zero_cost_proxy/zero_cost_report.ipynb.
Functionality
Basic Elements
Implement a new search space that allows integration of zero-cost proxies with the existing Bayesian-based search algorithms.
Evaluate the search on a relatively small search space with a small dataset (CIFAR-10) on vision models.
Support utilising an ensemble of zero-cost proxies (using a group of them concurrently), and evaluate its performance against single proxies.
Extensions
Explore using a larger dataset (CIFAR-100) to search for architectures, and evaluate its performance against single proxies and other ensembles.
Explore using another image dataset (ImageNet16-120).
Create an ensemble of proxies using a simple linear neural network, and evaluate its performance against single proxies and other ensembles.
Create an ensemble of proxies using a non-linear neural network.
Create an ensemble of proxies using XGBoost.
Implementation Details
TOML Configuration File
The zero-cost search experiments are specified in the TOML configuration file. This method was chosen over a command-line feature because of the numerous configuration parameters requiring adjustment; consolidating them into one configuration file improves the reproducibility of results.
The search space is configured in search.search_space and the search strategy in search.strategy.
search_space
This part of the TOML sets the search space options.
This is where the user decides which benchmark to use as well as which dataset to use.
The user is also able to choose whether to use a linear or non-linear neural network for the ensemble model, as well as choosing the parameters such as loss function, optimizer, batch size, learning rate, and epochs for this model. Additionally, the number of training and test architectures are chosen. Finally, the specific zero cost proxies the user wants to test are added as a list.
strategy
This part of the TOML sets the Optuna strategy options. The user can choose the number of jobs and trials to run the study for, as well as which sampler to use. Additionally, the user can choose the lower and upper limits of the allowable weights given to each ZC proxy with the weight_lower_limit and weight_upper_limit variables, respectively.
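Taken together, a configuration of the kind described above might look like the following sketch. The exact key names are assumptions for illustration (only weight_lower_limit and weight_upper_limit are named in this PR), and the proxy list shows example zero-cost proxies:

```toml
[search.search_space]
# benchmark and dataset to sample architectures from (key names are assumed)
benchmark = "nasbench201"
dataset = "cifar10"                # cifar10 / cifar100 / ImageNet16-120
num_train_archs = 500
num_test_archs = 100
# ensemble model configuration
ensemble_model = "linear"          # "linear" or "nonlinear"
loss = "mse"
optimizer = "adam"
batch_size = 32
learning_rate = 1e-3
epochs = 50
# example zero-cost proxies to evaluate
zc_proxies = ["synflow", "snip", "grasp", "fisher"]

[search.strategy]
n_jobs = 1
n_trials = 100
sampler = "tpe"
weight_lower_limit = 0.0
weight_upper_limit = 1.0
```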
Graph.py
A folder called zero_cost_proxy is created under the search_space directory. In this folder, a graph.py file is created which contains a ZeroCostProxy class that inherits from the SearchSpaceBase base class (defined in base.py).
From the TOML configuration file, the user chooses which benchmark to use (NAS-Bench-201) and which dataset to use (CIFAR-10/CIFAR-100/ImageNet16-120), as well as the size of the training and test set. The graph.py file then takes these settings and, using NASLib, retrieves the required architecture hashes by randomly sampling from the appropriate benchmark and dataset.
It has three main functionalities:
Calculates the individual zero-cost proxy metrics and their Spearman and Kendall Tau scores: for every chosen proxy, the calculate_zc function calculates the zero-cost score for every architecture. It then calculates the Kendall Tau and Spearman correlation scores using evaluate_predictions.
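Concretely, each proxy's quality is the rank correlation between its scores and the ground-truth test accuracies. A minimal sketch of that evaluation, using scipy to stand in for the evaluate_predictions helper:

```python
from scipy import stats

def evaluate_predictions(zc_scores, test_accuracies):
    """Rank-correlate one proxy's scores against ground-truth test accuracies."""
    kendall, _ = stats.kendalltau(zc_scores, test_accuracies)
    spearman, _ = stats.spearmanr(zc_scores, test_accuracies)
    return {"kendalltau": kendall, "spearman": spearman}

# A proxy that ranks the architectures perfectly scores 1.0 on both metrics.
metrics = evaluate_predictions([0.1, 0.4, 0.2, 0.9], [70.2, 88.1, 75.5, 93.0])
```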
Creates an ensemble using a neural network (either linear or non-linear, as configured by the user): this involves training and testing the network. train_zc_ensemble_model trains the ensemble model using the parameters chosen in the TOML file (linear/non-linear mode, loss function, optimizer, learning rate, number of epochs). Additionally, to reduce overfitting, the best model seen during training is saved. After training, the test_zc_ensemble_model function tests the model and calculates the Kendall Tau and Spearman correlation scores.
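In spirit, the linear ensemble learns one weight per proxy so that a weighted sum of proxy scores predicts test accuracy. A minimal numpy sketch of that idea follows; the actual implementation trains a small neural network configured from the TOML file, and the toy data here is hypothetical:

```python
import numpy as np

def train_linear_ensemble(proxy_scores, accuracies, lr=0.05, epochs=20000):
    """Fit w, b by gradient descent so proxy_scores @ w + b approximates accuracy (MSE loss)."""
    X = np.asarray(proxy_scores, dtype=float)   # shape (n_archs, n_proxies)
    y = np.asarray(accuracies, dtype=float)     # shape (n_archs,)
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(y)
    for _ in range(epochs):
        err = X @ w + b - y
        w -= lr * 2 * (X.T @ err) / n           # gradient of mean squared error w.r.t. w
        b -= lr * 2 * err.mean()                # gradient w.r.t. the bias
    return w, b

# Toy data: accuracy is exactly 10*proxy1 + 5*proxy2, so the fit should recover it.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [0.5, 1.5]]
y = [20.0, 25.0, 45.0, 12.5]
w, b = train_linear_ensemble(X, y)
```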
Creates an ensemble using XGBoost: this involves training an XGBoost regressor with default hyperparameters, then using the trained model to predict the test accuracies of the sampled architectures. Again, the Kendall Tau and Spearman correlation scores are evaluated.
zero_cost.py
In the strategies folder, a zero_cost.py file is created. This file consists of a SearchStrategyZeroCost class which inherits from the SearchStrategyBase class.
Functionality: This file is responsible for creating the Optuna strategy, and the main functionality occurs in the search function. In this function, the Optuna study is created using a sampler chosen by the user in the TOML file. The objective of the study, found in the objective function, is to find weightings for the zero-cost proxies, where the target to be optimized is the test accuracy of the sampled architectures. The lower and upper weighting thresholds, number of trials, and timeout are chosen by the user in the TOML file. To ensure the predictions are within the 0-100 range, a penalty is applied to predictions outside this range. After completing the study, the weightings for each proxy are combined into one set of weightings, which forms the ensemble. The study is then saved in a pickle file. Additionally, details of the study are saved in a file called log.json.
In the _save_best_zero_cost function, the Kendall Tau and Spearman correlation scores are evaluated using the ensemble weightings calculated by the Optuna study. Moreover, the results of the individual proxies, as well as all the ensemble methods (Optuna, ensemble model, and XGBoost), are combined and saved in a file called metrics.json. This file contains the metrics for each tested sampled architecture. Additionally, the top five best-performing metrics, for both Kendall Tau and Spearman, are saved in a table and output to the user. This informs the user which proxies or ensemble of proxies performed best.
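The final ranking step is then just a sort over the combined results. A sketch with hypothetical entries (the names and correlation values below are invented to mirror the metrics.json structure described above):

```python
# Hypothetical combined results: individual proxies plus the three ensembles.
metrics = {
    "synflow": {"kendalltau": 0.54, "spearman": 0.71},
    "snip": {"kendalltau": 0.48, "spearman": 0.64},
    "grasp": {"kendalltau": 0.31, "spearman": 0.45},
    "fisher": {"kendalltau": 0.36, "spearman": 0.50},
    "optuna_ensemble": {"kendalltau": 0.61, "spearman": 0.79},
    "nn_ensemble": {"kendalltau": 0.58, "spearman": 0.76},
    "xgboost_ensemble": {"kendalltau": 0.63, "spearman": 0.81},
}

def top_five(metrics, key):
    """Return the five best-performing proxies/ensembles by the given correlation."""
    ranked = sorted(metrics.items(), key=lambda kv: kv[1][key], reverse=True)
    return [name for name, _ in ranked[:5]]

best_by_kendall = top_five(metrics, "kendalltau")
```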
How to Run
To execute the search from the command line, follow these steps:
This PR makes use of a submodule; therefore it must be initialised with:
git submodule init NASLib
It must then be updated with:
git submodule update NASLib
Navigate to NASLib and install all dependencies:
pip install -e .
Navigate to the machop directory.
Run the search command, specifying the path to the TOML configuration file after the --config flag.
If prompted, place the data into NASLib/naslib/data.
Output
After completing a search, the following is output on the command line:
This shows the top five best-performing ZC proxies/ensembles based on both the Spearman and Kendall Tau correlations, as well as important settings from the TOML file, including the number of training and testing architectures, the dataset used, the benchmark used, and the number of ZC proxies evaluated.
Checklist
[x] Code implementation for integrating zero-cost proxies into the CHOP workflow.
[x] Detailed documentation outlining functionality and implementation details.
[x] Appropriate documentation added to Sphinx.
[x] Testing conducted to ensure functionality and performance.
[x] Necessary files and changes included for seamless integration.
Group 1 - Porting Zero-Cost NAS Proxies to MASE