automl / NASLib

NASLib is a Neural Architecture Search (NAS) library that facilitates NAS research by providing interfaces to several state-of-the-art NAS search spaces and optimizers.

Encodings #143

Closed: gurizab closed this pull request 1 year ago

gurizab commented 2 years ago

Created separate encodings.py files in a search-space-specific manner. The file NASLib/naslib/predictors/utils/encodings.py is now redundant. The rest of the codebase still needs to be updated to integrate these changes. The resulting layout is sketched below.
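
For orientation, the resulting layout looks roughly like this (the three NAS-Bench search spaces are shown as examples; other search spaces follow the same pattern):

```
naslib/
└── search_spaces/
    ├── nasbench101/
    │   └── encodings.py   # NB101-specific encodings
    ├── nasbench201/
    │   └── encodings.py   # NB201-specific encodings
    └── nasbench301/
        └── encodings.py   # NB301-specific encodings
```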

Neonkraft commented 2 years ago

I see there are a lot of different conversions. To make it easier for the user, it might make sense to unify all conversion code under a single API as follows:

def encode(from_: Union[EncodingType, Graph], to: Union[EncodingType, Graph], encoding: Union[Encoding, Graph])

Here, EncodingType can be an Enum listing all the possible encoding types (such as the adjacency one-hot encoding or the path encoding). This way, the only API the user needs to be familiar with is naslib.search_spaces.xyz.encode. Enums also make the code easier to write, since most IDEs offer auto-completion for them. Alternatively, EncodingType could include Graph as an option, indicating the NASLib Graph object. We can be flexible with this design decision.

You can also consume this API in all the graph.py files; a sketch of what this could look like follows.
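
For concreteness, here is a minimal sketch of the proposed API. The enum members are taken from the encoding names mentioned later in this thread; the dispatch logic is an illustrative assumption, and from_ is spelled with a trailing underscore because from is a reserved keyword in Python:

```python
from enum import Enum
from typing import Any, Union


class Graph:
    """Stand-in for naslib.search_spaces.core.graph.Graph."""


class EncodingType(Enum):
    ADJACENCY_ONE_HOT = "adjacency_one_hot"
    ADJACENCY_MIX = "adjacency_mix"
    PATH = "path"
    GCN = "gcn"
    BONAS = "bonas"
    SEMINAS = "seminas"
    COMPACT = "compact"


def encode(
    encoding: Any,
    from_: Union[EncodingType, Graph],
    to: Union[EncodingType, Graph],
) -> Any:
    """Convert `encoding` from one representation to another."""
    # A real implementation would dispatch to the search-space-specific
    # conversion, e.g. via a lookup table keyed on (from_, to).
    raise NotImplementedError
```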

gurizab commented 1 year ago

@Neonkraft I created the EncodingType enum class under naslib/utils/encodings.py; it contains all the encoding types currently supported in NASLib.

Also, every search space now has an encodings.py file with its supported encodings. Encoding a graph g now looks like g.encode(encoding_type=EncodingType.ENCODING), as sketched below.
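
For example, a hypothetical usage (the search space class, the sampling step, and the enum member name are assumptions based on the names in this thread):

```python
from naslib.search_spaces import NasBench201SearchSpace
from naslib.utils.encodings import EncodingType

g = NasBench201SearchSpace()
g.sample_random_architecture()  # assumed step: fix a concrete architecture first
one_hot = g.encode(encoding_type=EncodingType.ADJACENCY_ONE_HOT)
```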

Since every search space now has its own encode function, I don't think a from and to parameter is necessary in encode.

As for encode_spec, I still need to test it, because I am not able to run xgb_runner.py (which instantiates ZCEnsembleEvaluator, the only predictor that uses encode_spec) due to some AttributeErrors. Let me know what you think.

Neonkraft commented 1 year ago

@gurizab, thanks for the PR! I've made a few minor fixes, but other than that, it looks good to me :)

@abhash-er and @jr2021 , if you could look into the problem with xgb_runner.py and fix it, we can merge this PR. Thanks!

abhash-er commented 1 year ago

The problem is with the default config file that the runner takes: it lacks attributes such as train_size and zc_names. Changing the default config for the zero-cost case should fix the problem. The zc_config.yaml file should be changed to:

batch_size: 32
config_type: zc_example
cutout: false
cutout_length: 16
cutout_prob: 1.0
dataset: cifar10
out_dir: run
predictor: fisher
search_space: nasbench101  # or: nasbench201, nasbench301
seed: 0
test_size: 200
train_size: 400 
zc_ensemble: true
zc_only: true
zc_names: 
  - params
  - flops 
  - jacov 
  - plain 
  - grasp
  - snip 
  - fisher
  - grad_norm 
  - epe_nas 
  - synflow 
  - l2_norm
optimizer: npenas
train_portion: 0.7
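
As a quick sanity check that the runner will find the new attributes, the file can be loaded with PyYAML (a minimal sketch; the filename matches the one mentioned above):

```python
import yaml

# Load the updated config and verify the attributes the runner expects.
with open("zc_config.yaml") as f:
    config = yaml.safe_load(f)

for key in ("train_size", "zc_names"):
    assert key in config, f"missing config attribute: {key}"
print(config["train_size"], config["zc_names"])
```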

We also needed to change encoding_type from a string to the EncodingType enum in zc_ensemble_evaluator to get results from xgb_runner.py. We recommend searching the whole codebase for the encoding-type strings adjacency_one_hot, adjacency_mix, path, gcn, bonas, seminas, and compact, and changing them to their respective enum members, as in the sketch below.
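
Illustratively, the change inside zc_ensemble_evaluator looks something like this (paraphrased, not quoted from the actual file):

```python
from naslib.utils.encodings import EncodingType

def encode_architecture(arch):
    # before: arch.encode(encoding_type="adjacency_one_hot")
    # after:
    return arch.encode(encoding_type=EncodingType.ADJACENCY_ONE_HOT)
```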

Neonkraft commented 1 year ago

@abhash-er, thanks a lot for looking into this. @gurizab , if you could please address these comments, we can merge and close this PR.

PS: We should ideally be merging into the Develop branch from now on. So either raise this PR again, targeting Develop, or merge Develop_copy into Develop once this PR is merged. The former is a lot cleaner.

Thanks!