microsoft / archai

Accelerate your Neural Architecture Search (NAS) through fast, reproducible and modular research.
https://microsoft.github.io/archai
MIT License
467 stars 92 forks source link

[BUG] typos and feedback on getting_started/notebooks/discrete_search/search_space.ipynb #162

Closed lovettchris closed 1 year ago

lovettchris commented 1 year ago

Describe the bug

  1. First a question about api naming, why is "archai" used again in namespace archai_model, it seems redundant when we have the long import string from archai.discrete_search.api.archai_model already establishes we are in the archai package. Is the Archai API going to provide multiple model api's besides "ArchaiModel" ? Then just call it "Model". Perhaps you also should have an init.py that flattens this namespace a bit, for example, you could lift ArchaiModel out of the archai_model.py file so you can write from archai.discrete_search.api import ArchaiModel .

  2. We can now wrap a DummyModel instance into an ArchaiModel: => Why is this a DummyModel ? I don't see that term used anywhere in the code that follows? Perhaps you could write "wrap a dummy model instance" if you are trying to explain that the resulting model somehow a "dummy"... why is it a dummy? Should ArchaiModel be renamed DummyModel? Perhaps your class MyModel used to be class DummyModel because I saw the output of cell [11] just change when I executed it.

  3. What is the significance of the structure in the archid 'L=2, K=3, H=16' ? Is this always required, is this only user defined? How unique do these id's have to be, some explanation would be nice. You do say "Architecture ids are used to identify and deduplicate architectures during search" but some more would be nice, like "Architecture ids are used to identify a unique model architecture and the contents is up to you, the idea is they need to be able to identify unique architectures generated during the search process".

  4. Why isn't "save_weights" and "load_weights" on ArchaiModel instead of DiscreteSearchSpace? I mean I understand why, ArchaiModel knows nothing about my torch model actually. But it just seems clunky and non-object oriented, I think if I were to use this API I'd subclass ArchaiModel and move a lot of stuff there, including the saving and loading of weights and the saving and loading of the archid components, layers, kernel size and hidden dimensions, allowing the CNNSearchSpace to focus on what it does which is really just random_sample... I think that would look more clean anyway...

  5. !cat 'arch.json is not cross-platform, perhaps this is print(open('arch.json').read())

  6. crossover example, only blends 2 models, ignoring the rest of the list, is this a valid implementation? Wouldn't it be better to consider all the models in the list? The code is not really that much more complicated:

      @overrides
      def crossover(self, model_list: List[ArchaiModel]) -> ArchaiModel:
    
          new_config = {
              'nb_layers': self.rng.choice([m.arch.nb_layers for m in model_list]),
              'kernel_size': self.rng.choice([m.arch.kernel_size for m in model_list]),
              'hidden_dim': self.rng.choice([m.arch.hidden_dim for m in model_list]),
          }
    
          crossover_model = MyModel(**new_config)
    
          return ArchaiModel(
              arch=crossover_model, archid=self.get_archid(crossover_model)
          )

    and test it with:

    models = [ss.random_sample() for _ in range(10)]
    [print(m.archid) for m in models]
    ss.crossover(models).archid
  7. "used for multiple tasks" => I know archai uses the term "task" to mean a type of model (what the model does is the task) but this language will be unclear to new readers. How about "used for a set of commonly used model types" or something...

  8. from archai.discrete_search.search_spaces.cv import SegmentationDagSearchSpace much nicer, I see this has an init.py that is lifting the SegmentationDagSearchSpace out of segmentation_dag.search_space.

piero2c commented 1 year ago

Hi @lovettchris, thanks again for the thoughtful feedback!

  1. I totally agree. Right now, ArchaiModel is inside archai/discrete_search/api to avoid compatibility issues with archai/supergraph, but ideally it should be just under archai/api/ and used in both discrete search and supergraph libs. Regarding the second point, I've added most important API classes, evaluators and algorithms to __init__.py, now the imports should be slightly shorter
  2. MyModel was previously named DummyModel. Thanks for spotting that

3, 5, 6, 7. Fixed, thanks for the suggestions

Regarding point 4, I agree with you. I can give you some more context on some other API changes that may also affect this.