Closed lovettchris closed 1 year ago
Hi @lovettchris, thanks again for the thoughtful feedback!
ArchaiModel
is inside archai/discrete_search/api
to avoid compatibility issues with archai/supergraph
, but ideally it should be just under archai/api/
and used in both discrete search and supergraph libs. Regarding the second point, I've added most important API classes, evaluators and algorithms to __init__.py
, now the imports should be slightly shorterMyModel
was previously named DummyModel
. Thanks for spotting that3, 5, 6, 7. Fixed, thanks for the suggestions
Regarding point 4, I agree with you. I can give you some more context on some other API changes that may also affect this.
Describe the bug
First a question about api naming, why is "archai" used again in namespace
archai_model
, it seems redundant when we have the long import stringfrom archai.discrete_search.api.archai_model
already establishes we are in the archai package. Is the Archai API going to provide multiple model api's besides "ArchaiModel" ? Then just call it "Model". Perhaps you also should have an init.py that flattens this namespace a bit, for example, you could lift ArchaiModel out of the archai_model.py file so you can writefrom archai.discrete_search.api import ArchaiModel
.We can now wrap a
DummyModel
instance into anArchaiModel
: => Why is this aDummyModel
? I don't see that term used anywhere in the code that follows? Perhaps you could write "wrap a dummy model instance" if you are trying to explain that the resulting model somehow a "dummy"... why is it a dummy? Should ArchaiModel be renamed DummyModel? Perhaps yourclass MyModel
used to beclass DummyModel
because I saw the output of cell [11] just change when I executed it.What is the significance of the structure in the archid
'L=2, K=3, H=16'
? Is this always required, is this only user defined? How unique do these id's have to be, some explanation would be nice. You do say "Architecture ids are used to identify and deduplicate architectures during search" but some more would be nice, like "Architecture ids are used to identify a unique model architecture and the contents is up to you, the idea is they need to be able to identify unique architectures generated during the search process".Why isn't "save_weights" and "load_weights" on ArchaiModel instead of DiscreteSearchSpace? I mean I understand why, ArchaiModel knows nothing about my torch model actually. But it just seems clunky and non-object oriented, I think if I were to use this API I'd subclass ArchaiModel and move a lot of stuff there, including the saving and loading of weights and the saving and loading of the archid components, layers, kernel size and hidden dimensions, allowing the CNNSearchSpace to focus on what it does which is really just random_sample... I think that would look more clean anyway...
!cat 'arch.json
is not cross-platform, perhaps this isprint(open('arch.json').read())
crossover example, only blends 2 models, ignoring the rest of the list, is this a valid implementation? Wouldn't it be better to consider all the models in the list? The code is not really that much more complicated:
and test it with:
"used for multiple tasks" => I know archai uses the term "task" to mean a type of model (what the model does is the task) but this language will be unclear to new readers. How about "used for a set of commonly used model types" or something...
from archai.discrete_search.search_spaces.cv import SegmentationDagSearchSpace
much nicer, I see this has an init.py that is lifting the SegmentationDagSearchSpace out ofsegmentation_dag.search_space
.