Questions about the paper

minhquoc0712 commented 11 months ago

Thank you for your interesting research. I have some questions regarding the paper:

I'm curious about the adaptability of GHNs to other standard-sized datasets, particularly in different tasks such as image segmentation. The Penn-Fudan dataset discussed in your paper seems relatively small. Could you share your thoughts on this?
Do you remember what accuracy the models in DeepNets-1M achieved during GHN training? How does it compare to the accuracy of models with predicted parameters, with and without fine-tuning?

Thank you.

bknyaz commented 11 months ago

Thanks for your interest.

To clarify, in this work, as in the previous GHN-2 work, we only trained GHNs on image classification. To solve tasks such as object detection (Penn-Fudan), we just predicted the initial params of the object detection backbone and then trained the entire detection network as usual (GHNs are never updated on these non-image-classification tasks). Therefore, it's hard to comment on the adaptability of GHNs to non-image-classification tasks. Implementation-wise, though, it should be straightforward to train GHNs to predict params for other tasks, although some extra task-specific decoders might need to be added (e.g. a decoder for the bbox predictor weights in the case of object detection). I think it's an interesting research direction.
In Table 12 of our paper we report the average top-1 accuracy on all 900 DeepNets-1M evaluation architectures: 27.3±12.3. I think the training accuracy was slightly higher by the end of training, around 29-30%. I will double check and get back to you. Table 12 also reports the results with/without fine-tuning on PyTorch architectures. Table 2 also compares with/without fine-tuning. Let me know in case of more questions.
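
For readers who want to see the shape of the detection workflow described above (predict the backbone's initial params once, then train the whole downstream network while the GHN stays untouched), here is a minimal toy sketch. It is not the ghn3 API: `toy_ghn`, `features`, `mse`, `train`, and the tiny regression dataset are hypothetical stand-ins chosen only to keep the example self-contained, and a two-weight "backbone" plus an affine "head" stands in for a real detection network.

```python
import copy

def toy_ghn(ghn_weights, n_backbone):
    # Hypothetical stand-in for a trained GHN: predicts n_backbone initial weights.
    return [ghn_weights["scale"] * (i + 1) / n_backbone for i in range(n_backbone)]

def features(params, x):
    # Toy "backbone": a single dot product over the input.
    return sum(w * xi for w, xi in zip(params["backbone"], x))

def mse(params, data):
    # Mean squared error of the full toy network (backbone + affine head).
    total = 0.0
    for x, y in data:
        pred = params["head"][0] * features(params, x) + params["head"][1]
        total += (pred - y) ** 2
    return total / len(data)

def train(params, data, lr=0.01, epochs=200):
    # Plain SGD on squared error: updates backbone AND head, never the GHN.
    for _ in range(epochs):
        for x, y in data:
            feat = features(params, x)
            scale, bias = params["head"]
            err = scale * feat + bias - y
            grad_backbone = [2 * err * scale * xi for xi in x]
            grad_head = [2 * err * feat, 2 * err]
            params["backbone"] = [w - lr * g for w, g in zip(params["backbone"], grad_backbone)]
            params["head"] = [h - lr * g for h, g in zip(params["head"], grad_head)]
    return params

# The "GHN" was trained elsewhere (assumed); here it is only queried once.
ghn_weights = {"scale": 1.0}
ghn_snapshot = copy.deepcopy(ghn_weights)

# Made-up downstream data, purely illustrative: y = 3*x0 - x1 + 0.5.
data = [([1.0, 0.0], 3.5), ([0.0, 1.0], -0.5), ([1.0, 1.0], 2.5), ([2.0, 1.0], 5.5)]

# Backbone init comes from the GHN; the task-specific head is initialized as usual.
params = {"backbone": toy_ghn(ghn_weights, 2), "head": [1.0, 0.0]}
initial_backbone = list(params["backbone"])
initial_loss = mse(params, data)
params = train(params, data)
final_loss = mse(params, data)
```

The point of the sketch is the division of labour: the GHN only supplies a starting point for the backbone, and everything after that is ordinary training of the downstream model.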

bknyaz commented 11 months ago

I checked my training logs: the training accuracy of the best GHNs on ImageNet was only 20-21%. Sorry for overestimating it in the previous message.

SamsungSAILMontreal / ghn3
