facebookresearch / ppuda

Code for Parameter Prediction for Unseen Deep Architectures (NeurIPS 2021)
MIT License

Training order of images and architectures #11

Closed: minhquoc0712 closed this issue 1 year ago

minhquoc0712 commented 1 year ago

Hi,

In the GHN training code, the outer loop is over image batches and the inner loop is over architectures (the meta-batch).

Can you explain the motivation behind this loop order? Have you experimented with the other way around: looping over architectures on the outside and over image batches inside? (A sketch of the two orderings I mean is below.)
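For concreteness, here is a small self-contained sketch of the two orderings. The names `train_queue`, `graphs_queue` and `ghn_step` are toy stand-ins I made up for illustration, not the actual ppuda code:

```python
import itertools

# Toy stand-ins for the image loader and the architecture sampler
train_queue = [("images_batch_%d" % i, "targets_%d" % i) for i in range(3)]
graphs_queue = itertools.cycle(["arch_batch_A", "arch_batch_B"])

def ghn_step(graphs, images, targets):
    # Placeholder for one GHN update: predict parameters for `graphs`,
    # run them on `images`, compute the loss against `targets`.
    print("step:", graphs, images, targets)

# (a) the current order: outer loop over image batches,
#     an architecture meta-batch sampled inside each iteration
for images, targets in train_queue:
    graphs = next(graphs_queue)
    ghn_step(graphs, images, targets)

# (b) the order I am asking about: outer loop over architectures,
#     inner loop over image batches
for graphs in ["arch_batch_A", "arch_batch_B"]:
    for images, targets in train_queue:
        ghn_step(graphs, images, targets)
```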

Thank you.

bknyaz commented 1 year ago

Hi,

In our implementation, at each training iteration we sample a batch of images and a batch of architectures at the same time, so in effect the "inner" and "outer" loops advance together.

The line `for step, (images, targets) in enumerate(train_queue)` can be replaced with `images, targets = next(iter(train_queue))`. Sampling architectures is done with `graphs = next(graphs_queue)`. The order of these two steps should not matter.

See: https://github.com/facebookresearch/ppuda/blob/main/experiments/train_ghn.py#L116C1-L121C48
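To illustrate the point, here is a minimal runnable sketch of what one training iteration looks like. The queues below are simplified toy stand-ins, not the exact code from `train_ghn.py`:

```python
import itertools

# Toy stand-ins for the image loader and the architecture sampler
train_queue = iter([("imgs_%d" % i, "tgts_%d" % i) for i in range(5)])
graphs_queue = itertools.cycle(["meta_batch_0", "meta_batch_1"])

for step in range(5):
    images, targets = next(train_queue)   # one batch of images
    graphs = next(graphs_queue)           # one meta-batch of architectures
    # Swapping the two lines above changes nothing: each iteration simply
    # draws one sample from each queue independently.
    print(step, images, targets, graphs)
```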

minhquoc0712 commented 1 year ago

Thank you. I understand now.