Open carpedm20 opened 6 years ago
Sure! I'll take some rest for now, so any help would be appreciated. Yes, I guess they used padding to keep the dimensions consistent, like:
pool = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)
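For reference, here is a minimal sketch of why kernel_size=3, stride=1, padding=1 leaves the spatial size unchanged. It uses the standard output-size formula for a pooling or convolution layer with dilation 1. The helper name pooled_size is made up for illustration and is not part of PyTorch or this repo.

```python
def pooled_size(size, kernel_size, stride=1, padding=0):
    """Output length along one spatial axis for a conv/pool layer (dilation 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1


# 3x3 window, stride 1, padding 1: the output size equals the input size
for size in (32, 16, 8):
    assert pooled_size(size, kernel_size=3, stride=1, padding=1) == size

# Without padding, the same window shrinks each spatial axis by 2
print(pooled_size(32, kernel_size=3, stride=1, padding=0))  # 30
```

The same arithmetic applies to a stride-1 convolution: padding of (kernel_size - 1) // 2 with an odd kernel keeps the input and output sizes equal.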
Great -- I'll fork and do some work over the weekend. By the way -- how long does the RNN experiment run for, and what's the final PPL you're getting? Is it similar to what they report in the paper?
40 epochs (train PPL=56) took 6 hours with gpu980 and will take 22 hours for 150 epochs. I haven't reached the end yet, and I think the scale of the reward and loss might need some changes.
I've implemented some of the micro-CNN search space, though in a different project that's not totally compatible with this one. I'm going to clean it up over the next couple of days and I'll post a link here when it'd be reasonable for other people to take a look at it.
I'm currently having trouble reproducing the results from the paper -- the ENAS CNN training seems very unstable. I need to do some further experiments to understand how weight sharing affects the convergence of the individual architectures.
@bkj Did you manage to reproduce the results? I also implemented it from scratch but am getting around 82% accuracy.
No -- I have not been able to reproduce the results. I moved from using an RL controller to something simpler (random search, basically) and have trained models w/ ENAS-style parameter sharing to 92% test accuracy, while my baseline preactivation ResNet18 gets > 93% when trained w/ the same settings.
~ Ben
Thanks! I am doing the same but getting ~82%. Have you open-sourced your code (or can you please share your code)?
Yes it's here -- https://github.com/bkj/ripenet
No documentation yet; open an issue if you have questions.
~ Ben
@carpedm20 @bkj @karandwivedi42 @dukebw Hi, can you run this code successfully? When I run it with: python main.py --network_type cnn --dataset cifar10 --controller_optim momentum --controller_lr_cosine=True --controller_lr_max 0.05 --controller_lr_min 0.0001 --entropy_coeff 0.1, I get some errors. What I want to do is find CNN architectures and visualize them. Could you please tell me what changes I should make to the code before I run it? Thanks for your reply.
Nope. I couldn't run this. However, the authors have released their official code (link to it is in the README on this repo).
@karandwivedi42 Thank you, I have successfully run the code linked in the README on this repo. But now what I want to do is visualize the CNN architectures found by the search.
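One possible way to visualize a searched architecture, as a rough sketch only: if the sampled cell can be written down as a list of (source, destination, operation) edges, it can be turned into Graphviz DOT text with plain Python. The edge-list format, the function name dag_to_dot, and the node and operation names below are all assumptions made for illustration; they are not the format used by this repo or by the official code.

```python
def dag_to_dot(edges, name="cell"):
    """Render (src, dst, op) triples as Graphviz DOT source text."""
    lines = [f"digraph {name} {{"]
    for src, dst, op in edges:
        # One DOT edge per connection, labelled with the chosen operation
        lines.append(f'  "{src}" -> "{dst}" [label="{op}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical sampled cell: node names and op labels are made up
example_edges = [
    ("input", "node1", "conv3x3"),
    ("input", "node2", "maxpool3x3"),
    ("node1", "node3", "conv5x5"),
    ("node2", "node3", "identity"),
]
dot_source = dag_to_dot(example_edges)
print(dot_source)
```

Saving the printed text to a file such as cell.dot and running dot -Tpng cell.dot -o cell.png (with Graphviz installed) should produce an image of the graph.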
I've been working on an implementation of the CNN portion of this paper, and I may be able to help w/ the CNN model and cell searches if you're interested.
One issue I don't think they address in the paper is how they're handling spatial dimensions -- do you have any thoughts on this? I'm guessing they pad s.t. the input and output of each layer have the same spatial size?