carpedm20 / ENAS-pytorch

PyTorch implementation of "Efficient Neural Architecture Search via Parameters Sharing"
Apache License 2.0
2.69k stars, 492 forks

Errors When running #18

Closed: axiniu closed this issue 6 years ago

axiniu commented 6 years ago

@dukebw, hi, thanks for your work. When I run this code I run into some problems.

  1. When I run it with the default run.sh, I get:

         THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1503965122592/work/torch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
         Traceback (most recent call last):
           File "main.py", line 48, in <module>
             main(args)
           File "main.py", line 30, in main
             trnr = trainer.Trainer(args, dataset)
           File "/home/axi/ENAS-pytorch-master-3/trainer.py", line 160, in __init__
             self.build_model()
           File "/home/axi/ENAS-pytorch-master-3/trainer.py", line 192, in build_model
             self.shared.cuda()
           File "/home/axi/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 147, in cuda
             return self._apply(lambda t: t.cuda(device_id))
           File "/home/axi/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 118, in _apply
             module._apply(fn)
           File "/home/axi/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 118, in _apply
             module._apply(fn)
           File "/home/axi/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 124, in _apply
             param.data = fn(param.data)
           File "/home/axi/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 147, in <lambda>
             return self._apply(lambda t: t.cuda(device_id))
           File "/home/axi/anaconda3/lib/python3.6/site-packages/torch/_utils.py", line 66, in _cuda
             return new_type(self.size()).copy_(self, async)
         RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1503965122592/work/torch/lib/THC/generic/THCStorage.cu:66

This happens even though I have 3 GPUs and 10 GB of memory.

  2. When I run it with:

         python main.py --network_type cnn --dataset cifar --controller_optim momentum --controller_lr_cosine=True --controller_lr_max 0.05 --controller_lr_min 0.0001 --entropy_coeff 0.1

     I get:

         2018-04-29 19:01:57,957:INFO::[] Make directories : logs/cifar_2018-04-29_19-01-57
         Traceback (most recent call last):
           File "main.py", line 48, in <module>
             main(args)
           File "main.py", line 26, in main
             dataset = data.image.Image(args.data_path)
           File "/home/axi/ENAS-pytorch-master-2/data/image.py", line 8, in __init__
             if args.datset == 'cifar10':
         AttributeError: 'str' object has no attribute 'datset'

     After I made some changes, I got other errors, such as:

         2018-04-29 18:49:24,745:INFO::[] Make directories : logs/cifar10_2018-04-29_18-49-24
         Files already downloaded and verified
         2018-04-29 18:49:27,464:INFO::regularizing:
         Traceback (most recent call last):
           File "main.py", line 48, in <module>
             main(args)
           File "main.py", line 30, in main
             trnr = trainer.Trainer(args, dataset)
           File "/home/axi/ENAS-pytorch-master-1/trainer.py", line 139, in __init__
             self.cuda)
           File "/home/axi/ENAS-pytorch-master-1/utils.py", line 148, in batchify
             data = data.narrow(0, 0, nbatch * bsz)
         AttributeError: 'DataLoader' object has no attribute 'narrow'

     or:

         2018-04-29 18:22:50,192:INFO::[] Make directories : logs/cifar10_2018-04-29_18-22-50
         Files already downloaded and verified
         2018-04-29 18:22:55,041:INFO::regularizing:
         Traceback (most recent call last):
           File "main.py", line 48, in <module>
             main(args)
           File "main.py", line 30, in main
             trnr = trainer.Trainer(args, dataset)
           File "/home/axi/ENAS-pytorch-master-1/trainer.py", line 139, in __init__
             self.cuda)
           File "/home/axi/ENAS-pytorch-master-1/utils.py", line 147, in batchify
             nbatch = data.size // bsz
         AttributeError: 'DataLoader' object has no attribute 'size'

Could you please tell me what changes I should make before running the code? Thanks for your response.
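The `'str' object has no attribute 'datset'` error reported above looks consistent with two things in combination: the constructor receives a plain string (`args.data_path`) where it expects the whole argument namespace, and the attribute name is misspelled. The following is only a minimal sketch of that failure mode. The `Image` class here is a hypothetical stand-in, not the repository's `data/image.py`, and the `name` field is an assumption made for illustration.

```python
from argparse import Namespace


class Image:
    """Hypothetical stand-in for a dataset wrapper; not the repository's code."""

    def __init__(self, args):
        # Read the correctly spelled attribute from the full args namespace.
        if args.dataset == 'cifar10':
            self.name = 'cifar10'
        else:
            raise NotImplementedError(f'Unknown dataset: {args.dataset}')


args = Namespace(dataset='cifar10', data_path='data')

# Passing only the path string raises AttributeError, because a str
# carries none of the parsed command-line attributes.
error_message = None
try:
    Image(args.data_path)
except AttributeError as err:
    error_message = str(err)

# Passing the whole namespace succeeds.
image = Image(args)
```

In this sketch, the call that passes `args.data_path` fails the same way as the reported traceback, while the call that passes `args` succeeds. Whether that is the right fix for the repository itself is not confirmed anywhere in this thread.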

SimLemay commented 5 years ago

I have the same problem. Did you solve it?

kukby commented 4 years ago

@axiniu I have the same problem. Did you solve it?