Open ouyangzhuzhu opened 5 years ago
Hi @ouyangzhuzhu,
Thanks for your question. The .h5 files should be generated in the same folder as your model file. With that command, there should be two .h5 files in the folder cifar10/trained_nets/resnet56_sgd_lr=0.1_bs=128_wd=0.0005/:
model_300.t7_weights_xignore=biasbn_xnorm=filter_yignore=biasbn_ynorm=filter.h5 is the direction file, which stores the directions.
model_300.t7_weights_xignore=biasbn_xnorm=filter_yignore=biasbn_ynorm=filter.h5_[-1.0,1.0,51]x[-1.0,1.0,51].h5 is the surface file, which contains the surface values with respect to those directions at that resolution.
We have provided our precomputed files. So if you want to generate your own result file, you can delete them or simply use a different resolution.
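The naming pattern of the two files can be reproduced with a small helper, which is handy for checking whether a run has already written its output. This is only a minimal sketch: the pattern is inferred from the two file names quoted above, not from the repository's code, and `direction_file_name` and `surface_file_name` are hypothetical helpers.

```python
def direction_file_name(model_file, dir_type="weights", xignore="biasbn",
                        xnorm="filter", yignore="biasbn", ynorm="filter"):
    """Guess the direction file name for a 2D plot.

    Assumption: the pattern is inferred from the file names in this thread.
    """
    return (f"{model_file}_{dir_type}"
            f"_xignore={xignore}_xnorm={xnorm}"
            f"_yignore={yignore}_ynorm={ynorm}.h5")


def surface_file_name(dir_file, x=(-1.0, 1.0, 51), y=(-1.0, 1.0, 51)):
    """Guess the surface file name for a direction file and a (min, max, num) grid."""
    xmin, xmax, xnum = x
    ymin, ymax, ynum = y
    return (f"{dir_file}_[{float(xmin)},{float(xmax)},{int(xnum)}]"
            f"x[{float(ymin)},{float(ymax)},{int(ynum)}].h5")


if __name__ == "__main__":
    dir_file = direction_file_name("model_300.t7")
    print(dir_file)
    print(surface_file_name(dir_file))
```

With the settings from the command in this thread (--x=-1:1:51 --y=-1:1:51), the two printed names match the direction and surface files listed above.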
Great, thanks @ljk628! Yes, after 4 hours I got the final .h5 files, just as you said.
But I got an error at the end. Can you help take a look?
Evaluating rank 0 2600/2601 (100.0%) coord=[1. 1.] train_loss= 17.668 train_acc=8.31 time=5.66 sync=0.00
Rank 0 done! Total time: 14505.95 Sync: 2.20
Traceback (most recent call last):
  File "plot_surface.py", line 298, in <module>
    plot_2D.plot_2d_contour(surf_file, 'train_loss', args.vmin, args.vmax, args.vlevel, args.show)
  File "/home/l00221575/Downloads/loss-landscape/plot_2D.py", line 18, in plot_2d_contour
    f = h5py.File(surf_file, 'r')
  File "/home/l00221575/venv_openai-es/lib/python3.5/site-packages/h5py/_hl/files.py", line 394, in __init__
    swmr=swmr)
  File "/home/l00221575/venv_openai-es/lib/python3.5/site-packages/h5py/_hl/files.py", line 170, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 85, in h5py.h5f.open
OSError: Unable to open file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
I also tried the command below to produce and customize a contour plot using the script plot_2D.py:
python plot_2D.py --surf_file path_to_surf_file --surf_name train_loss
That failed too :(
(venv_openai-es) l00221575@F0817-S05:~/Downloads/loss-landscape$ python plot_2D.py --surf_file cifar10/trained_nets/resnet56_sgd_lr\=0.1_bs\=128_wd\=0.0005/ --surf_name train_loss
Traceback (most recent call last):
  File "plot_2D.py", line 205, in <module>
    plot_2d_contour(args.surf_file, args.surf_name, args.vmin, args.vmax, args.vlevel, args.show)
  File "plot_2D.py", line 18, in plot_2d_contour
    f = h5py.File(surf_file, 'r')
  File "/home/l00221575/venv_openai-es/lib/python3.5/site-packages/h5py/_hl/files.py", line 394, in __init__
    swmr=swmr)
  File "/home/l00221575/venv_openai-es/lib/python3.5/site-packages/h5py/_hl/files.py", line 170, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 85, in h5py.h5f.open
OSError: Unable to open file (file read failed: time = Thu Jan 10 02:44:57 2019 , filename = 'cifar10/trained_nets/resnet56_sgd_lr=0.1_bs=128_wd=0.0005/', file descriptor = 4, errno = 21, error message = 'Is a directory', buf = 0x7ffdce19ddb0, total read size = 8, bytes this sub-read = 8, bytes actually read = 18446744073709551615, offset = 0)
This is the same as https://github.com/tomgoldstein/loss-landscape/issues/4, which can be temporarily worked around by downgrading h5py:
pip install h5py==2.7.0
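The second traceback above (errno = 21, 'Is a directory') has a different cause from the locking error: --surf_file was pointed at the folder rather than at the surface .h5 file inside it. A minimal sketch for locating candidate surface files before calling plot_2D.py follows. `find_surface_files` is a hypothetical helper, and the assumption that surface file names end in `].h5` is taken only from the example name earlier in this thread.

```python
from pathlib import Path


def find_surface_files(path):
    """Return candidate surface files for a path that may be a file or a folder.

    Assumption: surface files end with '].h5' (the resolution suffix),
    as in the example file name quoted in this thread.
    """
    p = Path(path)
    if p.is_file():
        # Already a file: pass it through unchanged.
        return [p]
    if not p.is_dir():
        raise FileNotFoundError(f"No such file or directory: {p}")
    # Avoid glob patterns here, since '[' and ']' are glob metacharacters.
    return sorted(f for f in p.iterdir()
                  if f.is_file() and f.name.endswith("].h5"))


if __name__ == "__main__":
    folder = "cifar10/trained_nets/resnet56_sgd_lr=0.1_bs=128_wd=0.0005"
    if Path(folder).exists():
        for surf_file in find_surface_files(folder):
            print(surf_file)
```

Any path it prints can then be passed to --surf_file in place of the folder name.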
great great great 3ks!!!!! it worked!!!!
(venv_openai-es) l00221575@F0817-S05:~/Downloads/loss-landscape$ python plot_surface.py --mpi --cuda --model resnet56 --x=-1:1:51 --y=-1:1:51 --model_file cifar10/trained_nets/resnet56_noshort_sgd_lr\=0.1_bs\=128_wd\=0.0005/model_300.t7
/home/l00221575/venv_openai-es/lib/python3.5/site-packages/h5py/__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Rank 0 use GPU 0 of 8 GPUs on F0817-S05
Traceback (most recent call last):
  File "plot_surface.py", line 243, in
Hi friends! I have installed all the tools that README.md mentions, downloaded the ResNet-56 (10 MB) model, and ran the command below:
mpirun -n 4 python plot_surface.py --mpi --cuda --model resnet56 --x=-1:1:51 --y=-1:1:51 \
--model_file cifar10/trained_nets/resnet56_sgd_lr=0.1_bs=128_wd=0.0005/model_300.t7 \
--dir_type weights --xnorm filter --xignore biasbn --ynorm filter --yignore biasbn --plot
But 24 hours later nothing has changed, and I can't find any .h5 file. Where can I find the .h5 files, or did I miss something? Hope you can help. Thanks!