ChangYong-Oh / HyperSphere

Using other test functions #3

Open aminnayebi opened 5 years ago

aminnayebi commented 5 years ago

Hi

I have a question about the other benchmarks in your code. Besides the synthetic functions, I see some other functions such as mnist, depth resNet, and cifar10. Can I use these functions in my own code to evaluate my own algorithm? If so, how do I call them? Should I simply call, for example, output=mnist_weight(input)?

Another question: what is the search space for the inputs of these benchmarks? Is it [-1,1]^D?

Regards, Amin

ChangYong-Oh commented 5 years ago

The search space should be transformed to hypercube [-1, 1]^D.

For mnist, you can call it as you described. For SDResnet you can call it in the same way, but you also need to set up the separate SDResnet training code, which is forked at https://github.com/ChangYong-Oh/img_classification_pk_pytorch
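
As a rough illustration of the [-1, 1]^D convention mentioned above, the sketch below rescales a point between a box with per-dimension bounds and the hypercube. The function names and the example bounds are hypothetical and are not taken from the HyperSphere code.

```python
def to_hypercube(x, lower, upper):
    """Map a point from the box [lower, upper] to [-1, 1]^D, per dimension."""
    return [2.0 * (xi - lo) / (hi - lo) - 1.0 for xi, lo, hi in zip(x, lower, upper)]


def from_hypercube(u, lower, upper):
    """Map a point from [-1, 1]^D back to the box [lower, upper]."""
    return [lo + (ui + 1.0) * (hi - lo) / 2.0 for ui, lo, hi in zip(u, lower, upper)]


# Hypothetical bounds for a 3-dimensional problem.
lower = [0.0, 0.0, 0.0]
upper = [10.0, 10.0, 10.0]

point = from_hypercube([-1.0, 0.0, 1.0], lower, upper)
print(point)
print(to_hypercube(point, lower, upper))
```

With a pair of helpers like these, an optimizer can work entirely in [-1, 1]^D and only convert to the original ranges when the objective is evaluated.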

aminnayebi commented 5 years ago

Thanks. What do you mean by "setting that forked code"? Should I put that code in the directory of test functions?

My other question: what are the differences between the cifar10_weight and stochastic_depth_resnet test functions? Are they similar to each other?

ChangYong-Oh commented 5 years ago

In the stochastic depth resnet test, you can see the variable cmd_str, which means that it runs a console command that uses external SD-resnet training code. The variable 'stochastic_depth_dir' should point to that SD-resnet training code, which is in the repo linked above.

Cifar10_weight is just another version of mnist_weight.

aminnayebi commented 5 years ago

To run Cifar10_weight, I downloaded the code that you provided and put it in a directory that matches the 'stochastic_depth_dir' variable (the code is in the 'img_classification_pk_pytorch' folder in my home directory). But when I try to run it, I still get this error:

Traceback (most recent call last):
  File "HyperSphere/HyperSphere/test_functions/cifar10_weight.py", line 98, in <module>
    print('Return value : ', _stochastic_depth_resnet(torch.rand(54), 'cifar100+'))
  File "HyperSphere/HyperSphere/test_functions/cifar10_weight.py", line 67, in _stochastic_depth_resnet
    probability_file = open(probability_filename, 'w')
IOError: [Errno 2] No such file or directory: '/home/u22/aminnayebi/img_classification_pk_pytorch/save/stochastic_depth_death_rate_cifar100+_20190104-21:46:38:494967.pkl'

Thank you.

ChangYong-Oh commented 5 years ago

cifar10_weight is not complete code. I don't recommend running it, and it was not used for the paper. Anyway, the error message says there is no such file or directory, which means that your path setting is incorrect or the file does not exist.

aminnayebi commented 5 years ago

Yes, there is no such file. Actually, it seems that your code is opening a pickle file which does not exist. How about the 'stochastic_depth_resnet' test function? Is that complete? When I try to run that file, I still get the same error as above.

ChangYong-Oh commented 5 years ago

The line

probability_file = open(probability_filename, 'w')

opens a new file for writing, so the file does not need to exist for that line to work. Instead, check that the directory exists.
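
To illustrate the point about the directory: opening a file in 'w' mode creates the file but not its parent directory, so the parent has to exist first. The sketch below is a minimal example of that check, not a change to the repository code; the helper name and the file name example.pkl are hypothetical, and a temporary directory stands in for the real training-code folder.

```python
import os
import tempfile


def open_for_writing(path):
    """Create the parent directory if it is missing, then open the file in 'w' mode."""
    parent = os.path.dirname(path)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)
    return open(path, "w")


# A temporary directory stands in for the training-code folder.
base = tempfile.mkdtemp()
target = os.path.join(base, "save", "example.pkl")

# Without the makedirs step, open(target, 'w') would fail with "No such file or directory".
with open_for_writing(target) as f:
    f.write("placeholder")

print(os.path.isfile(target))
```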

aminnayebi commented 5 years ago

Ok, so do you mean that I should comment out lines 65 to 70 and ignore them?

ChangYong-Oh commented 5 years ago

No. If you comment them out, the code won't work.

Do you have the directory /home/u22/aminnayebi/img_classification_pk_pytorch/save/ ?

aminnayebi commented 5 years ago

No I do not have it. Do I need to make it?

ChangYong-Oh commented 5 years ago

Yes.

aminnayebi commented 5 years ago

Thanks. It worked. Now I am getting another error

Traceback (most recent call last):
  File "HyperSphere/HyperSphere/test_functions/stochastic_depth_resnet.py", line 97, in <module>
    print('Return value : ', _stochastic_depth_resnet(torch.rand(54), 'cifar100+'))
  File "HyperSphere/HyperSphere/test_functions/stochastic_depth_resnet.py", line 68, in _stochastic_depth_resnet
    probability_list = transform_with_center(probability_tensor, 0.5)
  File "HyperSphere/HyperSphere/test_functions/stochastic_depth_resnet.py", line 31, in transform_with_center
    center_probability = x.data.clone() * 0 + center_probability
  File "/home/u22/aminnayebi/.local/lib/python2.7/site-packages/torch/tensor.py", line 407, in data
    raise RuntimeError('cannot call .data on a torch.Tensor: did you intend to use autograd.Variable?')
RuntimeError: cannot call .data on a torch.Tensor: did you intend to use autograd.Variable?

It seems that the variable x does not have a usable .data attribute.

ChangYong-Oh commented 5 years ago

I am not so sure about this error. It seems that your diagnosis is correct. You can just do x.clone()

aminnayebi commented 5 years ago

Thanks, it worked. I want to point out something in your code. In line 72 of 'stochastic_depth_resnet.py', when the device does not have a GPU, GPUtil.getAvailable returns []. In this case, GPUtil.getAvailable()[0] does not exist and your code raises an error. I fixed it on my system by just using gpu_device=str(0) instead.
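
The failure described above comes from indexing an empty list. A minimal sketch of a guard, assuming the list of available GPU ids has already been obtained (for example from GPUtil.getAvailable()); the helper name and the default value are hypothetical and not part of stochastic_depth_resnet.py:

```python
def pick_gpu_device(available_ids, default="0"):
    """Return the first available GPU id as a string, or a default when the list is empty."""
    if available_ids:
        return str(available_ids[0])
    # Assumption: falling back to "0" mirrors the workaround reported above.
    return default


print(pick_gpu_device([2, 3]))
print(pick_gpu_device([]))
```

Whether device "0" is a sensible fallback on a machine with no GPU at all depends on how the training script handles CUDA_VISIBLE_DEVICES, so this only removes the IndexError.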

aminnayebi commented 5 years ago

I have another general question. It seems that the code in "cifar10_weight.py" and "stochastic_depth_resnet.py" is the same. Is that correct? Are you saying that both of these files (which seem to be the same) are not complete yet and I cannot use them? Since I hit another error, I first want to know whether it is worth continuing to try to fix them.

ChangYong-Oh commented 5 years ago

cifar10_weight.py is not complete code. stochastic_depth_resnet.py was used to produce the numbers in the paper, so using stochastic_depth_resnet.py should be fine. However, the information about stochastic_depth_resnet.py is not complete yet. You can ask me questions about that.

aminnayebi commented 5 years ago

Thanks. Let me first ask a question about mnist_weight. I am running your code with the test function "mnist_weight" on a GPU node. The code starts by sampling 2 points, but it has been stuck in the "Acquisition function optimization with 2 inits" step for 2 hours and does not proceed. Is this normal, or am I doing something wrong?

ChangYong-Oh commented 5 years ago

It shouldn't take that long. Evaluation of mnist_weight shows a progress bar.

aminnayebi commented 5 years ago

Could you please take a look at this screenshot? It might show what I am doing wrong.

image

After 2 hours, the algorithm shows no progress and the log files folder is empty. I tested the algorithm on the Branin function, and it worked well. But when I use the mnist_weight function, it gets stuck.

ChangYong-Oh commented 5 years ago

This looks OK. It seems like acquisition function optimization takes that long, which is a bit weird.

aminnayebi commented 5 years ago

Hi,

I am using the mnist_weight function, and I am facing an error. The error is:

Traceback (most recent call last):
  File "HyperSphere/HyperSphere/BO/run_BO.py", line 248, in <module>
    print(BO(**vars(args)))
  File "HyperSphere/HyperSphere/BO/run_BO.py", line 172, in BO
    output = torch.cat([output, func(x_input[-1]).resize(1, 1)])
AttributeError: 'torch.FloatTensor' object has no attribute 'resize'

It seems that the output of the mnist_weight function is different from what run_BO.py expects. How can I fix this problem?

ChangYong-Oh commented 5 years ago

Can you try resize_ (with an underscore)?

aminnayebi commented 5 years ago

Ok, I tried that. For the mnist_weight test function, I get this error:

    output = torch.cat([output, func(x_input[-1]).resize_(1, 1)])
RuntimeError: expected Variable as element 1 in argument 0, but got list

And for other test functions such as Branin, I get this error:

    output = torch.cat([output, func(x_input[-1]).resize_(1, 1)])
AttributeError: 'Variable' object has no attribute 'resize_'

ChangYong-Oh commented 5 years ago

how about .view(1, 1)?

aminnayebi commented 5 years ago

It works for Branin function, but the error for mnist_weight is still the same.

    output = torch.cat([output, func(x_input[-1]).resize_(1, 1)])
RuntimeError: expected Variable as element 1 in argument 0, but got list

ChangYong-Oh commented 5 years ago

Can you print the output of func(x_input[-1])?

aminnayebi commented 5 years ago

Here is the output of func(x_input[-1]):

 0.3951
[torch.FloatTensor of size 1x1]

ChangYong-Oh commented 5 years ago

func(x_input[-1]).view(1, 1)

is not working?

aminnayebi commented 5 years ago

No, it does not work.

    output = torch.cat([output, func(x_input[-1]).view(1, 1)])
RuntimeError: expected Variable as element 1 in argument 0, but got list

ChangYong-Oh commented 5 years ago

Check whether output is a Variable; if it is, please convert func(x_input[-1]) to a Variable as well.

aminnayebi commented 5 years ago

I am a little confused. Do I have to check whether "output" is a Variable, or "func(x_input[-1])"? Also, how can I check whether something is a Variable, and how can I convert it? Thanks.

ChangYong-Oh commented 5 years ago

From the error message, it seems that the type of output and the type of func(x_input[-1]) do not match, so we should make them the same. First check the type of output, so that we can change the type of func(......) to match it. To convert to a Variable, just call Variable(func(x_input[-1])). To go the other way, from a Variable x to a tensor, use x.data.

aminnayebi commented 5 years ago

Thanks, I was able to fix it. I replaced line 172 of run_BO.py with the lines below. They might be useful for others, so I am posting them here.

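# Assumes run_BO.py already has: from torch.autograd import Variable
new_output = func(x_input[-1])
# Wrap a plain tensor so that both arguments to torch.cat are Variables.
if not isinstance(new_output, Variable):
    new_output = Variable(new_output)
output = torch.cat([output, new_output.view(1, 1)])

aminnayebi commented 5 years ago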

I am trying to use the stochastic_depth_resnet test function. I made some corrections based on the errors I got, but I could not fix this one. Could you please help me?

====================COMMAND====================
cd /home/u22/aminnayebi/img_classification_pk_pytorch;CUDA_VISIBLE_DEVICES=0 python main.py --data cifar100+  --normalized --save /home/u22/aminnayebi/img_classification_pk_pytorch/save/cifar100+_20190128-02:39:05:153965 --death-mode chosen --death-rate-filename /home/u22/aminnayebi/img_classification_pk_pytorch/save/stochastic_depth_death_rate_cifar100+_20190128-02:39:05:153965.pkl --decay_rate 0.1 --decay_epoch_ratio 0.5 --learning-rate 0.01 --epoch 250
=> creating model 'resnet'
Create ResNet-56 for cifar100+
Traceback (most recent call last):
  File "main.py", line 200, in <module>
    validation_error = main()
  File "main.py", line 87, in main
    model = getModel(**vars(args))
  File "main.py", line 32, in getModel
    model = m.createModel(**kargs)
  File "/home/u22/aminnayebi/img_classification_pk_pytorch/models/resnet.py", line 163, in createModel
    assert len(death_rates) == nblocks
AssertionError
Traceback (most recent call last):
  File "stochastic_depth_resnet.py", line 99, in <module>
    print('Return value : ', _stochastic_depth_resnet(torch.rand(54), 'cifar100+'))
  File "stochastic_depth_resnet.py", line 95, in _stochastic_depth_resnet
    return torch.FloatTensor([[float(lastline)]])
ValueError: could not convert string to float: AssertionError
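
Reading the two tracebacks together, the second error looks like a consequence of the first: the training command stopped at the assertion, so the last line of its output was "AssertionError" rather than a number, and float(lastline) failed. A minimal sketch of a more explicit check, assuming the wrapper reads the result from the last output line; the helper name and the sample output strings are hypothetical and not part of stochastic_depth_resnet.py:

```python
def parse_last_line_as_float(output_text):
    """Return the last non-empty line of the output as a float, or raise a clear error."""
    lines = [line.strip() for line in output_text.splitlines() if line.strip()]
    if not lines:
        raise RuntimeError("training command produced no output")
    try:
        return float(lines[-1])
    except ValueError:
        raise RuntimeError(
            "training command did not end with a number; last line was: %r" % lines[-1]
        )


# Made-up output that ends with a numeric result.
print(parse_last_line_as_float("some log line\n0.3951\n"))

# Made-up output that ends with an exception name, as in the traceback above.
try:
    parse_last_line_as_float("Traceback (most recent call last):\nAssertionError\n")
except RuntimeError as err:
    print(err)
```

This does not fix the assertion in resnet.py itself (len(death_rates) == nblocks); it only makes the wrapper report the underlying failure instead of a float conversion error.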