louishenrifranc / hyperband

HyperBand implementation for Tensorflow models

Error when running the main script associating the inputs in the learning_iter process #1

Open dcalles opened 7 years ago

dcalles commented 7 years ago

First of all, thanks for sharing such practical and well-written code. I am trying to run it with TensorFlow 1.2, but I cannot get past the learning_iter method: TensorFlow does not seem to accept the tensor passed as input as valid. I include the output below to clarify.

Could you help me get it running? Thanks in advance!

python main.py
Extracting data/MNIST_data\train-images-idx3-ubyte.gz
Extracting data/MNIST_data\train-labels-idx1-ubyte.gz
Extracting data/MNIST_data\t10k-images-idx3-ubyte.gz
Extracting data/MNIST_data\t10k-labels-idx1-ubyte.gz
[DEBUG HYPERBAND] s_max = 3, B = 120
[DEBUG HYPERBAND] Iteration s = 3 n = 27, r = 1.1111111111111112
2017-08-18 14:26:02.814491: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2017-08-18 14:26:02.814807: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-18 14:26:02.814948: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-18 14:26:02.815050: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-18 14:26:02.815147: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-18 14:26:02.815228: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-08-18 14:26:02.815310: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-18 14:26:02.815481: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-08-18 14:26:03.084978: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:940] Found device 0 with properties:
name: GeForce GTX 970
major: 5 minor: 2 memoryClockRate (GHz) 1.253
pciBusID 0000:01:00.0
Total memory: 4.00GiB
Free memory: 3.31GiB
2017-08-18 14:26:03.085488: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:961] DMA: 0
2017-08-18 14:26:03.085717: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:971] 0: Y
2017-08-18 14:26:03.085820: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 970, pci bus id: 0000:01:00.0)
Traceback (most recent call last):
  File "C:\Users\Arsis\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 942, in _run
    allow_operation=False)
  File "C:\Users\Arsis\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 2584, in as_graph_element
    return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
  File "C:\Users\Arsis\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 2663, in _as_graph_element_locked
    raise ValueError("Tensor %s is not an element of this graph." % obj)
ValueError: Tensor Tensor("use_batch_norm.true_lr.0.001_keep_prob.1_batch_size.64/Placeholder:0", shape=(?, 28, 28, 3), dtype=float32) is not an element of this graph.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 77, in <module>
    tf.app.run()
  File "C:\Users\Arsis\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "main.py", line 63, in main
    cf.search()
  File "D:\IndFin\Negocios\Programacion\ML\TensorFlow\HP\hyperband\hpsearch\HyperBand.py", line 55, in search
    self.run_then_return_val_loss(models, r_i)
  File "D:\IndFin\Negocios\Programacion\ML\TensorFlow\HP\hyperband\hpsearch\HyperBand.py", line 90, in run_then_return_val_loss
    epoch_to_stop=r_i)
  File "D:\IndFin\Negocios\Programacion\ML\TensorFlow\HP\hyperband\models\basic_model.py", line 88, in train
    self.learning_iter(dataset, sess)
  File "D:\IndFin\Negocios\Programacion\ML\TensorFlow\HP\hyperband\models\cifar_model.py", line 80, in learning_iter
    self.labels: labels
  File "C:\Users\Arsis\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 789, in run
    run_metadata_ptr)
  File "C:\Users\Arsis\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\client\session.py", line 945, in _run
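
For readers trying to interpret the `[DEBUG HYPERBAND]` lines in the output above: the values `s_max = 3, B = 120` and `n = 27, r = 1.11...` are consistent with the bracket formulas from the Hyperband paper if the maximum resource per configuration is 30 and the downsampling rate is 3. Both of those settings are inferred from the printed numbers, not taken from the repository, and the function name below is hypothetical. A minimal sketch under those assumptions:

```python
def hyperband_bracket(max_iter, eta, s):
    """Return (s_max, budget, n, r) for bracket s using the Hyperband paper's formulas.

    max_iter and eta are assumed inputs; the repository may compute these differently.
    """
    # s_max = floor(log_eta(max_iter)), found with integers to avoid float rounding.
    s_max = 0
    while eta ** (s_max + 1) <= max_iter:
        s_max += 1

    # Total budget allotted to each bracket: B = (s_max + 1) * max_iter.
    budget = (s_max + 1) * max_iter

    # Number of configurations: n = ceil(B / max_iter * eta^s / (s + 1)),
    # written as an integer ceiling division.
    numerator = (s_max + 1) * eta ** s
    n = (numerator + s) // (s + 1)

    # Initial resource given to each configuration: r = max_iter / eta^s.
    r = max_iter / eta ** s
    return s_max, budget, n, r


if __name__ == "__main__":
    # Assumed settings (inferred from the log): max_iter = 30, eta = 3.
    s_max, budget, n, r = hyperband_bracket(max_iter=30, eta=3, s=3)
    print("s_max = {}, B = {}".format(s_max, budget))
    print("n = {}, r = {}".format(n, r))
```

The fractional `r` suggests that the first bracket trains each of the 27 configurations for roughly one epoch, so the failure reported here happens on the very first training call rather than deep into the search.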

louishenrifranc commented 7 years ago

Hi @dcalles, thanks for pointing out the error. I think I messed up the code at some point and haven't fixed it. If I find some time during the weekend, I'll do it!

dcalles commented 7 years ago

Thanks for the response. If you need anything, tell me how I can help.

dcalles commented 7 years ago

If you have time, or if you can give me some guidance, please do. Your approach is the best I have seen for Hyperband and TensorFlow, and it would be a pity if it could no longer be used.

louishenrifranc commented 7 years ago

Hey @dcalles, I've updated the code in #2, and the algorithm is now running. I would be pleased if you used it, chased down other bugs, or even extended it with new features. Hope you'll have some fun 👍

dcalles commented 7 years ago

Thanks @louishenrifranc, you have done excellent work! Just to point out a small error I found: you need to change the first line of HyperBand.py to:

from models.mnist_model import MNISTModel

Apart from that, would it be possible to serialize the data in order to see the steps in TensorBoard, as explained in https://www.tensorflow.org/get_started/summaries_and_tensorboard ?

louishenrifranc commented 7 years ago

Yes, of course, I don't see why it should not be possible! I would do something like this:

louishenrifranc commented 7 years ago

Thanks for finding the bug. I'll leave the code in the PR because I haven't tested it enough. However, I think you now have the tools to play with it and make sure it works correctly. I look forward to your pull request if you find a bug.