tensorpack / tensorpack

A Neural Net Training Interface on TensorFlow, with focus on speed + flexibility
Apache License 2.0

Problem with PReLU #501

Closed chaow94 closed 6 years ago

chaow94 commented 6 years ago

I want to change tf.nn.relu to PReLU in alexnet-dorefa.py (the `PReLU` function in tensorpack/models/nonlin.py). When I use `from tensorpack.models import PReLU` it doesn't work, and neither does `from tensorpack.models.nonlin import PReLU`:

```python
def nonlin(x):
    if BITA == 32:
        # return tf.nn.relu(x)  # still use relu for 32bit cases
        return PReLU(x)
    return tf.clip_by_value(x, 0.0, 1.0)

def activate(x):
    return fa(nonlin(x))
```

It doesn't work either. Log:

```
Failed to load OpenCL runtime (expected version 1.1+)
[1121 13:53:19 @alexnet-dorefa.py:305] Batch per tower: 128
[1121 13:53:19 @logger.py:94] WRN Log directory train_log/alexnet-dorefa exists! Please either backup/delete it, or use a new directory.
[1121 13:53:19 @logger.py:96] WRN If you're resuming from a previous run you can choose to keep it.
[1121 13:53:19 @logger.py:97] Select Action: k (keep) / b (backup) / d (delete) / n (new) / q (quit): d
[1121 13:53:22 @logger.py:74] Argv: alexnet-dorefa.py --dorefa 1,32,32 --data /resources/data/ILSVRC2012/images/ --gpu 0
[1121 13:53:23 @fs.py:89] WRN Env var $TENSORPACK_DATASET not set, using /home/xxxx/tensorpack_data for datasets.
[1121 13:53:25 @prefetch.py:263] [PrefetchDataZMQ] Will fork a dataflow more than one times. This assumes the datapoints are i.i.d.
[1121 13:53:25 @ilsvrc.py:118] Assuming directory /resources/data/ILSVRC2012/images/val has original structure.
[1121 13:53:25 @inference_runner.py:82] InferenceRunner will eval on an InputSource of size 391
[1121 13:53:25 @input_source.py:180] Setting up the queue 'QueueInput/input_queue' for CPU prefetching ...
[1121 13:53:25 @training.py:90] Building graph for training tower 0 on device LeastLoadedDeviceSetter-/gpu:0...
[1121 13:53:25 @registry.py:121] conv0 input: [None, 224, 224, 3]
[1121 13:53:25 @registry.py:129] conv0 output: [None, 54, 54, 96]
Traceback (most recent call last):
  File "alexnet-dorefa.py", line 310, in <module>
    launch_train_with_config(config, SyncMultiGPUTrainer(nr_tower))
  File "/home/xxxx/.local/lib/python2.7/site-packages/tensorpack/train/interface.py", line 88, in launch_train_with_config
    model._build_graph_get_cost, model.get_optimizer)
  File "/home/xxxx/.local/lib/python2.7/site-packages/tensorpack/utils/argtools.py", line 165, in wrapper
    return func(*args, **kwargs)
  File "/home/xxxx/.local/lib/python2.7/site-packages/tensorpack/train/tower.py", line 137, in setup_graph
    train_callbacks = self._setup_graph(input, get_cost_fn, get_opt_fn)
  File "/home/xxxx/.local/lib/python2.7/site-packages/tensorpack/train/trainers.py", line 79, in _setup_graph
    self._make_get_grad_fn(input, get_cost_fn, get_opt_fn), get_opt_fn)
  File "/home/xxxx/.local/lib/python2.7/site-packages/tensorpack/graph_builder/training.py", line 137, in build
    grad_list = DataParallelBuilder.build_on_towers(self.towers, get_grad_fn, devices)
  File "/home/xxxx/.local/lib/python2.7/site-packages/tensorpack/graph_builder/training.py", line 95, in build_on_towers
    ret.append(func())
  File "/home/xxxx/.local/lib/python2.7/site-packages/tensorpack/train/tower.py", line 166, in get_grad_fn
    cost = get_cost_fn(*input.get_input_tensors())
  File "/home/xxxx/.local/lib/python2.7/site-packages/tensorpack/tfutils/tower.py", line 198, in __call__
    output = self._tower_fn(*args)
  File "/home/xxxx/.local/lib/python2.7/site-packages/tensorpack/graph_builder/model_desc.py", line 169, in _build_graph_get_cost
    self.build_graph(inputs)
  File "/home/xxxx/.local/lib/python2.7/site-packages/tensorpack/graph_builder/model_desc.py", line 119, in build_graph
    self._build_graph(inputs)
  File "alexnet-dorefa.py", line 123, in _build_graph
    .apply(activate)
  File "/home/xxxx/.local/lib/python2.7/site-packages/tensorpack/models/linearwrap.py", line 75, in apply
    ret = func(self._t, *args, **kwargs)
  File "alexnet-dorefa.py", line 113, in activate
    return fa(nonlin(x))
  File "alexnet-dorefa.py", line 105, in nonlin
    return PReLU(x)
  File "/home/xxxx/.local/lib/python2.7/site-packages/tensorpack/models/registry.py", line 83, in wrapped_func
    name, inputs = args[0], args[1]
IndexError: tuple index out of range
[1]+  Killed    python alexnet-dorefa.py --dorefa 1,32,32 --data /resources/data/ILSVRC2012/images/ --gpu 0
```

Thanks ~

ppwwyyxx commented 6 years ago

PReLU takes two arguments. See http://tensorpack.readthedocs.io/en/latest/modules/models.html#tensorpack.models.PReLU.
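For reference, PReLU computes `max(0, x) + alpha * min(0, x)` with a learned slope `alpha` for negative inputs. A minimal NumPy sketch of the math (this is an illustration only, not tensorpack's registered layer, which additionally takes a variable-scope name as its first argument):

```python
import numpy as np

def prelu(x, alpha=0.25):
    """Parametric ReLU: identity for positive inputs, alpha-scaled for negatives."""
    return np.maximum(0.0, x) + alpha * np.minimum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(prelu(x))  # [-0.5   -0.125  0.     1.     3.   ]
```

In tensorpack the `alpha` is a trainable variable created under the scope name passed as the first argument, which is why the call needs two arguments.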

chaow94 commented 6 years ago

Thanks.

chaow94 commented 6 years ago

Sir, I used `return PReLU('prelu', x)` and it still doesn't run:

```
(tensorflow) Precision-Tower-7910:~/tensorflow/BNN/DoReFa-Net$ python alexnet-dorefa.py --dorefa 1,32,32 --data /resources/data/ILSVRC2012/images/ --gpu 0
Failed to load OpenCL runtime (expected version 1.1+)
[1121 14:09:59 @alexnet-dorefa.py:305] Batch per tower: 128
[1121 14:09:59 @logger.py:94] WRN Log directory train_log/alexnet-dorefa exists! Please either backup/delete it, or use a new directory.
[1121 14:09:59 @logger.py:96] WRN If you're resuming from a previous run you can choose to keep it.
[1121 14:09:59 @logger.py:97] Select Action: k (keep) / b (backup) / d (delete) / n (new) / q (quit): d
[1121 14:10:00 @logger.py:74] Argv: alexnet-dorefa.py --dorefa 1,32,32 --data /resources/data/ILSVRC2012/images/ --gpu 0
[1121 14:10:00 @fs.py:89] WRN Env var $TENSORPACK_DATASET not set, using /home/xxxxx/tensorpack_data for datasets.
[1121 14:10:02 @prefetch.py:263] [PrefetchDataZMQ] Will fork a dataflow more than one times. This assumes the datapoints are i.i.d.
[1121 14:10:02 @ilsvrc.py:118] Assuming directory /resources/data/ILSVRC2012/images/val has original structure.
[1121 14:10:02 @inference_runner.py:82] InferenceRunner will eval on an InputSource of size 391
[1121 14:10:02 @input_source.py:180] Setting up the queue 'QueueInput/input_queue' for CPU prefetching ...
[1121 14:10:02 @training.py:90] Building graph for training tower 0 on device LeastLoadedDeviceSetter-/gpu:0...
[1121 14:10:02 @registry.py:121] conv0 input: [None, 224, 224, 3]
[1121 14:10:02 @registry.py:129] conv0 output: [None, 54, 54, 96]
[1121 14:10:02 @registry.py:121] conv1 input: [None, 54, 54, 96]
[1121 14:10:02 @alexnet-dorefa.py:94] Binarizing weight conv1/W mul:0
[1121 14:10:02 @registry.py:129] conv1 output: [None, 54, 54, 256]
[1121 14:10:02 @registry.py:121] pool1 input: [None, 54, 54, 256]
[1121 14:10:02 @registry.py:129] pool1 output: [None, 27, 27, 256]
Traceback (most recent call last):
  File "alexnet-dorefa.py", line 310, in <module>
    launch_train_with_config(config, SyncMultiGPUTrainer(nr_tower))
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorpack/train/interface.py", line 88, in launch_train_with_config
    model._build_graph_get_cost, model.get_optimizer)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorpack/utils/argtools.py", line 165, in wrapper
    return func(*args, **kwargs)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorpack/train/tower.py", line 137, in setup_graph
    train_callbacks = self._setup_graph(input, get_cost_fn, get_opt_fn)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorpack/train/trainers.py", line 79, in _setup_graph
    self._make_get_grad_fn(input, get_cost_fn, get_opt_fn), get_opt_fn)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorpack/graph_builder/training.py", line 137, in build
    grad_list = DataParallelBuilder.build_on_towers(self.towers, get_grad_fn, devices)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorpack/graph_builder/training.py", line 95, in build_on_towers
    ret.append(func())
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorpack/train/tower.py", line 166, in get_grad_fn
    cost = get_cost_fn(*input.get_input_tensors())
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorpack/tfutils/tower.py", line 198, in __call__
    output = self._tower_fn(*args)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorpack/graph_builder/model_desc.py", line 169, in _build_graph_get_cost
    self.build_graph(inputs)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorpack/graph_builder/model_desc.py", line 119, in build_graph
    self._build_graph(inputs)
  File "alexnet-dorefa.py", line 128, in _build_graph
    .apply(activate)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorpack/models/linearwrap.py", line 75, in apply
    ret = func(self._t, *args, **kwargs)
  File "alexnet-dorefa.py", line 113, in activate
    return fa(nonlin(x))
  File "alexnet-dorefa.py", line 105, in nonlin
    return PReLU('prelu', x)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorpack/models/registry.py", line 124, in wrapped_func
    outputs = func(*args, **actual_args)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorpack/models/nonlin.py", line 55, in PReLU
    alpha = tf.get_variable('alpha', [], initializer=init)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 1065, in get_variable
    use_resource=use_resource, custom_getter=custom_getter)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 962, in get_variable
    use_resource=use_resource, custom_getter=custom_getter)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 360, in get_variable
    validate_shape=validate_shape, use_resource=use_resource)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorpack/tfutils/varreplace.py", line 53, in custom_getter
    v = getter(*args, **kwargs)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 352, in _true_getter
    use_resource=use_resource)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 664, in _get_single_variable
    name, "".join(traceback.format_list(tb))))
ValueError: Variable prelu/alpha already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:

  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorpack/tfutils/varreplace.py", line 53, in custom_getter
    v = getter(*args, **kwargs)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorpack/models/nonlin.py", line 55, in PReLU
    alpha = tf.get_variable('alpha', [], initializer=init)
  File "/home/xxxxx/.local/lib/python2.7/site-packages/tensorpack/models/registry.py", line 124, in wrapped_func
    outputs = func(*args, **actual_args)
```

chaow94 commented 6 years ago

I just changed `return tf.nn.relu(x)` to `return PReLU('prelu', x)` in alexnet-dorefa.py.

ppwwyyxx commented 6 years ago

The line `PReLU('prelu', x)` gets executed multiple times. You have to use a different name each time.
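One way to get a different name per call (a hypothetical sketch, not a tensorpack helper: `next_prelu_name` and the counter are names invented here for illustration):

```python
import itertools

# Each call to nonlin() needs its own variable scope, otherwise every
# PReLU tries to create the same 'prelu/alpha' variable. A module-level
# counter yields a fresh scope name per activation.
_prelu_ids = itertools.count()

def next_prelu_name():
    return 'prelu{}'.format(next(_prelu_ids))

print(next_prelu_name())  # prelu0
print(next_prelu_name())  # prelu1
```

Inside `nonlin`, `return PReLU(next_prelu_name(), x)` would then create `prelu0/alpha`, `prelu1/alpha`, and so on, avoiding the "Variable prelu/alpha already exists" error.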

chaow94 commented 6 years ago

Thanks. I can implement it for the BITA == 32 case. And using PReLU for activations that are not 32-bit would make no difference. Am I right?

ppwwyyxx commented 6 years ago

You're the one who wrote the PReLU code, so you'll have to make this decision yourself.

chaow94 commented 6 years ago

OK, thanks.

Ancho5515 commented 5 years ago

@erdollar Hello! I want to know how your net works with PReLU. Can you share some results?