thulab / DeepHash

An Open-Source Package for Deep Learning to Hash (DeepHash)
MIT License

DTQ: Activation of the Quantization loss #36

Closed hbellafkir closed 5 years ago

hbellafkir commented 5 years ago

Hey, I started a training run with the quantization loss enabled by changing the cq_lambda value from 0 to 0.4, and I got this error after only 2 epochs:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1278, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1263, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1350, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input is not invertible.
     [[Node: MatrixInverse_4 = MatrixInverse[T=DT_FLOAT, adjoint=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add_5)]]
     [[Node: Assign_11/_4825 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_21_Assign_11", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train_val_script.py", line 75, in <module>
    model_weights = model.train(train_img, database_img, query_img, args)
  File "/DeepHash/DeepHash/model/dtq/__init__.py", line 9, in train
    model.train_cq(img_train, img_query, img_database, config.R)
  File "/DeepHash/DeepHash/model/dtq/dtq.py", line 370, in train_cq
    maps = self.validation(img_query, img_database, R)
  File "/DeepHash/DeepHash/model/dtq/dtq.py", line 408, in validation
    self.update_codes_and_centers(img_database)
  File "/DeepHash/DeepHash/model/dtq/dtq.py", line 304, in update_codes_and_centers
    self.update_centers(img_dataset)
  File "/DeepHash/DeepHash/model/dtq/dtq.py", line 251, in update_centers
    self.img_b_all: img_dataset.codes,
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 877, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1100, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1272, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1291, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input is not invertible.
     [[Node: MatrixInverse_4 = MatrixInverse[T=DT_FLOAT, adjoint=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add_5)]]
     [[Node: Assign_11/_4825 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_21_Assign_11", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'MatrixInverse_4', defined at:
  File "train_val_script.py", line 75, in <module>
    model_weights = model.train(train_img, database_img, query_img, args)
  File "/DeepHash/DeepHash/model/dtq/__init__.py", line 9, in train
    model.train_cq(img_train, img_query, img_database, config.R)
  File "/DeepHash/DeepHash/model/dtq/dtq.py", line 370, in train_cq
    maps = self.validation(img_query, img_database, R)
  File "/DeepHash/DeepHash/model/dtq/dtq.py", line 408, in validation
    self.update_codes_and_centers(img_database)
  File "/DeepHash/DeepHash/model/dtq/dtq.py", line 304, in update_codes_and_centers
    self.update_centers(img_dataset)
  File "/DeepHash/DeepHash/model/dtq/dtq.py", line 246, in update_centers
    compute_centers = tf.matmul(tf.matrix_inverse(hh), Uh)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_linalg_ops.py", line 1049, in matrix_inverse
    "MatrixInverse", input=input, adjoint=adjoint, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1717, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Input is not invertible.
     [[Node: MatrixInverse_4 = MatrixInverse[T=DT_FLOAT, adjoint=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add_5)]]
     [[Node: Assign_11/_4825 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_21_Assign_11", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

TensorFlow version: 1.10.0

bl0 commented 5 years ago

Hi, the error log shows that the matrix passed to tf.matrix_inverse in update_centers is not invertible, so I think you can try decreasing cq_lambda.
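
The op that fails in the traceback is `tf.matrix_inverse(hh)` inside `update_centers`. As a rough sketch of a common workaround for this class of error (it is not the fix applied in DeepHash and was not proposed in this thread), one can add a small ridge term to the diagonal and solve the linear system instead of forming an explicit inverse. The example uses NumPy rather than TensorFlow 1.10 to stay self-contained. Only the names `hh` and `Uh` come from the traceback; `solve_centers`, `ridge`, and the sample matrices are hypothetical.

```python
import numpy as np


def solve_centers(hh, Uh, ridge=1e-6):
    """Return C such that (hh + ridge_term * I) @ C = Uh.

    Hypothetical helper: `hh` and `Uh` mirror the names in the traceback;
    everything else is an assumption made for illustration.
    """
    hh = np.asarray(hh, dtype=np.float64)
    Uh = np.asarray(Uh, dtype=np.float64)
    n = hh.shape[0]
    # Scale the ridge by the average diagonal magnitude (at least 1.0) so it
    # does not vanish relative to the entries of hh.
    scale = max(np.trace(np.abs(hh)) / n, 1.0)
    regularized = hh + ridge * scale * np.eye(n)
    # Solving the linear system avoids forming an explicit inverse.
    return np.linalg.solve(regularized, Uh)


# A rank-deficient matrix, for which a plain inverse is not defined.
singular_hh = np.array([[1.0, 1.0], [1.0, 1.0]])
rhs = np.array([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]])
centers = solve_centers(singular_hh, rhs)
print(centers.shape, np.isfinite(centers).all())
```

The ridge term slightly biases the result, so whether this is acceptable for the DTQ center update is an open question. In TensorFlow 1.x the closest counterparts would probably be `tf.matrix_solve` or `tf.matrix_solve_ls` (the latter takes an `l2_regularizer` argument), but that substitution has not been tested against this code base.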

bl0 commented 5 years ago

Furthermore, I have fixed a bug in DTQ: https://github.com/thulab/DeepHash/issues/35. It may help you.