ethz-asl / hfnet

From Coarse to Fine: Robust Hierarchical Localization at Large Scale with HF-Net (https://arxiv.org/abs/1812.03506)
MIT License

NetVLAD Descriptors for Training #58

Open holestine opened 3 years ago

holestine commented 3 years ago

I've been able to export the SuperPoint descriptors, but I get an error when running the script that exports the NetVLAD descriptors. The error seems to indicate an incompatible device, although I'm not sure what that could be: I have a Skylake processor and a GTX 1080 Ti graphics card. It occurs in base_model.py when executing

self.sess.run([tf.global_variables_initializer(), tf.local_variables_initializer()])

Any idea how to work around this issue? The complete output is below. Thanks.

Traceback (most recent call last):
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call return fn(*args)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1339, in _run_fn self._extend_graph()
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1374, in _extend_graph tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/RandomUniform: Could not satisfy explicit device specification '' because the node {{colocation_node vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/RandomUniform}} was colocated with a group of nodes that required incompatible device '/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0].
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_nameindex=-1 requested_devicename='/device:GPU:0' assigned_devicename='' resource_devicename='/device:GPU:0' supported_devicetypes=[CPU] possibledevices=[]
Assign: CPU
Identity: CPU XLA_CPU
VariableV2: CPU
Mul: CPU XLA_CPU
Add: CPU XLA_CPU
Sub: CPU XLA_CPU
RandomUniform: CPU XLA_CPU
Const: CPU XLA_CPU

Colocation members, user-requested devices, and framework assigned devices, if any:
  vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/shape (Const)
  vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/min (Const)
  vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/max (Const)
  vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/RandomUniform (RandomUniform)
  vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/sub (Sub)
  vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/mul (Mul)
  vgg16_netvlad_pca/average_rgb/Initializer/random_uniform (Add)
  vgg16_netvlad_pca/average_rgb (VariableV2) /device:GPU:0
  vgg16_netvlad_pca/average_rgb/Assign (Assign) /device:GPU:0
  vgg16_netvlad_pca/average_rgb/read (Identity) /device:GPU:0

     [[{{node vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/RandomUniform}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/runpy.py", line 193, in _run_module_as_main "main", mod_spec)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/runpy.py", line 85, in _run_code exec(code, run_globals)
  File "/home/brian/.vscode/extensions/ms-python.python-2021.3.680753044/pythonFiles/lib/python/debugpy/main.py", line 45, in cli.main()
  File "/home/brian/.vscode/extensions/ms-python.python-2021.3.680753044/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main run()
  File "/home/brian/.vscode/extensions/ms-python.python-2021.3.680753044/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file runpy.run_path(target_as_str, run_name=compat.force_str("main"))
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/runpy.py", line 263, in run_path pkg_name=pkg_name, script_name=fname)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/runpy.py", line 96, in _run_module_code mod_name, mod_spec, pkg_name, script_name)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/runpy.py", line 85, in _run_code exec(code, run_globals)
  File "hfnet/export_predictions.py", line 59, in **config['model']) as net:
  File "/home/brian/Desktop/hfnet/hfnet/models/base_model.py", line 125, in init self._build_graph()
  File "/home/brian/Desktop/hfnet/hfnet/models/base_model.py", line 293, in _build_graph l])
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 950, in run run_metadata_ptr)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1173, in _run feed_dict_tensor, options, run_metadata)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1350, in _do_run run_metadata)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1370, in _do_call raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/RandomUniform: Could not satisfy explicit device specification '' because the node node vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/RandomUniform (defined at /home/brian/Desktop/hfnet/hfnet/models/netvlad_original.py:78) placed on device Device assignments active during op 'vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/RandomUniform' creation: with tf.device(None): </home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:1696> with tf.device(/gpu:0): </home/brian/Desktop/hfnet/hfnet/models/base_model.py:230> was colocated with a group of nodes that required incompatible device '/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0].
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_nameindex=-1 requested_devicename='/device:GPU:0' assigned_devicename='' resource_devicename='/device:GPU:0' supported_devicetypes=[CPU] possibledevices=[]
Assign: CPU
Identity: CPU XLA_CPU
VariableV2: CPU
Mul: CPU XLA_CPU
Add: CPU XLA_CPU
Sub: CPU XLA_CPU
RandomUniform: CPU XLA_CPU
Const: CPU XLA_CPU

Colocation members, user-requested devices, and framework assigned devices, if any:
  vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/shape (Const)
  vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/min (Const)
  vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/max (Const)
  vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/RandomUniform (RandomUniform)
  vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/sub (Sub)
  vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/mul (Mul)
  vgg16_netvlad_pca/average_rgb/Initializer/random_uniform (Add)
  vgg16_netvlad_pca/average_rgb (VariableV2) /device:GPU:0
  vgg16_netvlad_pca/average_rgb/Assign (Assign) /device:GPU:0
  vgg16_netvlad_pca/average_rgb/read (Identity) /device:GPU:0

     [[node vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/RandomUniform (defined at /home/brian/Desktop/hfnet/hfnet/models/netvlad_original.py:78) ]]
Additional information about colocations: No node-device colocations were active during op 'vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/RandomUniform' creation.

Device assignments active during op 'vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/RandomUniform' creation:
  with tf.device(None): </home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/variables.py:1696>
  with tf.device(/gpu:0): </home/brian/Desktop/hfnet/hfnet/models/base_model.py:230>

Original stack trace for 'vgg16_netvlad_pca/average_rgb/Initializer/random_uniform/RandomUniform':
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/runpy.py", line 193, in _run_module_as_main "main", mod_spec)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/runpy.py", line 85, in _run_code exec(code, run_globals)
  File "/home/brian/.vscode/extensions/ms-python.python-2021.3.680753044/pythonFiles/lib/python/debugpy/main.py", line 45, in cli.main()
  File "/home/brian/.vscode/extensions/ms-python.python-2021.3.680753044/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main run()
  File "/home/brian/.vscode/extensions/ms-python.python-2021.3.680753044/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file runpy.run_path(target_as_str, run_name=compat.force_str("main"))
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/runpy.py", line 263, in run_path pkg_name=pkg_name, script_name=fname)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/runpy.py", line 96, in _run_module_code mod_name, mod_spec, pkg_name, script_name)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/runpy.py", line 85, in _run_code exec(code, run_globals)
  File "hfnet/export_predictions.py", line 59, in config['model']) as net:
  File "/home/brian/Desktop/hfnet/hfnet/models/base_model.py", line 125, in init self._build_graph()
  File "/home/brian/Desktop/hfnet/hfnet/models/base_model.py", line 276, in _build_graph self._pred_graph(self.pred_in)
  File "/home/brian/Desktop/hfnet/hfnet/models/base_model.py", line 231, in _pred_graph pred_out = self._model(data, Mode.PRED, self.config)
  File "/home/brian/Desktop/hfnet/hfnet/models/netvlad_original.py", line 78, in _model 'average_rgb', 3, dtype=image_batch.dtype)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1496, in get_variable aggregation=aggregation)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1239, in get_variable aggregation=aggregation)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 562, in get_variable aggregation=aggregation)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 514, in _true_getter aggregation=aggregation)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 929, in _get_single_variable aggregation=aggregation)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 259, in call return cls._variable_v1_call(*args, kwargs)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 220, in _variable_v1_call shape=shape)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 198, in previous_getter = lambda kwargs: default_variable_creator(None, kwargs)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 2511, in default_variable_creator shape=shape)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 263, in call return super(VariableMetaclass, cls).call(*args, *kwargs)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 1568, in init shape=shape)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 1698, in _init_from_args initial_value(), name="initial_value", dtype=dtype)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 901, in partition_info=partition_info)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/init_ops.py", line 533, in call shape, -limit, limit, dtype, seed=self.seed)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/random_ops.py", line 247, in random_uniform rnd = gen_random_ops.random_uniform(shape, dtype, seed=seed1, seed2=seed2)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/ops/gen_random_ops.py", line 820, in random_uniform name=name)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper op_def=op_def)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func return func(args, kwargs)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3616, in create_op op_def=op_def)
  File "/home/brian/miniconda3/envs/hfnet/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2005, in init self._traceback = tf_stack.extract_stack()

sarlinpe commented 3 years ago

Hi, and sorry for the late reply. This is an old codebase and I am not providing much support for it. Which version of TensorFlow are you using? As far as I know, this codebase only supports v1.12. It's unclear from the log whether TensorFlow was installed with GPU support, which it should be. Could you check that in the earlier logs?
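
Both questions can be checked from the environment that produced the error. The "All available devices" field in the error lists only CPU:0 and XLA_CPU:0, which suggests that this TensorFlow process did not register a GPU. Below is a minimal sketch, not part of the hfnet codebase. The helper names `devices_in_error` and `describe_tensorflow` are hypothetical, and the regular expression assumes the message format shown in this log. `tf.test.is_built_with_cuda()` and `device_lib.list_local_devices()` are existing TensorFlow calls.

```python
import re


def devices_in_error(error_text):
    """Return the device names listed after 'All available devices' in an error message."""
    match = re.search(r"All available devices \[([^\]]*)\]", error_text)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


def has_gpu(devices):
    """True if any device name ends in a GPU (or XLA_GPU) device type."""
    return any(re.search(r"/device:(XLA_)?GPU:\d+$", name) for name in devices)


def describe_tensorflow():
    """Report the installed TensorFlow version, CUDA build flag, and visible devices."""
    try:
        import tensorflow as tf
        from tensorflow.python.client import device_lib
    except ImportError:
        # TensorFlow is not importable from this interpreter.
        return {"installed": False}
    return {
        "installed": True,
        "version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "devices": [d.name for d in device_lib.list_local_devices()],
    }


# Excerpt copied from the error message in this issue.
log_excerpt = (
    "All available devices [/job:localhost/replica:0/task:0/device:CPU:0, "
    "/job:localhost/replica:0/task:0/device:XLA_CPU:0]."
)
devices = devices_in_error(log_excerpt)
print("Devices in error message:", devices)
print("GPU listed in error message:", has_gpu(devices))
print("TensorFlow in this environment:", describe_tensorflow())
```

If `built_with_cuda` is reported as false, the installed package is likely a CPU-only build. If it is true but no GPU device is listed, a mismatch between the installed CUDA/cuDNN libraries and the ones TensorFlow expects is a common cause worth ruling out.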