rpautrat / SuperPoint

Efficient neural feature detector and descriptor
MIT License

Stuck at Step1 after iteration 0 #254

Closed Mmmelvil closed 2 years ago

Mmmelvil commented 2 years ago

Hi,

I am trying to run the algorithm on Colab. After all the shapes for the synthetic shapes training have been generated, the training always gets stuck after iteration 0, and it seems that a ^C automatically stops the training?

[05/23/2022 15:52:28 INFO] NumExpr defaulting to 2 threads.
[05/23/2022 15:52:28 INFO] Running command TRAIN
[05/23/2022 15:52:28 WARNING] From experiment.py:57: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.

[05/23/2022 15:52:28 INFO] Number of GPUs detected: 1
[05/23/2022 15:52:31 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/datasets/synthetic_shapes.py:73: The name tf.logging.info is deprecated. Please use tf.compat.v1.logging.info instead.

[05/23/2022 15:52:31 INFO] Generating tarfile for primitive draw_lines. [05/23/2022 15:56:45 INFO] Tarfile dumped to /content/drive/MyDrive/SuperPoint/superpoint/data/synthetic_shapes_v6/draw_lines.tar.gz. [05/23/2022 15:56:45 INFO] Extracting archive for primitive draw_lines. [05/23/2022 15:56:49 INFO] Generating tarfile for primitive draw_polygon. [05/23/2022 16:01:02 INFO] Tarfile dumped to /content/drive/MyDrive/SuperPoint/superpoint/data/synthetic_shapes_v6/draw_polygon.tar.gz. [05/23/2022 16:01:02 INFO] Extracting archive for primitive draw_polygon. [05/23/2022 16:01:07 INFO] Generating tarfile for primitive draw_multiple_polygons. [05/23/2022 16:49:22 INFO] Tarfile dumped to /content/drive/MyDrive/SuperPoint/superpoint/data/synthetic_shapes_v6/draw_multiple_polygons.tar.gz. [05/23/2022 16:49:22 INFO] Extracting archive for primitive draw_multiple_polygons. [05/23/2022 16:49:27 INFO] Generating tarfile for primitive draw_ellipses. [05/23/2022 16:53:48 INFO] Tarfile dumped to /content/drive/MyDrive/SuperPoint/superpoint/data/synthetic_shapes_v6/draw_ellipses.tar.gz. [05/23/2022 16:53:48 INFO] Extracting archive for primitive draw_ellipses. [05/23/2022 16:53:53 INFO] Generating tarfile for primitive draw_star. [05/23/2022 16:58:07 INFO] Tarfile dumped to /content/drive/MyDrive/SuperPoint/superpoint/data/synthetic_shapes_v6/draw_star.tar.gz. [05/23/2022 16:58:07 INFO] Extracting archive for primitive draw_star. [05/23/2022 16:58:12 INFO] Generating tarfile for primitive draw_checkerboard. [05/23/2022 17:02:40 INFO] Tarfile dumped to /content/drive/MyDrive/SuperPoint/superpoint/data/synthetic_shapes_v6/draw_checkerboard.tar.gz. [05/23/2022 17:02:40 INFO] Extracting archive for primitive draw_checkerboard. [05/23/2022 17:02:45 INFO] Generating tarfile for primitive draw_stripes. [05/23/2022 17:07:04 INFO] Tarfile dumped to /content/drive/MyDrive/SuperPoint/superpoint/data/synthetic_shapes_v6/draw_stripes.tar.gz. 
[05/23/2022 17:07:04 INFO] Extracting archive for primitive draw_stripes. [05/23/2022 17:07:09 INFO] Generating tarfile for primitive draw_cube. [05/23/2022 17:11:27 INFO] Tarfile dumped to /content/drive/MyDrive/SuperPoint/superpoint/data/synthetic_shapes_v6/draw_cube.tar.gz. [05/23/2022 17:11:27 INFO] Extracting archive for primitive draw_cube. [05/23/2022 17:11:31 INFO] Generating tarfile for primitive gaussian_noise. [05/23/2022 17:16:17 INFO] Tarfile dumped to /content/drive/MyDrive/SuperPoint/superpoint/data/synthetic_shapes_v6/gaussian_noise.tar.gz. [05/23/2022 17:16:17 INFO] Extracting archive for primitive gaussian_noise. [05/23/2022 17:16:23 WARNING] Entity <function SyntheticShapes._get_data.. at 0x7f11a5c8da70> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: Failed to parse source code of <function SyntheticShapes._get_data.. at 0x7f11a5c8da70>, which Python reported as: lambda image, points:

If this is a lambda function, the error may be avoided by creating the lambda in a standalone statement. [05/23/2022 17:16:23 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/datasets/synthetic_shapes.py:167: The name tf.read_file is deprecated. Please use tf.io.read_file instead.

[05/23/2022 17:16:23 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/datasets/synthetic_shapes.py:189: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version. Instructions for updating: tf.py_func is deprecated in TF V2. Instead, there are two options available in V2.

[05/23/2022 17:16:23 WARNING] Entity <function SyntheticShapes._get_data.. at 0x7f11a227d290> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Str' [05/23/2022 17:16:23 WARNING] Entity <function SyntheticShapes._get_data.. at 0x7f11a2a85560> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Str' [05/23/2022 17:16:23 WARNING] Entity <function add_dummy_valid_mask at 0x7f11a5c7aa70> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Index' [05/23/2022 17:16:23 INFO] Caching data, fist access will take some time. [05/23/2022 17:16:23 WARNING] Entity <function SyntheticShapes._get_data.. at 0x7f11a2a85710> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Index' [05/23/2022 17:16:23 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/datasets/utils/pipeline.py:25: The name tf.random_shuffle is deprecated. Please use tf.random.shuffle instead.

[05/23/2022 17:16:23 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/datasets/utils/photometric_augmentation.py:24: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

[05/23/2022 17:16:23 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/datasets/utils/photometric_augmentation.py:26: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.where in 2.0, which has the same broadcast rule as np.where [05/23/2022 17:16:23 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/datasets/utils/photometric_augmentation.py:18: The name tf.random_normal is deprecated. Please use tf.random.normal instead.

2022-05-23 17:16:23.465628: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1 2022-05-23 17:16:23.527108: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2022-05-23 17:16:23.527923: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties: name: Tesla P100-PCIE-16GB major: 6 minor: 0 memoryClockRate(GHz): 1.3285 pciBusID: 0000:00:04.0 2022-05-23 17:16:23.538259: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1 2022-05-23 17:16:23.721081: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10 2022-05-23 17:16:23.750506: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10 2022-05-23 17:16:23.773563: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10 2022-05-23 17:16:23.968292: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10 2022-05-23 17:16:23.984812: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10 2022-05-23 17:16:24.299461: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7 2022-05-23 17:16:24.299766: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2022-05-23 17:16:24.300596: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, 
so returning NUMA node zero 2022-05-23 17:16:24.301271: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0 [05/23/2022 17:16:24 WARNING] Entity <function SyntheticShapes._get_data.. at 0x7f11a057b320> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Index' [05/23/2022 17:16:24 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/homographies.py:164: The name tf.truncated_normal is deprecated. Please use tf.random.truncated_normal instead.

[05/23/2022 17:16:24 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/homographies.py:201: The name tf.lin_space is deprecated. Please use tf.linspace instead.

[05/23/2022 17:16:24 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/homographies.py:218: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.cast instead. [05/23/2022 17:16:24 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/homographies.py:229: The name tf.matrix_solve_ls is deprecated. Please use tf.linalg.lstsq instead.

[05/23/2022 17:16:24 WARNING] The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see:

[05/23/2022 17:16:24 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/homographies.py:277: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.cast instead. [05/23/2022 17:16:24 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/homographies.py:237: The name tf.matrix_inverse is deprecated. Please use tf.linalg.inv instead.

[05/23/2022 17:16:24 WARNING] Entity <function add_keypoint_map at 0x7f11a5c7ad40> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Index' [05/23/2022 17:16:24 WARNING] Entity <function SyntheticShapes._get_data.. at 0x7f11a13e6290> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Index' [05/23/2022 17:16:24 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/datasets/base_dataset.py:109: DatasetV1.make_one_shot_iterator (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version. Instructions for updating: Use for ... in dataset: to iterate over a dataset. If using tf.estimator, return the Dataset object directly from your input function. As a last resort, you can use tf.compat.v1.data.make_one_shot_iterator(dataset). [05/23/2022 17:16:24 WARNING] Entity <function SyntheticShapes._get_data.. at 0x7f11a13e6ef0> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: Failed to parse source code of <function SyntheticShapes._get_data.. at 0x7f11a13e6ef0>, which Python reported as: lambda image, points:

If this is a lambda function, the error may be avoided by creating the lambda in a standalone statement. [05/23/2022 17:16:24 WARNING] Entity <function SyntheticShapes._get_data.. at 0x7f11a13e6b00> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Str' [05/23/2022 17:16:24 WARNING] Entity <function SyntheticShapes._get_data.. at 0x7f11a14de050> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Str' [05/23/2022 17:16:25 INFO] Caching data, fist access will take some time. [05/23/2022 17:16:25 WARNING] Entity <function SyntheticShapes._get_data.. at 0x7f11a14deef0> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Index' [05/23/2022 17:16:25 WARNING] Entity <function SyntheticShapes._get_data.. at 0x7f11a03ecd40> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: Failed to parse source code of <function SyntheticShapes._get_data.. at 0x7f11a03ecd40>, which Python reported as: lambda image, points:

If this is a lambda function, the error may be avoided by creating the lambda in a standalone statement. [05/23/2022 17:16:25 WARNING] Entity <function SyntheticShapes._get_data.. at 0x7f11a03ec7a0> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Str' [05/23/2022 17:16:25 WARNING] Entity <function SyntheticShapes._get_data.. at 0x7f11a03eccb0> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Str' [05/23/2022 17:16:25 INFO] Caching data, fist access will take some time. [05/23/2022 17:16:25 WARNING] Entity <function SyntheticShapes._get_data.. at 0x7f11a24cbb90> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Index' [05/23/2022 17:16:25 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/datasets/base_dataset.py:111: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2022-05-23 17:16:25.169508: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2199995000 Hz 2022-05-23 17:16:25.170059: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x68ac540 initialized for platform Host (this does not guarantee that XLA will be used). Devices: 2022-05-23 17:16:25.170098: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version 2022-05-23 17:16:25.455001: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2022-05-23 17:16:25.456056: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x68aea00 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices: 2022-05-23 17:16:25.456101: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Tesla P100-PCIE-16GB, Compute Capability 6.0 2022-05-23 17:16:25.456630: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2022-05-23 17:16:25.457372: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties: name: Tesla P100-PCIE-16GB major: 6 minor: 0 memoryClockRate(GHz): 1.3285 pciBusID: 0000:00:04.0 2022-05-23 17:16:25.457510: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1 2022-05-23 17:16:25.457538: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10 2022-05-23 17:16:25.457553: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10 2022-05-23 17:16:25.457564: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10 
2022-05-23 17:16:25.457575: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10 2022-05-23 17:16:25.457588: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10 2022-05-23 17:16:25.457601: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7 2022-05-23 17:16:25.457717: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2022-05-23 17:16:25.458479: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2022-05-23 17:16:25.459251: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0 2022-05-23 17:16:25.465331: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1 2022-05-23 17:16:25.466970: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix: 2022-05-23 17:16:25.467003: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186] 0 2022-05-23 17:16:25.467014: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0: N 2022-05-23 17:16:25.467485: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2022-05-23 17:16:25.468248: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2022-05-23 17:16:25.468898: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] 
Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0. 2022-05-23 17:16:25.469001: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15224 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0) [05/23/2022 17:16:25 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:105: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

[05/23/2022 17:16:25 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:121: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

[05/23/2022 17:16:25 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:121: The name tf.AUTO_REUSE is deprecated. Please use tf.compat.v1.AUTO_REUSE instead.

[05/23/2022 17:16:25 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:232: DatasetV1.output_shapes (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.compat.v1.data.get_output_shapes(dataset). [05/23/2022 17:16:25 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:242: DatasetV1.output_types (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.compat.v1.data.get_output_types(dataset). [05/23/2022 17:16:25 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:241: DatasetV1.make_initializable_iterator (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version. Instructions for updating: Use for ... in dataset: to iterate over a dataset. If using tf.estimator, return the Dataset object directly from your input function. As a last resort, you can use tf.compat.v1.data.make_initializable_iterator(dataset). [05/23/2022 17:16:25 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:257: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

[05/23/2022 17:16:25 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:258: The name tf.data.Iterator is deprecated. Please use tf.compat.v1.data.Iterator instead.

[05/23/2022 17:16:25 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:153: The name tf.train.replica_device_setter is deprecated. Please use tf.compat.v1.train.replica_device_setter instead.

[05/23/2022 17:16:25 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:25 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/backbones/vgg.py:10: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras.layers.Conv2D instead. [05/23/2022 17:16:25 WARNING] From /tensorflow-1.15.2/python3.7/tensorflow_core/python/layers/convolutional.py:424: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version. Instructions for updating: Please use layer.__call__ method instead. [05/23/2022 17:16:25 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/backbones/vgg.py:14: batch_normalization (from tensorflow.python.layers.normalization) is deprecated and will be removed in a future version. Instructions for updating: Use keras.layers.BatchNormalization instead. In particular, tf.control_dependencies(tf.GraphKeys.UPDATE_OPS) should not be used (consult the tf.keras.layers.batch_normalization documentation). [05/23/2022 17:16:25 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:25 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/backbones/vgg.py:28: max_pooling2d (from tensorflow.python.layers.pooling) is deprecated and will be removed in a future version. Instructions for updating: Use keras.layers.MaxPooling2D instead. [05/23/2022 17:16:25 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:25 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:25 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:25 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:25 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:26 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:26 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:26 INFO] Scale of 0 disables regularizer. 
[05/23/2022 17:16:26 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/utils.py:24: The name tf.depth_to_space is deprecated. Please use tf.compat.v1.depth_to_space instead.

[05/23/2022 17:16:26 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/utils.py:57: The name tf.space_to_depth is deprecated. Please use tf.compat.v1.space_to_depth instead.

[05/23/2022 17:16:26 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/utils.py:70: The name tf.losses.sparse_softmax_cross_entropy is deprecated. Please use tf.compat.v1.losses.sparse_softmax_cross_entropy instead.

[05/23/2022 17:16:26 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:161: The name tf.get_collection is deprecated. Please use tf.compat.v1.get_collection instead.

[05/23/2022 17:16:26 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:161: The name tf.GraphKeys is deprecated. Please use tf.compat.v1.GraphKeys instead.

[05/23/2022 17:16:26 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:163: The name tf.trainable_variables is deprecated. Please use tf.compat.v1.trainable_variables instead.

[05/23/2022 17:16:26 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:206: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.

[05/23/2022 17:16:26 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:210: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.

[05/23/2022 17:16:26 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:26 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:26 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:266: The name tf.summary.merge_all is deprecated. Please use tf.compat.v1.summary.merge_all instead.

[05/23/2022 17:16:27 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 INFO] Scale of 0 disables regularizer. [05/23/2022 17:16:27 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:276: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

2022-05-23 17:16:27.306966: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2022-05-23 17:16:27.307745: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties: name: Tesla P100-PCIE-16GB major: 6 minor: 0 memoryClockRate(GHz): 1.3285 pciBusID: 0000:00:04.0 2022-05-23 17:16:27.307900: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1 2022-05-23 17:16:27.307965: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10 2022-05-23 17:16:27.308004: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10 2022-05-23 17:16:27.308030: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10 2022-05-23 17:16:27.308053: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10 2022-05-23 17:16:27.308077: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10 2022-05-23 17:16:27.308106: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7 2022-05-23 17:16:27.308230: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2022-05-23 17:16:27.308964: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2022-05-23 17:16:27.309577: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0 
2022-05-23 17:16:27.309639: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix: 2022-05-23 17:16:27.309662: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186] 0 2022-05-23 17:16:27.309672: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0: N 2022-05-23 17:16:27.309802: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2022-05-23 17:16:27.310457: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2022-05-23 17:16:27.311104: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15224 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0) [05/23/2022 17:16:29 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:287: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.

[05/23/2022 17:16:29 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:288: The name tf.local_variables_initializer is deprecated. Please use tf.compat.v1.local_variables_initializer instead.

[05/23/2022 17:16:29 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:295: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.

[05/23/2022 17:16:29 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:298: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.

[05/23/2022 17:16:30 INFO] Start training
2022-05-23 17:16:34.360102: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2022-05-23 17:16:37.381063: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
[05/23/2022 17:17:43 INFO] Iter 0: loss 4.6654, precision 0.0005, recall 0.0520
[05/23/2022 17:17:43 WARNING] From /content/drive/MyDrive/SuperPoint/superpoint/models/base_model.py:325: The name tf.Summary is deprecated. Please use tf.compat.v1.Summary instead.

^C

Has anyone else run into this problem, and how can I solve it? I would really appreciate any help!

Mmmelvil commented 2 years ago

I switched the TensorFlow version to 1.15.2, and I am currently assigned a Tesla P100-PCIE-16GB GPU on Colab.

rpautrat commented 2 years ago

Hi, I have never tried to run it on Colab, so I don't know what is happening there. But maybe the script is creating files that are too large for Colab, or the memory limit is exceeded and the program is killed.
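One way to test the memory or disk hypothesis is to look at how much RAM and disk space the machine has left just before training starts. The following is only a minimal sketch using the Python standard library on Linux; `resource_snapshot` is a hypothetical helper name that is not part of the SuperPoint code, and it cannot by itself confirm that the process was killed for lack of memory.

```python
import os
import shutil


def resource_snapshot(path="/"):
    """Return total and free RAM and disk space in GiB (Linux only).

    `path` selects the filesystem to inspect; "/" is an assumption and
    should be replaced by the directory that holds the dataset.
    """
    gib = 1024 ** 3
    page_size = os.sysconf("SC_PAGE_SIZE")
    # Physical memory is reported in pages by the OS.
    ram_total = page_size * os.sysconf("SC_PHYS_PAGES") / gib
    ram_free = page_size * os.sysconf("SC_AVPHYS_PAGES") / gib
    disk = shutil.disk_usage(path)
    return {
        "ram_total_gib": ram_total,
        "ram_free_gib": ram_free,
        "disk_total_gib": disk.total / gib,
        "disk_free_gib": disk.free / gib,
    }
```

Calling `print(resource_snapshot())` before and after the dataset is generated shows whether free RAM or free disk space is close to zero when training begins.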

Mmmelvil commented 2 years ago

Hi @rpautrat ,

This might be an issue with Colab. I am now using my own GPU, and it seems to be working. But I am getting pretty low precision and recall after modifying the config file (batch-size = 1, eval-batch-size = 1). Could that be the reason? Is there any way to improve them?

$ python experiment.py train configs/magic-point_shapes.yaml magic-point_synth
/home/jinchenzeng/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/jinchenzeng/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:524: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/jinchenzeng/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/jinchenzeng/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/jinchenzeng/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/jinchenzeng/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:532: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
[05/24/2022 01:48:35 INFO] Running command TRAIN
[05/24/2022 01:48:35 INFO] Number of GPUs detected: 1
[05/24/2022 01:48:38 INFO] Extracting archive for primitive draw_lines.
[05/24/2022 01:48:43 INFO] Extracting archive for primitive draw_polygon.
[05/24/2022 01:48:48 INFO] Extracting archive for primitive draw_multiple_polygons.
[05/24/2022 01:48:53 INFO] Extracting archive for primitive draw_ellipses.
[05/24/2022 01:48:58 INFO] Extracting archive for primitive draw_star.
[05/24/2022 01:49:04 INFO] Extracting archive for primitive draw_checkerboard.
[05/24/2022 01:49:09 INFO] Extracting archive for primitive draw_stripes.
[05/24/2022 01:49:13 INFO] Extracting archive for primitive draw_cube.
[05/24/2022 01:49:18 INFO] Extracting archive for primitive gaussian_noise.
[05/24/2022 01:49:25 INFO] Caching data, fist access will take some time.
[05/24/2022 01:49:28 INFO] Caching data, fist access will take some time.
[05/24/2022 01:49:28 INFO] Caching data, fist access will take some time.
2022-05-24 01:49:28.383391: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA 2022-05-24 01:49:28.497675: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero 2022-05-24 01:49:28.498084: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties: name: Quadro P2000 major: 6 minor: 1 memoryClockRate(GHz): 1.607 pciBusID: 0000:01:00.0 totalMemory: 3.95GiB freeMemory: 3.56GiB 2022-05-24 01:49:28.498106: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0 2022-05-24 01:49:28.830833: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix: 2022-05-24 01:49:28.830870: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0 2022-05-24 01:49:28.830878: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N 2022-05-24 01:49:28.830966: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0. 2022-05-24 01:49:28.831010: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3286 MB memory) -> physical GPU (device: 0, name: Quadro P2000, pci bus id: 0000:01:00.0, compute capability: 6.1) [05/24/2022 01:49:30 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:30 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:30 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:30 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:30 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:30 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:30 INFO] Scale of 0 disables regularizer. 
[05/24/2022 01:49:30 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:30 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:30 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. [05/24/2022 01:49:32 INFO] Scale of 0 disables regularizer. 
2022-05-24 01:49:33.100208: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2022-05-24 01:49:33.100253: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-05-24 01:49:33.100262: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2022-05-24 01:49:33.100269: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2022-05-24 01:49:33.100346: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3286 MB memory) -> physical GPU (device: 0, name: Quadro P2000, pci bus id: 0000:01:00.0, compute capability: 6.1)
[05/24/2022 01:49:37 INFO] Start training
[05/24/2022 01:51:04 INFO] Iter 0: loss 4.7076, precision 0.0006, recall 0.0532
[05/24/2022 01:52:14 INFO] Iter 1000: loss 1.0714, precision 0.0006, recall 0.0533
[05/24/2022 01:53:36 INFO] Iter 2000: loss 0.3559, precision 0.0006, recall 0.0570
[05/24/2022 01:54:47 INFO] Iter 3000: loss 0.1404, precision 0.0006, recall 0.0606
[05/24/2022 01:55:08 INFO] Iter 4000: loss 0.8685, precision 0.0034, recall 0.0920
[05/24/2022 01:55:25 INFO] Iter 5000: loss 0.1155, precision 0.0118, recall 0.0955
[05/24/2022 01:55:43 INFO] Iter 6000: loss 0.0795, precision 0.0132, recall 0.0851
[05/24/2022 01:56:01 INFO] Iter 7000: loss 0.5342, precision 0.0159, recall 0.1197
[05/24/2022 01:56:19 INFO] Iter 8000: loss 0.1786, precision 0.0176, recall 0.1552
[05/24/2022 01:56:36 INFO] Iter 9000: loss 0.0585, precision 0.0223, recall 0.1415
[05/24/2022 01:56:54 INFO] Iter 10000: loss 0.1004, precision 0.0261, recall 0.1487
[05/24/2022 01:57:11 INFO] Iter 11000: loss 0.1894, precision 0.0421, recall 0.1652
[05/24/2022 01:57:28 INFO] Iter 12000: loss 0.2891, precision 0.0439, recall 0.1507
[05/24/2022 01:57:46 INFO] Iter 13000: loss 0.2029, precision 0.0492, recall 0.2077
[05/24/2022 01:58:05 INFO] Iter 14000: loss 0.5822, precision 0.0202, recall 0.1601
[05/24/2022 01:58:23 INFO] Iter 15000: loss 0.0506, precision 0.0555, recall 0.1931
[05/24/2022 01:58:40 INFO] Iter 16000: loss 0.1161, precision 0.0896, recall 0.2083
[05/24/2022 01:58:56 INFO] Iter 17000: loss 0.0262, precision 0.0782, recall 0.2221
[05/24/2022 01:59:13 INFO] Iter 18000: loss 0.1176, precision 0.0779, recall 0.2117
[05/24/2022 01:59:30 INFO] Iter 19000: loss 0.0941, precision 0.0730, recall 0.2267
[05/24/2022 01:59:46 INFO] Iter 20000: loss 0.0941, precision 0.0558, recall 0.2239
[05/24/2022 02:00:03 INFO] Iter 21000: loss 0.5136, precision 0.0953, recall 0.2394
[05/24/2022 02:00:20 INFO] Iter 22000: loss 0.2952, precision 0.1046, recall 0.2552
[05/24/2022 02:00:36 INFO] Iter 23000: loss 0.2515, precision 0.1060, recall 0.2638
[05/24/2022 02:00:53 INFO] Iter 24000: loss 0.3637, precision 0.1274, recall 0.2692
[05/24/2022 02:01:10 INFO] Iter 25000: loss 0.3768, precision 0.1381, recall 0.2985
[05/24/2022 02:01:29 INFO] Iter 26000: loss 0.6720, precision 0.0728, recall 0.2584
[05/24/2022 02:01:46 INFO] Iter 27000: loss 0.0576, precision 0.1190, recall 0.2662
[05/24/2022 02:02:04 INFO] Iter 28000: loss 0.1419, precision 0.1407, recall 0.2573
[05/24/2022 02:02:22 INFO] Iter 29000: loss 0.0801, precision 0.0752, recall 0.2568
[05/24/2022 02:02:40 INFO] Iter 30000: loss 0.0655, precision 0.1564, recall 0.3023
[05/24/2022 02:02:59 INFO] Iter 31000: loss 0.0353, precision 0.0699, recall 0.2785
[05/24/2022 02:03:16 INFO] Iter 32000: loss 0.0476, precision 0.1344, recall 0.3260
[05/24/2022 02:03:34 INFO] Iter 33000: loss 0.0842, precision 0.1703, recall 0.3221
[05/24/2022 02:03:51 INFO] Iter 34000: loss 0.0911, precision 0.1437, recall 0.3289
[05/24/2022 02:04:09 INFO] Iter 35000: loss 0.5261, precision 0.0814, recall 0.2852
[05/24/2022 02:04:27 INFO] Iter 36000: loss 0.0518, precision 0.1339, recall 0.2789
[05/24/2022 02:04:45 INFO] Iter 37000: loss 0.1285, precision 0.1529, recall 0.3287
[05/24/2022 02:05:03 INFO] Iter 38000: loss 0.0792, precision 0.1675, recall 0.3521
[05/24/2022 02:05:20 INFO] Iter 39000: loss 0.3377, precision 0.1538, recall 0.3066
[05/24/2022 02:05:38 INFO] Iter 40000: loss 0.1101, precision 0.1340, recall 0.3019
[05/24/2022 02:05:56 INFO] Iter 41000: loss 0.0944, precision 0.1776, recall 0.3211
[05/24/2022 02:06:13 INFO] Iter 42000: loss 0.1334, precision 0.1576, recall 0.3009
[05/24/2022 02:06:31 INFO] Iter 43000: loss 0.0161, precision 0.1378, recall 0.2979
[05/24/2022 02:06:48 INFO] Iter 44000: loss 0.0781, precision 0.1978, recall 0.3627
[05/24/2022 02:07:06 INFO] Iter 45000: loss 0.1168, precision 0.1409, recall 0.2961
[05/24/2022 02:07:32 INFO] Iter 46000: loss 0.0655, precision 0.0222, recall 0.2834
[05/24/2022 02:07:50 INFO] Iter 47000: loss 0.1208, precision 0.0880, recall 0.3119
[05/24/2022 02:08:07 INFO] Iter 48000: loss 0.1304, precision 0.1535, recall 0.3557
[05/24/2022 02:08:24 INFO] Iter 49000: loss 0.0346, precision 0.2210, recall 0.3613
[05/24/2022 02:08:39 INFO] Training finished
[05/24/2022 02:08:39 INFO] Saving checkpoint for iteration #50000
2022-05-24 02:08:39.605688: W tensorflow/core/kernels/data/cache_dataset_ops.cc:770] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the datasetwill be discarded. This can happen if you have an input pipeline similar to dataset.cache().take(k).repeat(). You should use dataset.take(k).cache().repeat() instead.
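To compare runs like the one above, it can help to pull the per-iteration metrics out of the log rather than reading them by eye. This is a rough sketch that assumes the `Iter N: loss X, precision Y, recall Z` line format shown in this log; `parse_metrics` and `SAMPLE` are names made up for illustration and are not part of the SuperPoint code.

```python
import re

# Matches entries such as "Iter 1000: loss 1.0714, precision 0.0006, recall 0.0533".
# \s+ tolerates line breaks that land in the middle of an entry.
ITER_RE = re.compile(
    r"Iter\s+(\d+):\s+loss\s+(\d+\.\d+),\s+precision\s+(\d+\.\d+),\s+recall\s+(\d+\.\d+)"
)


def parse_metrics(log_text):
    """Return a list of (iteration, loss, precision, recall) tuples."""
    return [
        (int(it), float(loss), float(prec), float(rec))
        for it, loss, prec, rec in ITER_RE.findall(log_text)
    ]


# Two entries copied from the log above, used as a small example.
SAMPLE = (
    "[05/24/2022 01:51:04 INFO] Iter 0: loss 4.7076, precision 0.0006, recall 0.0532 "
    "[05/24/2022 02:08:24 INFO] Iter 49000: loss 0.0346, precision 0.2210, recall 0.3613"
)

for iteration, loss, precision, recall in parse_metrics(SAMPLE):
    print(iteration, loss, precision, recall)
```

Applied to the full log, the resulting tuples make it easier to see that precision and recall are still trending upward at iteration 49000, although they fluctuate from one evaluation to the next.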

Mmmelvil commented 2 years ago

Hi Remi @rpautrat,

I changed some configuration on Colab: I first switched Python to 3.6 and then used TensorFlow 1.14, and the code now runs successfully. I now get around 0.4 precision with the synthetic shapes training. I am closing the issue now. Thanks a lot!
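For anyone who wants to check their environment against this setup before starting a long run, here is a small sketch. It only encodes the versions reported in this comment (Python 3.6, TensorFlow 1.14); `environment_matches` is a hypothetical helper, and other version combinations may well work too.

```python
# Versions reported as working in this thread; treating them as the only
# valid combination is an assumption made for this example.
EXPECTED_PYTHON = (3, 6)
EXPECTED_TF = ("1", "14")


def environment_matches(python_version, tf_version):
    """Return True if the major.minor versions match the reported setup.

    python_version: a tuple such as sys.version_info
    tf_version: a string such as tf.__version__
    """
    python_ok = tuple(python_version[:2]) == EXPECTED_PYTHON
    tf_ok = tuple(tf_version.split(".")[:2]) == EXPECTED_TF
    return python_ok and tf_ok
```

In a notebook this would be called as `environment_matches(sys.version_info, tf.__version__)` after `import sys` and `import tensorflow as tf`.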

rpautrat commented 2 years ago

Great to hear, and thanks for reporting the working configuration!