yangxue0827 / RotationDetection

This is a tensorflow-based rotation detection benchmark, also called AlphaRotate.
https://rotationdetection.readthedocs.io/
Apache License 2.0

How to train and test retinanet-gwd on the HRSC2016 dataset? #42

Closed myGithubSiki closed 3 years ago

myGithubSiki commented 3 years ago

1. I downloaded the trained models provided by this project and put them in $PATH_ROOT/output/pretained_weights. The pretrained weights are resnet_v1d.

2. I compiled the extensions:

    cd $PATH_ROOT/libs/utils/cython_utils
    rm *.so
    rm *.c
    rm *.cpp
    python setup.py build_ext --inplace  # or: make

    cd $PATH_ROOT/libs/utils/
    rm *.so
    rm *.c
    rm *.cpp
    python setup.py build_ext --inplace


3. I copied $PATH_ROOT/libs/configs/HRSC2016/gwd/cfgs_res50_hrsc2016_gwd_v6.py to $PATH_ROOT/libs/configs/cfgs.py.
4. The directory structure of the HRSC2016 dataset:
![image](https://user-images.githubusercontent.com/83753304/132119326-fb1a6626-9a1c-44b7-bb4b-defb66884050.png)

5. When I run python tools/gwd/train.py, I get the following errors:

2021-09-05 07:49:28.459600: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at matching_files_op.cc:49 : Not found: ../../dataloader/tfrecord; No such file or directory
2021-09-05 07:49:28.534617: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at matching_files_op.cc:49 : Not found: ../../dataloader/tfrecord; No such file or directory
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: ../../dataloader/tfrecord; No such file or directory
     [[{{node get_batch/matching_filenames/MatchingFiles}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 160, in <module>
    trainer.main()
  File "train.py", line 155, in main
    self.log_printer(gwd, optimizer, global_step, tower_grads, total_loss_dict, num_gpu, graph)
  File "../../tools/train_base.py", line 196, in log_printer
    sess.run(init_op)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: ../../dataloader/tfrecord; No such file or directory
     [[node get_batch/matching_filenames/MatchingFiles (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]

Original stack trace for 'get_batch/matching_filenames/MatchingFiles':
  File "train.py", line 160, in <module>
    trainer.main()
  File "train.py", line 53, in main
    is_training=True)
  File "../../dataloader/dataset/read_tfrecord.py", line 115, in next_batch
    filename_tensorlist = tf.train.match_filenames_once(pattern)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/input.py", line 76, in match_filenames_once
    name=name, initial_value=io_ops.matching_files(pattern),
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_io_ops.py", line 464, in matching_files
    "MatchingFiles", pattern=pattern, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 513, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

Can you help me solve this problem? I look forward to your reply.

yangxue0827 commented 3 years ago

image
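
The NotFoundError above says that TensorFlow could not find the directory ../../dataloader/tfrecord while matching input files. That suggests either that the dataset has not yet been converted to TFRecord files, or that the script was started from a different working directory than the relative path expects. A minimal pre-flight check might look like the sketch below. The function name `find_tfrecords` and the `*.tfrecord` pattern are assumptions for illustration and are not part of the project.

```python
import glob
import os


def find_tfrecords(tfrecord_dir, pattern="*.tfrecord"):
    """Return the sorted files in tfrecord_dir that match pattern.

    Hypothetical helper: the default pattern is an assumption, so adjust
    it to whatever naming the conversion script actually produces.
    """
    if not os.path.isdir(tfrecord_dir):
        # Relative paths are resolved against the current working directory,
        # so report the absolute path that was actually checked.
        raise FileNotFoundError(
            "TFRecord directory not found: " + os.path.abspath(tfrecord_dir)
        )
    return sorted(glob.glob(os.path.join(tfrecord_dir, pattern)))
```

Calling `find_tfrecords("../../dataloader/tfrecord")` from the directory where train.py is launched either raises with the resolved absolute path or returns the list of matching files. An empty list would mean the directory exists but holds no files with that extension.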

myGithubSiki commented 3 years ago

OK, thanks! I now have it running on the HRSC2016 dataset. Thanks for your reply, it helped me a lot.

myGithubSiki commented 3 years ago

Hi, sorry to disturb you again. When I started training, I got some errors.


2021-09-07 07:24:17: global_step:13881 current_step:3880 speed: 0.658s, remaining training time: 00:21:13:08 total_losses:2.562 cls_loss:1.574 reg_loss:0.988

2021-09-07 07:24:22.736337: W tensorflow/core/kernels/queue_base.cc:277] _2_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2021-09-07 07:24:22.736385: W tensorflow/core/kernels/queue_base.cc:277] _2_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2021-09-07 07:24:22.736401: W tensorflow/core/kernels/queue_base.cc:277] _2_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2021-09-07 07:24:22.736484: W tensorflow/core/kernels/queue_base.cc:277] _2_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2021-09-07 07:24:22.736506: W tensorflow/core/kernels/queue_base.cc:277] _2_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2021-09-07 07:24:22.736524: W tensorflow/core/kernels/queue_base.cc:277] _2_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2021-09-07 07:24:22.736562: W tensorflow/core/kernels/queue_base.cc:277] _0_get_batch/input_producer: Skipping cancelled enqueue attempt with queue not closed
2021-09-07 07:24:22.736594: W tensorflow/core/kernels/queue_base.cc:277] _2_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2021-09-07 07:24:22.736615: W tensorflow/core/kernels/queue_base.cc:277] _2_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2021-09-07 07:24:22.736637: W tensorflow/core/kernels/queue_base.cc:277] _2_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2021-09-07 07:24:22.736655: W tensorflow/core/kernels/queue_base.cc:277] _2_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2021-09-07 07:24:22.736670: W tensorflow/core/kernels/queue_base.cc:277] _2_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2021-09-07 07:24:22.736684: W tensorflow/core/kernels/queue_base.cc:277] _2_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2021-09-07 07:24:22.736699: W tensorflow/core/kernels/queue_base.cc:277] _2_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2021-09-07 07:24:22.736714: W tensorflow/core/kernels/queue_base.cc:277] _2_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2021-09-07 07:24:22.736730: W tensorflow/core/kernels/queue_base.cc:277] _2_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2021-09-07 07:24:22.736762: W tensorflow/core/kernels/queue_base.cc:277] _2_get_batch/batch/padding_fifo_queue: Skipping cancelled enqueue attempt with queue not closed
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input matrix is not invertible.
     [[{{node tower_0/gradients/tower_0/build_loss/MatrixSquareRoot_grad/MatrixSolve}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 160, in <module>
    trainer.main()
  File "train.py", line 155, in main
    self.log_printer(gwd, optimizer, global_step, tower_grads, total_loss_dict, num_gpu, graph)
  File "../../tools/train_base.py", line 216, in log_printer
    _, global_stepnp = sess.run([train_op, global_step])
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input matrix is not invertible.
     [[node tower_0/gradients/tower_0/build_loss/MatrixSquareRoot_grad/MatrixSolve (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]

Original stack trace for 'tower_0/gradients/tower_0/build_loss/MatrixSquareRoot_grad/MatrixSolve':
  File "train.py", line 160, in <module>
    trainer.main()
  File "train.py", line 151, in main
    grads = optimizer.compute_gradients(total_losses)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/optimizer.py", line 537, in compute_gradients
    colocate_gradients_with_ops=colocate_gradients_with_ops)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gradients_impl.py", line 158, in gradients
    unconnected_gradients)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gradients_util.py", line 703, in _GradientsHelper
    lambda: grad_fn(op, out_grads))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gradients_util.py", line 362, in _MaybeCompile
    return grad_fn() # Exit early
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gradients_util.py", line 703, in <lambda>
    lambda: grad_fn(op, out_grads))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/linalg_grad.py", line 117, in _MatrixSquareRootGrad
    vec_dsqrtm = linalg_ops.matrix_solve(ksum, vec_da)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_linalg_ops.py", line 1672, in matrix_solve
    "MatrixSolve", matrix=matrix, rhs=rhs, adjoint=adjoint, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 513, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

...which was originally created as op 'tower_0/build_loss/MatrixSquareRoot', defined at:
  File "train.py", line 160, in <module>
    trainer.main()
  File "train.py", line 116, in main
    gpu_id=i)
  File "../../libs/models/detectors/gwd/build_whole_network.py", line 73, in build_whole_detection_network
    func=self.cfgs.GWD_FUNC)
  File "../../libs/models/losses/losses_gwd.py", line 91, in wasserstein_distance_loss
    wasserstein_distance_item2 = tf.reshape(self.wasserstein_distance_sigma(sigma1, sigma2), [-1, 1])
  File "../../libs/models/losses/losses_gwd.py", line 24, in wasserstein_distance_sigma
    tf.linalg.matmul(tf.linalg.matmul(sigma1, tf.linalg.matmul(sigma2, sigma2)), sigma1))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_linalg_ops.py", line 1888, in matrix_square_root
    "MatrixSquareRoot", input=input, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 513, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

I could not find an answer on Google. Could you give me some advice?

yangxue0827 commented 3 years ago

tensorflow.python.framework.errors_impl.InvalidArgumentError: Input matrix is not invertible.

Try adjusting parameters such as GWD_FUNC, GWD_TAU, and LR.
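
For context on why this error can appear at all: the traceback shows the failure inside the gradient of MatrixSquareRoot, where `matrix_solve(ksum, vec_da)` is called. That linear system is built from a Kronecker sum of the square-root matrix, so it becomes singular when the square root has a zero eigenvalue. The NumPy sketch below illustrates this for one plausible trigger, a box whose width or height collapses to zero. It assumes the usual GWD box-to-Gaussian mapping Sigma = R diag(w^2/4, h^2/4) R^T, all function names are hypothetical, and it does not model what GWD_FUNC, GWD_TAU, or LR do in the project.

```python
import numpy as np


def box_covariance(w, h, theta):
    """Covariance of the Gaussian assumed for a rotated box (w, h, theta)."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    return rot @ np.diag([w ** 2 / 4.0, h ** 2 / 4.0]) @ rot.T


def sqrtm_psd(mat):
    """Square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # remove tiny negative rounding errors
    return (vecs * np.sqrt(vals)) @ vecs.T


def kronecker_sum(mat):
    """Kronecker sum of mat with itself: kron(mat, I) + kron(I, mat)."""
    eye = np.eye(mat.shape[0])
    return np.kron(mat, eye) + np.kron(eye, mat)


def sqrt_grad_condition(w, h, theta=0.3):
    """Condition number of the system a sqrtm gradient would have to solve."""
    root = sqrtm_psd(box_covariance(w, h, theta))
    return np.linalg.cond(kronecker_sum(root))


if __name__ == "__main__":
    print("healthy box:   ", sqrt_grad_condition(40.0, 10.0))
    print("degenerate box:", sqrt_grad_condition(40.0, 0.0))
```

The healthy box gives a small condition number, while the degenerate one gives a huge or infinite value, which is the situation a solver reports as "Input matrix is not invertible". If predictions collapse this way during training, changing the loss form or lowering the learning rate are plausible ways to avoid it, in line with the suggestion above.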

myGithubSiki commented 3 years ago

OK, thanks, I will try that. Thanks again for your reply and help.