Joker316701882 / Salient-Object-Detection

This is a TensorFlow implementation of the CVPR 2017 paper "Deeply Supervised Salient Object Detection with Short Connections"

Unable to test on single image #1

Open Zumbalamambo opened 6 years ago

Zumbalamambo commented 6 years ago

I tried to run inference on a single image with python inference.py --rgb=Chris.jpg...

I'm getting the following error:

Traceback (most recent call last):
  File "/Users/anaconda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call
    return fn(*args)
  File "/Users/anaconda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1117, in _run_fn
    self._extend_graph()
  File "/Users/anaconda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1166, in _extend_graph
    self._session, graph_def.SerializeToString(), status)
  File "/Users/anaconda/lib/python3.6/contextlib.py", line 89, in __exit__
    next(self.gen)
  File "/Users/anaconda/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'MaxPoolWithArgmax' with these attrs.  Registered devices: [CPU], Registered kernels:
  <no registered kernels>

     [[Node: pool1 = MaxPoolWithArgmax[T=DT_FLOAT, Targmax=DT_INT64, ksize=[1, 2, 2, 1], padding="SAME", strides=[1, 2, 2, 1]](conv1_2)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "inference.py", line 65, in <module>
    main(parse_arguments(sys.argv[1:]))
  File "inference.py", line 22, in main
    saver.restore(sess,tf.train.latest_checkpoint('./salience_model'))
  File "/Users/anaconda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1548, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/Users/anaconda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 789, in run
    run_metadata_ptr)
  File "/Users/anaconda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 997, in _run
    feed_dict_string, options, run_metadata)
  File "/Users/anaconda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
    target_list, options, run_metadata)
  File "/Users/anaconda/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'MaxPoolWithArgmax' with these attrs.  Registered devices: [CPU], Registered kernels:
  <no registered kernels>

     [[Node: pool1 = MaxPoolWithArgmax[T=DT_FLOAT, Targmax=DT_INT64, ksize=[1, 2, 2, 1], padding="SAME", strides=[1, 2, 2, 1]](conv1_2)]]

Caused by op 'pool1', defined at:
  File "inference.py", line 65, in <module>
    main(parse_arguments(sys.argv[1:]))
  File "inference.py", line 21, in main
    saver = tf.train.import_meta_graph('./meta_graph/my-model.meta')
  File "/Users/anaconda/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1686, in import_meta_graph
    **kwargs)
  File "/Users/anaconda/lib/python3.6/site-packages/tensorflow/python/framework/meta_graph.py", line 504, in import_scoped_meta_graph
    producer_op_list=producer_op_list)
  File "/Users/anaconda/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 311, in import_graph_def
    op_def=op_def)
  File "/Users/anaconda/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Users/anaconda/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): No OpKernel was registered to support Op 'MaxPoolWithArgmax' with these attrs.  Registered devices: [CPU], Registered kernels:
  <no registered kernels>

     [[Node: pool1 = MaxPoolWithArgmax[T=DT_FLOAT, Targmax=DT_INT64, ksize=[1, 2, 2, 1], padding="SAME", strides=[1, 2, 2, 1]](conv1_2)]]

Joker316701882 commented 6 years ago

I'm not sure what happened here. My dependencies are TensorFlow 1.0.0 and Python 3.5.2.

Zumbalamambo commented 6 years ago

Are you running inference on GPU or CPU?

Joker316701882 commented 6 years ago

GPU. This model was trained on a Titan X and it can run on a GTX 940.

saeed68gm commented 6 years ago

I read that this issue occurs because tf.nn.max_pool_with_argmax is only supported on GPU. @Joker316701882, if you post your code for the graph, I can fix the issue.
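
The "Registered devices: [CPU]" part of the error above is consistent with this explanation: TensorFlow only saw a CPU on that machine. A minimal sketch of a pre-flight check, assuming a TensorFlow 1.x install. The helper names gpu_available and local_device_types are hypothetical and not part of the repository; device_lib.list_local_devices() is the TensorFlow call that reports the visible devices.

```python
def gpu_available(device_types):
    """Return True if any visible device type is "GPU"."""
    return any(t == "GPU" for t in device_types)


def local_device_types():
    """List the device types TensorFlow can see, e.g. ["CPU"] or ["CPU", "GPU"]."""
    # Imported lazily so the pure-Python check above works without TensorFlow.
    from tensorflow.python.client import device_lib
    return [d.device_type for d in device_lib.list_local_devices()]
```

Calling gpu_available(local_device_types()) before restoring the checkpoint would let the script stop early with a readable message instead of the kernel error.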

saeed68gm commented 6 years ago

It looks like there is also an issue with the new TensorFlow commits to the Conv2D definition. They changed the property name of stride to "dilations", and it's causing some errors because the code is old. Is there any chance you can publish your code or change it so we can try running it?

DiddyC commented 6 years ago

I ran into the same issue and tried eliminating the GPU dependencies via 'clear_devices=True'. It didn't work. @saeed68gm, did you find any other solution? I'm running Python 2.7 and TF 1.4.
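
For context, clear_devices=True is an argument to tf.train.import_meta_graph and only strips the device placements recorded in the meta graph. It cannot supply a kernel that the installed build lacks, which would explain why the error persists on a CPU-only machine. A minimal sketch of that attempt, assuming TensorFlow 1.x and the paths shown in the traceback; the function name restore_without_device_pins is hypothetical.

```python
def restore_without_device_pins(meta_path, ckpt_dir):
    """Import a meta graph with device placements cleared, then restore weights."""
    import tensorflow as tf  # lazy import; assumes a TensorFlow 1.x install

    # clear_devices=True drops "/gpu:0"-style pins stored in the graph.
    # It does not register missing kernels such as MaxPoolWithArgmax on CPU.
    saver = tf.train.import_meta_graph(meta_path, clear_devices=True)
    sess = tf.Session()
    saver.restore(sess, tf.train.latest_checkpoint(ckpt_dir))
    return sess
```

With the paths from the traceback this would be called as restore_without_device_pins('./meta_graph/my-model.meta', './salience_model').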

LRLLRL commented 6 years ago

I have the same issue. Could you tell me how to solve it?

saeed68gm commented 6 years ago

@LRLLRL @YedidyahD I did some more research, and it looks like the MaxPoolWithArgmax operation is missing from TensorFlow's CPU ops. My suggestion is to modify the TensorFlow source code to add this op. I have posted a patch with a workaround on this thread: https://github.com/tensorflow/tensorflow/issues/6035

It should work with the TF 1.4 source code.

Let me know if you can get it to work.

b10112157 commented 6 years ago

I have the same problem. I'm using Windows 10, Python 3.5.4, and a GTX 1060 3 GB.

But the command python--rgb = animal.jpg
shows an OOM (out of memory) error.

liutianling commented 6 years ago

Installing tensorflow-gpu 1.0.0 seems to solve @Zumbalamambo's problem, but I ran into a memory problem instead. Does it really need 10.9 GB of memory just to test one image?

I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX 1080 Ti
major: 6 minor: 1 memoryClockRate (GHz) 1.683
pciBusID 0000:65:00.0
Total memory: 10.91GiB
Free memory: 10.49GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:65:00.0)
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 10.91G (11713708032 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
F tensorflow/stream_executor/cuda/cuda_dnn.cc:222] Check failed: s.ok() could not find cudnnCreate in cudnn DSO; dlerror: /home/usr/.virtualenvs/salient_detection/lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow.so: undefined symbol: cudnnCreate
Aborted (core dumped)

LRLLRL commented 6 years ago

Sorry, I could not solve this problem.

In reply by email to liutianling's message of Friday, May 4, 2018 11:04 AM:

"@Zumbalamambo Did you solve your problem? I met the same problem."

CindyHXH commented 6 years ago

@liutianling You can fix it by setting gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333) in inference.py.
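
By default TensorFlow 1.x tries to reserve nearly all GPU memory up front, which matches the 10.91G allocation attempt in the log above; it is not the amount a single image needs. A minimal sketch of the suggested cap, assuming TensorFlow 1.x and that inference.py creates its own tf.Session. The helper names memory_budget_bytes and make_capped_session are hypothetical; tf.GPUOptions, tf.ConfigProto, and tf.Session are the TensorFlow 1.x calls.

```python
def memory_budget_bytes(total_bytes, fraction):
    """Approximate bytes TensorFlow may claim when capped at `fraction` of device memory."""
    return int(total_bytes * fraction)


def make_capped_session(fraction=0.333):
    """Create a session limited to `fraction` of each visible GPU's memory."""
    import tensorflow as tf  # lazy import; assumes a TensorFlow 1.x install

    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=fraction)
    return tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
```

For the 11713708032-byte device in the log, memory_budget_bytes(11713708032, 0.333) comes to about 3.6 GiB. The cudnnCreate failure in the same log appears to be a separate cuDNN loading problem that a memory cap would not address.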