TypeError: Fetch argument None of None has invalid type <type 'NoneType'>, must be a string or Tensor.

MartinThoma commented 8 years ago

When I try to train I still get an error (python test_fcn32_vgg.py works fine, though):

python train.py
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
2016-07-01 17:20:06,111 INFO No environment variable 'TV_PLUGIN_DIR' found. Set to '/home/moose/tv-plugins'.
2016-07-01 17:20:06,111 INFO No environment variable 'TV_STEP_SHOW' found. Set to '50'.
2016-07-01 17:20:06,111 INFO No environment variable 'TV_STEP_EVAL' found. Set to '250'.
2016-07-01 17:20:06,111 INFO No environment variable 'TV_STEP_WRITE' found. Set to '1000'.
2016-07-01 17:20:06,111 INFO No environment variable 'TV_MAX_KEEP' found. Set to '10'.
2016-07-01 17:20:06,111 INFO No environment variable 'TV_STEP_STR' found. Set to 'Step {step}/{total_steps}: loss = {loss_value:.2f} ( {sec_per_batch:.3f} sec (per Batch); {examples_per_sec:.1f} examples/sec)'.
2016-07-01 17:20:06,111 INFO f: <open file 'hypes/medseg.json', mode 'r' at 0x7f4c75c018a0>
2016-07-01 17:20:06,112 INFO Initialize training folder
2016-07-01 17:20:06,113 INFO Start training
/home/moose/GitHub/private/MediSeg/AP5/tensorflow_fcn/vgg16.npy
npy file loaded
Layer name: conv1_1
Layer shape: (3, 3, 3, 64)
INFO:tensorflow:Created variable conv1_1/filter:0 with shape (3, 3, 3, 64) and init <function _initializer at 0x7f4c6d4d6cf8>
2016-07-01 17:20:06,363 INFO Created variable conv1_1/filter:0 with shape (3, 3, 3, 64) and init <function _initializer at 0x7f4c6d4d6cf8>
INFO:tensorflow:Created variable conv1_1/biases:0 with shape (64,) and init <function _initializer at 0x7f4c6d4d6cf8>
2016-07-01 17:20:06,366 INFO Created variable conv1_1/biases:0 with shape (64,) and init <function _initializer at 0x7f4c6d4d6cf8>
Layer name: conv1_2
Layer shape: (3, 3, 64, 64)
INFO:tensorflow:Created variable conv1_2/filter:0 with shape (3, 3, 64, 64) and init <function _initializer at 0x7f4c6d4e8a28>
2016-07-01 17:20:06,372 INFO Created variable conv1_2/filter:0 with shape (3, 3, 64, 64) and init <function _initializer at 0x7f4c6d4e8a28>
INFO:tensorflow:Created variable conv1_2/biases:0 with shape (64,) and init <function _initializer at 0x7f4c6d4e8a28>
2016-07-01 17:20:06,375 INFO Created variable conv1_2/biases:0 with shape (64,) and init <function _initializer at 0x7f4c6d4e8a28>
Layer name: conv2_1
Layer shape: (3, 3, 64, 128)
INFO:tensorflow:Created variable conv2_1/filter:0 with shape (3, 3, 64, 128) and init <function _initializer at 0x7f4c6d4c3b90>
2016-07-01 17:20:06,381 INFO Created variable conv2_1/filter:0 with shape (3, 3, 64, 128) and init <function _initializer at 0x7f4c6d4c3b90>
INFO:tensorflow:Created variable conv2_1/biases:0 with shape (128,) and init <function _initializer at 0x7f4c6c1e4c08>
2016-07-01 17:20:06,384 INFO Created variable conv2_1/biases:0 with shape (128,) and init <function _initializer at 0x7f4c6c1e4c08>
Layer name: conv2_2
Layer shape: (3, 3, 128, 128)
INFO:tensorflow:Created variable conv2_2/filter:0 with shape (3, 3, 128, 128) and init <function _initializer at 0x7f4c6c24ff50>
2016-07-01 17:20:06,390 INFO Created variable conv2_2/filter:0 with shape (3, 3, 128, 128) and init <function _initializer at 0x7f4c6c24ff50>
INFO:tensorflow:Created variable conv2_2/biases:0 with shape (128,) and init <function _initializer at 0x7f4c6c1a9398>
2016-07-01 17:20:06,393 INFO Created variable conv2_2/biases:0 with shape (128,) and init <function _initializer at 0x7f4c6c1a9398>
Layer name: conv3_1
Layer shape: (3, 3, 128, 256)
INFO:tensorflow:Created variable conv3_1/filter:0 with shape (3, 3, 128, 256) and init <function _initializer at 0x7f4c6c217c08>
2016-07-01 17:20:06,400 INFO Created variable conv3_1/filter:0 with shape (3, 3, 128, 256) and init <function _initializer at 0x7f4c6c217c08>
INFO:tensorflow:Created variable conv3_1/biases:0 with shape (256,) and init <function _initializer at 0x7f4c6c217c08>
2016-07-01 17:20:06,403 INFO Created variable conv3_1/biases:0 with shape (256,) and init <function _initializer at 0x7f4c6c217c08>
Layer name: conv3_2
Layer shape: (3, 3, 256, 256)
INFO:tensorflow:Created variable conv3_2/filter:0 with shape (3, 3, 256, 256) and init <function _initializer at 0x7f4c6c29bf50>
2016-07-01 17:20:06,410 INFO Created variable conv3_2/filter:0 with shape (3, 3, 256, 256) and init <function _initializer at 0x7f4c6c29bf50>
INFO:tensorflow:Created variable conv3_2/biases:0 with shape (256,) and init <function _initializer at 0x7f4c6c1a2668>
2016-07-01 17:20:06,413 INFO Created variable conv3_2/biases:0 with shape (256,) and init <function _initializer at 0x7f4c6c1a2668>
Layer name: conv3_3
Layer shape: (3, 3, 256, 256)
INFO:tensorflow:Created variable conv3_3/filter:0 with shape (3, 3, 256, 256) and init <function _initializer at 0x7f4c6c190ed8>
2016-07-01 17:20:06,420 INFO Created variable conv3_3/filter:0 with shape (3, 3, 256, 256) and init <function _initializer at 0x7f4c6c190ed8>
INFO:tensorflow:Created variable conv3_3/biases:0 with shape (256,) and init <function _initializer at 0x7f4c6c190ed8>
2016-07-01 17:20:06,423 INFO Created variable conv3_3/biases:0 with shape (256,) and init <function _initializer at 0x7f4c6c190ed8>
Layer name: conv4_1
Layer shape: (3, 3, 256, 512)
INFO:tensorflow:Created variable conv4_1/filter:0 with shape (3, 3, 256, 512) and init <function _initializer at 0x7f4c6c1dda28>
2016-07-01 17:20:06,432 INFO Created variable conv4_1/filter:0 with shape (3, 3, 256, 512) and init <function _initializer at 0x7f4c6c1dda28>
INFO:tensorflow:Created variable conv4_1/biases:0 with shape (512,) and init <function _initializer at 0x7f4c6c11c9b0>
2016-07-01 17:20:06,435 INFO Created variable conv4_1/biases:0 with shape (512,) and init <function _initializer at 0x7f4c6c11c9b0>
Layer name: conv4_2
Layer shape: (3, 3, 512, 512)
INFO:tensorflow:Created variable conv4_2/filter:0 with shape (3, 3, 512, 512) and init <function _initializer at 0x7f4c6c0c0f50>
2016-07-01 17:20:06,445 INFO Created variable conv4_2/filter:0 with shape (3, 3, 512, 512) and init <function _initializer at 0x7f4c6c0c0f50>
INFO:tensorflow:Created variable conv4_2/biases:0 with shape (512,) and init <function _initializer at 0x7f4c6c0c0f50>
2016-07-01 17:20:06,448 INFO Created variable conv4_2/biases:0 with shape (512,) and init <function _initializer at 0x7f4c6c0c0f50>
Layer name: conv4_3
Layer shape: (3, 3, 512, 512)
INFO:tensorflow:Created variable conv4_3/filter:0 with shape (3, 3, 512, 512) and init <function _initializer at 0x7f4c6c0d28c0>
2016-07-01 17:20:06,458 INFO Created variable conv4_3/filter:0 with shape (3, 3, 512, 512) and init <function _initializer at 0x7f4c6c0d28c0>
INFO:tensorflow:Created variable conv4_3/biases:0 with shape (512,) and init <function _initializer at 0x7f4c6c09ad70>
2016-07-01 17:20:06,461 INFO Created variable conv4_3/biases:0 with shape (512,) and init <function _initializer at 0x7f4c6c09ad70>
Layer name: conv5_1
Layer shape: (3, 3, 512, 512)
INFO:tensorflow:Created variable conv5_1/filter:0 with shape (3, 3, 512, 512) and init <function _initializer at 0x7f4c6c0c0f50>
2016-07-01 17:20:06,472 INFO Created variable conv5_1/filter:0 with shape (3, 3, 512, 512) and init <function _initializer at 0x7f4c6c0c0f50>
INFO:tensorflow:Created variable conv5_1/biases:0 with shape (512,) and init <function _initializer at 0x7f4c6c0c0f50>
2016-07-01 17:20:06,475 INFO Created variable conv5_1/biases:0 with shape (512,) and init <function _initializer at 0x7f4c6c0c0f50>
Layer name: conv5_2
Layer shape: (3, 3, 512, 512)
INFO:tensorflow:Created variable conv5_2/filter:0 with shape (3, 3, 512, 512) and init <function _initializer at 0x7f4c6c11c9b0>
2016-07-01 17:20:06,483 INFO Created variable conv5_2/filter:0 with shape (3, 3, 512, 512) and init <function _initializer at 0x7f4c6c11c9b0>
INFO:tensorflow:Created variable conv5_2/biases:0 with shape (512,) and init <function _initializer at 0x7f4c6c11c9b0>
2016-07-01 17:20:06,486 INFO Created variable conv5_2/biases:0 with shape (512,) and init <function _initializer at 0x7f4c6c11c9b0>
Layer name: conv5_3
Layer shape: (3, 3, 512, 512)
INFO:tensorflow:Created variable conv5_3/filter:0 with shape (3, 3, 512, 512) and init <function _initializer at 0x7f4c6bf07f50>
2016-07-01 17:20:06,495 INFO Created variable conv5_3/filter:0 with shape (3, 3, 512, 512) and init <function _initializer at 0x7f4c6bf07f50>
INFO:tensorflow:Created variable conv5_3/biases:0 with shape (512,) and init <function _initializer at 0x7f4c6be93398>
2016-07-01 17:20:06,498 INFO Created variable conv5_3/biases:0 with shape (512,) and init <function _initializer at 0x7f4c6be93398>
Layer name: fc6
Layer shape: [7, 7, 512, 4096]
INFO:tensorflow:Created variable fc6/weights:0 with shape (7, 7, 512, 4096) and init <function _initializer at 0x7f4c6be83c08>
2016-07-01 17:20:06,658 INFO Created variable fc6/weights:0 with shape (7, 7, 512, 4096) and init <function _initializer at 0x7f4c6be83c08>
INFO:tensorflow:Created variable fc6/biases:0 with shape (4096,) and init <function _initializer at 0x7f4c6be83c08>
2016-07-01 17:20:06,660 INFO Created variable fc6/biases:0 with shape (4096,) and init <function _initializer at 0x7f4c6be83c08>
Layer name: fc7
Layer shape: [1, 1, 4096, 4096]
INFO:tensorflow:Created variable fc7/weights:0 with shape (1, 1, 4096, 4096) and init <function _initializer at 0x7f4c6bde8e60>
2016-07-01 17:20:06,697 INFO Created variable fc7/weights:0 with shape (1, 1, 4096, 4096) and init <function _initializer at 0x7f4c6bde8e60>
INFO:tensorflow:Created variable fc7/biases:0 with shape (4096,) and init <function _initializer at 0x7f4c6bde8e60>
2016-07-01 17:20:06,699 INFO Created variable fc7/biases:0 with shape (4096,) and init <function _initializer at 0x7f4c6bde8e60>
INFO:tensorflow:Created variable score_fr/weights:0 with shape (1, 1, 4096, 2) and init <function _initializer at 0x7f4c6be83c08>
2016-07-01 17:20:06,710 INFO Created variable score_fr/weights:0 with shape (1, 1, 4096, 2) and init <function _initializer at 0x7f4c6be83c08>
INFO:tensorflow:Created variable score_fr/biases:0 with shape (2,) and init <function _initializer at 0x7f4c6bd79f50>
2016-07-01 17:20:06,712 INFO Created variable score_fr/biases:0 with shape (2,) and init <function _initializer at 0x7f4c6bd79f50>
INFO:tensorflow:Created variable up/up_filter:0 with shape (64, 64, 2, 2) and init <function _initializer at 0x7f4c6be83c08>
2016-07-01 17:20:06,725 INFO Created variable up/up_filter:0 with shape (64, 64, 2, 2) and init <function _initializer at 0x7f4c6be83c08>
/home/moose/GitHub/private/MediSeg/AP5/tensorflow_fcn/vgg16.npy
npy file loaded
Layer name: conv1_1
Layer shape: (3, 3, 3, 64)
Layer name: conv1_2
Layer shape: (3, 3, 64, 64)
Layer name: conv2_1
Layer shape: (3, 3, 64, 128)
Layer name: conv2_2
Layer shape: (3, 3, 128, 128)
Layer name: conv3_1
Layer shape: (3, 3, 128, 256)
Layer name: conv3_2
Layer shape: (3, 3, 256, 256)
Layer name: conv3_3
Layer shape: (3, 3, 256, 256)
Layer name: conv4_1
Layer shape: (3, 3, 256, 512)
Layer name: conv4_2
Layer shape: (3, 3, 512, 512)
Layer name: conv4_3
Layer shape: (3, 3, 512, 512)
Layer name: conv5_1
Layer shape: (3, 3, 512, 512)
Layer name: conv5_2
Layer shape: (3, 3, 512, 512)
Layer name: conv5_3
Layer shape: (3, 3, 512, 512)
Layer name: fc6
Layer shape: [7, 7, 512, 4096]
Layer name: fc7
Layer shape: [1, 1, 4096, 4096]
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:900] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties: 
name: GeForce GTX TITAN Black
major: 3 minor: 5 memoryClockRate (GHz) 0.98
pciBusID 0000:01:00.0
Total memory: 6.00GiB
Free memory: 5.88GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:755] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN Black, pci bus id: 0000:01:00.0)
2016-07-01 17:20:13,498 INFO Step 0/10000: loss = 1.46 ( 0.045 sec (per Batch); 22.1 examples/sec)
2016-07-01 17:21:00,679 INFO Step 50/10000: loss = 0.62 ( 0.922 sec (per Batch); 1.1 examples/sec)
2016-07-01 17:21:46,826 INFO Step 100/10000: loss = 0.59 ( 0.907 sec (per Batch); 1.1 examples/sec)
2016-07-01 17:22:32,028 INFO Step 150/10000: loss = 0.59 ( 0.888 sec (per Batch); 1.1 examples/sec)
2016-07-01 17:23:20,413 INFO Step 200/10000: loss = 0.49 ( 0.951 sec (per Batch); 1.1 examples/sec)
2016-07-01 17:24:07,795 INFO Doing Evaluate with Training Data.
2016-07-01 17:24:07,795 WARNING Passing eval_op directly is deprecated. Pass a list of tuples instead.
2016-07-01 17:24:07,795 INFO Data: train  Num examples:  10000 
Traceback (most recent call last):
  File "train.py", line 76, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "train.py", line 72, in main
    train.do_training(hypes)
  File "/home/moose/GitHub/TensorVision/tensorvision/train.py", line 339, in do_training
    graph_ops, sess_coll)
  File "/home/moose/GitHub/TensorVision/tensorvision/train.py", line 286, in run_training_step
    _do_evaluation(hypes, step, sess_coll, eval_dict)
  File "/home/moose/GitHub/TensorVision/tensorvision/train.py", line 255, in _do_evaluation
    sess=sess)
  File "/home/moose/GitHub/TensorVision/tensorvision/core.py", line 209, in do_eval
    results = sess.run(eval_op)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 340, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 523, in _run
    processed_fetches = self._process_fetches(fetches)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 493, in _process_fetches
    % (subfetch, fetch, type(subfetch), str(e)))
TypeError: Fetch argument None of None has invalid type <type 'NoneType'>, must be a string or Tensor. (Can not convert a NoneType into a Tensor or Operation.)

MartinThoma commented 8 years ago

I think it might be this problem: http://stackoverflow.com/a/33775312/562769

MartinThoma commented 8 years ago

build_training_graph returns

({'train': <tensorflow.python.ops.data_flow_ops.FIFOQueue object at 0x7fdaa966b810>,
  'val': <tensorflow.python.ops.data_flow_ops.FIFOQueue object at 0x7fdaa795c810>},
 <tensorflow.python.framework.ops.Operation object at 0x7fdaa88b9f90>,
 <tf.Tensor 'loss/total_loss:0' shape=() dtype=float32>,
 {'train': None, 'val': None})

MartinThoma commented 8 years ago

In AP5/model/objective.py, evaluation always returns None.
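To illustrate the failure mode: the last element of the tuple above maps both 'train' and 'val' to None, and that None ends up being passed to sess.run(). A minimal sketch of a guard follows. It is not the patch from any TensorVision commit; run_eval_ops and FakeSession are hypothetical names introduced here, and FakeSession only stands in for a real session so the sketch runs without TensorFlow.

```python
def run_eval_ops(sess, eval_ops):
    """Run only the evaluation ops that are not None.

    `sess` is any object with a `run(fetch)` method (e.g. a TensorFlow session).
    `eval_ops` maps a phase name ('train', 'val') to an op or to None.
    """
    results = {}
    for phase, op in eval_ops.items():
        if op is None:
            # Nothing to fetch for this phase. Passing None to sess.run()
            # is what triggers the "Fetch argument None" TypeError.
            results[phase] = None
            continue
        results[phase] = sess.run(op)
    return results


class FakeSession(object):
    """Hypothetical stand-in for a session, used only for this sketch."""

    def run(self, fetch):
        return "ran %s" % fetch


print(run_eval_ops(FakeSession(), {'train': None, 'val': None}))
```

With such a guard, phases without an evaluation op are skipped instead of crashing the training loop.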

MarvinTeichmann commented 8 years ago

Are you using TensorVision @ fee9114? Commit fee9114 contains a patch to handle the case of None as return value for evaluation. I have not merged this patch/feature into main TensorVision.

Alternatively, you can remove line 81 of AP5/model/objective.py. This will slow down training but run more evaluation (which might be beneficial in your case). I am not able to test the code here, but both solutions should work.

MartinThoma commented 8 years ago

I downloaded fee9114 as a zip (because checking it out was not possible). Now I get:

2016-07-05 16:41:57,531 INFO Step 0/1000: loss = 1.10 ( 0.095 sec (per Batch); 10.5 examples/sec)
2016-07-05 16:44:05,930 INFO Step 50/1000: loss = 1.10 ( 2.509 sec (per Batch); 0.4 examples/sec)
2016-07-05 16:45:57,487 INFO Step 100/1000: loss = 0.51 ( 2.190 sec (per Batch); 0.5 examples/sec)
2016-07-05 16:47:33,242 INFO Step 150/1000: loss = 0.52 ( 1.878 sec (per Batch); 0.5 examples/sec)
2016-07-05 16:49:07,895 INFO Step 200/1000: loss = 0.76 ( 1.856 sec (per Batch); 0.5 examples/sec)
2016-07-05 16:50:35,459 INFO Doing Evaluate with Training Data.
2016-07-05 16:50:35,460 INFO Doing Evaluation with Testing Data.
2016-07-05 16:50:35,460 INFO Doing Python Evaluation.
Traceback (most recent call last):
  File "train.py", line 76, in <module>
    tf.app.run()
  File "/home/moose/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "train.py", line 72, in main
    train.do_training(hypes)
  File "/home/moose/GitHub/private/MediSeg/AP5/tensorvision/train.py", line 393, in do_training
    image_pl, softmax)
  File "/home/moose/GitHub/private/MediSeg/AP5/tensorvision/train.py", line 326, in run_training_step
    image_pl, softmax)
  File "/home/moose/GitHub/private/MediSeg/AP5/tensorvision/train.py", line 294, in _do_python_evaluation
    eval_dict, images = objective.tensor_eval(hypes, sess, image_pl, softmax)
  File "/home/moose/GitHub/private/MediSeg/AP5/hypes/../model/objective.py", line 155, in tensor_eval
    FN, FP, posNum, negNum = eval_image(hypes, gt_image, output_im)
  File "/home/moose/GitHub/private/MediSeg/AP5/hypes/../model/objective.py", line 115, in eval_image
    validArea=None)
  File "/home/moose/GitHub/private/MediSeg/AP5/utils/kitti_devkit/seg_utils.py", line 53, in evalExp
    assert len(gtBin.shape) == 2, 'Wrong size of input prob map'
AssertionError: Wrong size of input prob map

MartinThoma commented 8 years ago

gtBin.shape is (480, 640, 4) in that case.

MartinThoma commented 8 years ago

Ok, I think I might have found the problem. In tensor_eval you should flatten the image:

gt_image = scp.misc.imread(gt_file, flatten=True)

However, this makes it obvious that I should work on https://github.com/TensorVision/TensorVision/issues/42 so that we can have a much cleaner approach :-)

edit: Yes, it works :-)
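For reference, here is a rough NumPy sketch of what flattening does to the shape. It assumes an RGB or RGBA input and uses the common ITU-R 601 luma weights; to_grayscale is a hypothetical helper, not a scipy function, and it only approximates what flatten=True does. It may also be useful because scipy.misc.imread is not available in recent SciPy releases.

```python
import numpy as np

# ITU-R 601 luma weights for the R, G and B channels.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])


def to_grayscale(image):
    """Return a 2-D float array for a 2-D, RGB or RGBA image array."""
    image = np.asarray(image, dtype=np.float64)
    if image.ndim == 2:
        # Already a single-channel image.
        return image
    # Drop the alpha channel (if any) and take the weighted sum of R, G, B.
    return image[..., :3].dot(LUMA_WEIGHTS)


# Synthetic ground truth with the shape reported above: (480, 640, 4).
rgba = np.zeros((480, 640, 4), dtype=np.uint8)
rgba[..., 3] = 255          # fully opaque
rgba[:240, :, :3] = 255     # top half white, bottom half black

gray = to_grayscale(rgba)
assert gray.ndim == 2       # this is the property evalExp asserts on
print(gray.shape)
```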

MarvinTeichmann commented 8 years ago

OK, it turned out that fixing this was as easy as updating the TensorVision submodule to the head of python_eval. @MartinThoma In my runs gt_image always has shape (480, 640), i.e. gt_image is loaded as a greyscale image. Do you have a different/older scipy version? And why should https://github.com/TensorVision/TensorVision/issues/42 fix this? Are you planning to implement checks, testing whether gt is always loaded as greyscale?

MartinThoma commented 8 years ago

Ah, the difference might be that I edited the labels to be white / black instead of those three grayscale values (2 for instruments). That changed the color mode to RGB.

> Do you have a different/older scipy version?

Unlikely. My scipy.__version__ is 0.17.1.

> And why should TensorVision/TensorVision#42 fix this? Are you planning to implement checks, testing whether gt is always loaded as greyscale?

Quite the opposite ;-) I make sure that it is always loaded as RGB. But with the classes attribute in hypes the TensorVision code makes sure the numpy array elements are class indices, not colors. So the user does not have to deal with this stuff.
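A rough sketch of that idea, assuming a list of RGB colors where the position in the list is the class index. The function name, the class_colors format and the -1 marker for unknown colors are assumptions made for illustration, not the actual TensorVision implementation or hypes format.

```python
import numpy as np


def colors_to_class_indices(rgb_image, class_colors):
    """Map each pixel color to the index of its class.

    `class_colors` is a hypothetical list of (R, G, B) tuples; the position
    in the list is the class index. Pixels matching no class become -1.
    """
    height, width = rgb_image.shape[:2]
    labels = np.full((height, width), -1, dtype=np.int64)
    for index, color in enumerate(class_colors):
        # True where all three channels equal this class color.
        mask = np.all(rgb_image[..., :3] == np.array(color), axis=-1)
        labels[mask] = index
    return labels


# Black background (class 0) and white instruments (class 1).
image = np.zeros((2, 2, 3), dtype=np.uint8)
image[0, 1] = (255, 255, 255)
image[1, 0] = (255, 255, 255)

labels = colors_to_class_indices(image, [(0, 0, 0), (255, 255, 255)])
print(labels)
```

The result is a 2-D array of class indices regardless of the color mode the label image was saved in.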

MarvinTeichmann / MediSeg

TypeError: Fetch argument None of None has invalid type <type 'NoneType'>, must be a string or Tensor. #2