How to training... - Githubissues

gnakagami commented 6 years ago

I'm interested in your project.

I would like to try it. Could you teach me how to use it?

1. I prepared sample images (size is 768x1024):
   - ./train ... training images
   - ./test ... test images
2. Labeling: I labeled the images with label_ui.py and materialise_label_db.py:
   - ./label ... labels for the train and test images
3. Training <-- error occurs here. I tried to train with train.py, but an error occurred.

$ python train.py --train-image-dir ./bee/data/train --test-image-dir ./bee/data/test --label-dir ./bee/data/label --run ./bee/bnn/ckpts
opts Namespace(base_filter_size=8, batch_size=32, flip_left_right=False, label_dir='./bee/data/label', learning_rate=0.001, no_use_batch_norm=False, no_use_skip_connections=False, patch_fraction=2, random_rotate=False, run='./bee/bnn/ckpts', secs=None, steps=100000, test_image_dir='.bee/data/test', train_image_dir='./bee/data/train', train_steps=100)
len(rgb_filenames) 8 CACHE
len(rgb_filenames) 2 CACHE
(?, 1024, 768, 3) (?, 512, 384, 1)
patch train model...
input (?, 512, 384, 3) #589824
e1 (?, 256, 192, 8) #393216
e2 (?, 128, 96, 16) #196608
e3 (?, 64, 48, 32) #98304
e4 (?, 32, 24, 64) #49152
d1 (?, 64, 48, 32) #98304
d1+e3 (?, 64, 48, 64) #196608
d2 (?, 128, 96, 16) #196608
d2+e2 (?, 128, 96, 32) #393216
d3 (?, 256, 192, 8) #393216
d3+e1 (?, 256, 192, 16) #786432
logits (?, 256, 192, 1) #49152
full res test model...
input (?, 1024, 768, 3) #2359296
e1 (?, 512, 384, 8) #1572864
e2 (?, 256, 192, 16) #786432
e3 (?, 128, 96, 32) #393216
e4 (?, 64, 48, 64) #196608
d1 (?, 128, 96, 32) #393216
d1+e3 (?, 128, 96, 64) #786432
d2 (?, 256, 192, 16) #786432
d2+e2 (?, 256, 192, 32) #1572864
d3 (?, 512, 384, 8) #1572864
d3+e1 (?, 512, 384, 16) #3145728
logits (?, 512, 384, 1) #196608
Traceback (most recent call last):
  File "/home/.pyenv/versions/anaconda3-5.2.0/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
    return fn(*args)
  File "/home/.pyenv/versions/anaconda3-5.2.0/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/.pyenv/versions/anaconda3-5.2.0/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 3145728 values, but the requested shape has 2359296
  [[Node: Reshape = Reshape[T=DT_UINT8, Tshape=DT_INT32](decode_image/cond_jpeg/Merge, Reshape/shape)]]
  [[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,512,384,3], [?,256,192,1]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 101, in <module>
    _, xl, dl = sess.run([train_op, train_model.xent_loss, train_model.dice_loss])
  File "/home/.pyenv/versions/anaconda3-5.2.0/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/home/.pyenv/versions/anaconda3-5.2.0/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/.pyenv/versions/anaconda3-5.2.0/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/home/.pyenv/versions/anaconda3-5.2.0/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 3145728 values, but the requested shape has 2359296
  [[Node: Reshape = Reshape[T=DT_UINT8, Tshape=DT_INT32](decode_image/cond_jpeg/Merge, Reshape/shape)]]
  [[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,512,384,3], [?,256,192,1]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
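
The two numbers in the error message can be checked by hand. The sketch below multiplies out the requested shape from the log and then tests two candidate explanations for the larger count. Both candidates are assumptions made for illustration; the thread does not confirm which one (if either) applies to these images.

```python
# Shape the input pipeline asks for, taken from the log: (height, width, channels).
REQUESTED_SHAPE = (1024, 768, 3)
# Number of values the decoded image actually held, from the error message.
ACTUAL_VALUES = 3145728

# Hypothetical explanations to test; neither is confirmed in the thread.
CANDIDATES = {
    "1024x768 with 4 channels (e.g. RGBA)": (1024, 768, 4),
    "1024x1024 with 3 channels (RGB)": (1024, 1024, 3),
}


def num_values(shape):
    """Multiply out the dimensions of a shape tuple."""
    total = 1
    for dim in shape:
        total *= dim
    return total


def matching_candidates(actual_values, candidates):
    """Return the names of candidate shapes whose size equals actual_values."""
    return [name for name, shape in candidates.items()
            if num_values(shape) == actual_values]


print("requested values:", num_values(REQUESTED_SHAPE))
for name in matching_candidates(ACTUAL_VALUES, CANDIDATES):
    print("consistent with:", name)
```

Both candidates multiply out to 3145728, so the count alone cannot tell them apart. Inspecting one of the training images (its dimensions and whether it has an alpha channel) would settle it.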

jamesthesken commented 6 years ago

Make sure to run data.py. I also resized the images using `rgb = tf.image.resize_images(rgb, [1024, 768])` within the function decode_images.

JoeRu commented 6 years ago

Hi, I ran into the same initial situation. When I try to run data.py, I get nearly the same error.

I'm really having a hard time getting started.

So, just to get things right: 1) start with data.py? 2) then do the labels? 3) then train.py?

The code is also rather sparse on comments, at least for me.

`➜ bnn git:(master) ✗ python data.py --image-dir ./datas/train --label-dir ./datas/labels
Namespace(batch_size=4, distort=False, image_dir='./datas/train', label_dir='./datas/labels', patch_fraction=1)
2018-07-07 22:31:22.954142: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
len(rgb_filenames) 1128 NO CACHE

batch 0
2018-07-07 22:31:25.008307: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628130137.png; No such file or directory
2018-07-07 22:31:25.014733: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628125652.png; No such file or directory
2018-07-07 22:31:25.017984: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628150222.png; No such file or directory
2018-07-07 22:31:25.018808: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628170908.png; No such file or directory
2018-07-07 22:31:25.019328: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628154437.png; No such file or directory
2018-07-07 22:31:25.049134: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628122145.png; No such file or directory
2018-07-07 22:31:25.060085: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628142828.png; No such file or directory
2018-07-07 22:31:25.065693: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628120020.png; No such file or directory
2018-07-07 22:31:25.079618: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628101423.png; No such file or directory
2018-07-07 22:31:25.092111: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628163401.png; No such file or directory
2018-07-07 22:31:25.100993: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628181624.png; No such file or directory
2018-07-07 22:31:25.101877: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628170012.png; No such file or directory
2018-07-07 22:31:25.116720: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628101311.png; No such file or directory
2018-07-07 22:31:25.130468: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628084617.png; No such file or directory
2018-07-07 22:31:25.140526: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628111056.png; No such file or directory
2018-07-07 22:31:25.142666: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628153319.png; No such file or directory
2018-07-07 22:31:25.142666: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628114825.png; No such file or directory
2018-07-07 22:31:25.171338: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628190324.png; No such file or directory
2018-07-07 22:31:25.178863: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628152049.png; No such file or directory
2018-07-07 22:31:25.180937: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628082304.png; No such file or directory
2018-07-07 22:31:25.183695: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628103139.png; No such file or directory
2018-07-07 22:31:25.183882: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628111617.png; No such file or directory
2018-07-07 22:31:25.183984: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at whole_file_read_ops.cc:114 : Not found: ./datas/labels/img20180628123638.png; No such file or directory
Traceback (most recent call last):
  File "/MYZFS/Personal/joe/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
    return fn(*args)
  File "/MYZFS/Personal/joe/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/MYZFS/Personal/joe/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: ./datas/labels/img20180628150222.png; No such file or directory
  [[Node: ReadFile_1 = ReadFile]]
  [[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,1024,768,3], [?,512,384,1]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "data.py", line 140, in <module>
    img_batch, xys_batch = sess.run([imgs, xyss])
  File "/MYZFS/Personal/joe/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/MYZFS/Personal/joe/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/MYZFS/Personal/joe/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/MYZFS/Personal/joe/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: ./datas/labels/img20180628150222.png; No such file or directory
  [[Node: ReadFile_1 = ReadFile]]
  [[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[?,1024,768,3], [?,512,384,1]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]`
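
The warnings in this log all have the same form: an image in the image directory has no matching `.png` in the label directory. A minimal sketch of a pre-flight check follows, using only the standard library. It assumes each label is named `<image stem>.png`, which is inferred from the paths in the log rather than from the project's code; the helper name `missing_labels` and the list of image extensions are hypothetical.

```python
import os

# Assumption: which file extensions count as input images.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def missing_labels(image_dir, label_dir):
    """Return sorted image file names that have no <stem>.png in label_dir."""
    missing = []
    for name in sorted(os.listdir(image_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in IMAGE_EXTENSIONS:
            continue  # skip anything that is not an image
        label_path = os.path.join(label_dir, stem + ".png")
        if not os.path.isfile(label_path):
            missing.append(name)
    return missing
```

Calling `missing_labels("./datas/train", "./datas/labels")` before starting data.py would list every image that still needs a label, instead of failing on the first batch.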

matpalm commented 6 years ago

added some repro steps to the README; hadn't actually done any yet :/

also added --height and --width where required to remove the assumption on image size / orientation
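
As an illustration of what passing the size on the command line can look like, here is a small `argparse` sketch. It is not the code from the repository: only the flag names `--height` and `--width` come from the comment above, while `build_parser`, `input_shape`, and the choice to make both flags required are assumptions made for the example.

```python
import argparse


def build_parser():
    """Build a parser with explicit image-size flags (hypothetical sketch)."""
    parser = argparse.ArgumentParser()
    # Both flags are required here so that no image size is assumed silently.
    parser.add_argument("--height", type=int, required=True,
                        help="input image height in pixels")
    parser.add_argument("--width", type=int, required=True,
                        help="input image width in pixels")
    return parser


def input_shape(opts, channels=3):
    """Return (height, width, channels) for the parsed options."""
    return (opts.height, opts.width, channels)


# Example: a portrait image that is 768 pixels wide and 1024 pixels high.
opts = build_parser().parse_args(["--height", "1024", "--width", "768"])
print(input_shape(opts))
```

Deriving the shape from the flags, rather than hard-coding it, lets the same script handle both portrait and landscape images.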

matpalm / bnn

How to training... #3