I have prepared a small dataset just to try out the network and see how it works. It seems to load the dataset fine and prints "begin training", but after that it just stops and does nothing. Here is what I see on screen:
CUDA_VISIBLE_DEVICES=0 python ./main.py --train_dir=./imgs/train/ --val_dir=./imgs/val/ --image_height=60 --image_width=180 --image_channel=1 --out_channels=64 --num_hidden=128 --batch_size=128 --log_dir=./log/train --num_gpus=1 --mode=train
2018-05-29 11:47:19.300427: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-05-29 11:47:19.954690: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-05-29 11:47:19.955398: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 960M major: 5 minor: 0 memoryClockRate(GHz): 1.176
pciBusID: 0000:01:00.0
totalMemory: 3.95GiB freeMemory: 3.50GiB
2018-05-29 11:47:19.955416: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-05-29 11:47:20.485722: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-05-29 11:47:20.485760: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2018-05-29 11:47:20.485768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2018-05-29 11:47:20.485968: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3237 MB memory) -> physical GPU (device: 0, name: GeForce GTX 960M, pci bus id: 0000:01:00.0, compute capability: 5.0)
=============================begin training=============================
As you can see, training does not begin, and I don't get any errors either.
The script also printed these lines right after the command, before the TensorFlow log output:
feature_h: 4, feature_w: 12
lstm input shape: [128, 12, 256]
loading train data
('size: ', 11)
loading validation data
size: 6
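One thing that may be worth checking: the log reports 11 training images while the command passes --batch_size=128. The sketch below is only an assumption about how the batch count might be computed (floor division that drops the partial batch); I have not confirmed that main.py does this, and `batches_per_epoch` is a hypothetical helper, not a function from the repository. If the count is computed this way, it comes out as zero and a loop over batches would never execute, which would match the silent stop after "begin training".

```python
def batches_per_epoch(num_samples: int, batch_size: int) -> int:
    """Hypothetical helper: number of full batches when the remainder is dropped."""
    return num_samples // batch_size


train_size = 11    # from the "size: 11" line in the log
batch_size = 128   # from --batch_size=128 on the command line

num_batches = batches_per_epoch(train_size, batch_size)
print("batches per epoch:", num_batches)

# With zero batches, the body of this loop is never entered,
# so no training step runs and no error is raised.
for step in range(num_batches):
    print("training step", step)
```

If that is what is happening, a batch size no larger than the dataset (for example --batch_size=4 with 11 images) would give at least one batch per epoch under the same assumption.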