eragonruan / text-detection-ctpn

text detection mainly based on ctpn model in tensorflow, id card detect, connectionist text proposal network
MIT License
3.43k stars 1.33k forks

WARNING:tensorflow:Variable Conv/weights missing in checkpoint data/vgg_16.ckpt (new version) #272

Open hcnhatnam opened 5 years ago

hcnhatnam commented 5 years ago

When I run main/train.py with the new version of the code, I get the following output:

/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gradients_impl.py:112: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
WARNING:tensorflow:Variable Conv/weights missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable Conv/biases missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable BiLSTM/bidirectional_rnn/fw/lstm_cell/kernel missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable BiLSTM/bidirectional_rnn/fw/lstm_cell/bias missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable BiLSTM/bidirectional_rnn/bw/lstm_cell/kernel missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable BiLSTM/bidirectional_rnn/bw/lstm_cell/bias missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable BiLSTM/weights missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable BiLSTM/biases missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable bbox_pred/weights missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable bbox_pred/biases missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable cls_pred/weights missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable cls_pred/biases missing in checkpoint data/vgg_16.ckpt
2019-01-13 14:04:34.044515: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-01-13 14:04:34.044907: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:04.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
2019-01-13 14:04:34.044943: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-01-13 14:04:34.354396: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-01-13 14:04:34.354475: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2019-01-13 14:04:34.354491: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2019-01-13 14:04:34.354732: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10869 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
Find 3422 images
Find 3422 images
3422 training images in data/dataset/mlt/
3422 training images in data/dataset/mlt/
Find 3422 images
3422 training images in data/dataset/mlt/
Find 3422 images
3422 training images in data/dataset/mlt

I have waited for more than 3 hours, but the output is still stuck at "3422 training images in data/dataset/mlt". Can someone please help me with this? Are the WARNING:tensorflow messages the cause?

NamNguyenThanh commented 5 years ago

I have the same problem. Waiting for any solution

eragonruan commented 5 years ago

@hcnhatnam @NamNguyenThanh Hi, the warnings can be ignored: VGG is only used as a pretrained model, and some unused parameters remain. As for the second problem, the training process did not start. Check the dataset path; the cause may be in utils/dataset/data_provider.py at lines 55 and 57. You should check whether the data has been generated correctly.
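
One way to act on the "check the dataset path" advice is a small standalone script that counts images and reports those without a label file. This is only a sketch: the `image` and `label` subfolder names and the "image name plus `.txt`" label naming are assumptions made for illustration, not something confirmed in this thread, and `find_unlabelled_images` is a hypothetical helper that is not part of the repository.

```python
import os


def find_unlabelled_images(data_folder, image_dir="image", label_dir="label",
                           exts=(".jpg", ".jpeg", ".png")):
    """Return (total_images, images_without_label) for a dataset folder.

    Assumed layout (hypothetical): <data_folder>/image/<name>.jpg with a
    matching <data_folder>/label/<name>.txt.
    """
    image_root = os.path.join(data_folder, image_dir)
    label_root = os.path.join(data_folder, label_dir)
    total = 0
    missing = []
    for parent, _dirs, files in os.walk(image_root):
        for name in sorted(files):
            stem, ext = os.path.splitext(name)
            if ext.lower() not in exts:
                continue  # skip non-image files
            total += 1
            label_path = os.path.join(label_root, stem + ".txt")
            if not os.path.isfile(label_path):
                missing.append(os.path.join(parent, name))
    return total, missing


if __name__ == "__main__":
    # Dataset path taken from the log output in this thread.
    total, missing = find_unlabelled_images("data/dataset/mlt/")
    print("{} images found, {} without a label file".format(total, len(missing)))
```

If every image is reported as unlabelled, or no images are found at all, the dataset was probably not generated where the training script expects it.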

hcnhatnam commented 5 years ago

Thank you @eragonruan. I resolved the problem.

guddulrk commented 5 years ago

Hi @eragonruan, I am trying to run the new ctpn, but when I run the data generation step, it fails with "too many values to unpack (expected 4)". My dataset is in the following format: x1,y1,x2,y2,x3,y3,x4,y4,txt_transcription. Should I convert the dataset to 4 rectangle values?

Thanks

guddulrk commented 5 years ago

line = line.strip().split(",")
x_min, y_min, x_max, y_max = map(int, line)
bbox.append([x_min, y_min, x_max, y_max, 1])

Does x_min = x1, y_min = y1, x_max = x2, and y_max = y2?

Thanks

eragonruan commented 5 years ago

@guddulrk Hi, you should run split_label.py first to prepare the dataset. The input of split_label.py is x1,y1,x2,y2,x3,y3,x4,y4, and the output is text segments labelled as xmin,ymin,xmax,ymax. Then you can start the training process.
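
To illustrate why the four-value unpack fails on a 9-field line, and how the eight corner coordinates relate to xmin,ymin,xmax,ymax, here is a minimal sketch. It is not split_label.py (which, per the reply above, produces text segments); `quad_to_bbox` is a hypothetical helper that only computes the axis-aligned box enclosing one quadrilateral.

```python
def quad_to_bbox(line):
    """Convert 'x1,y1,x2,y2,x3,y3,x4,y4[,transcription]' to
    (x_min, y_min, x_max, y_max)."""
    # Split at most 8 times so commas inside the transcription are not
    # mistaken for coordinate separators.
    fields = line.strip().split(",", 8)
    if len(fields) < 8:
        raise ValueError("expected at least 8 coordinates, got %d" % len(fields))
    coords = [int(float(v)) for v in fields[:8]]
    xs = coords[0::2]  # x1, x2, x3, x4
    ys = coords[1::2]  # y1, y2, y3, y4
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    print(quad_to_bbox("10,20,110,22,108,60,8,58,hello, world"))
```

Note that for a tilted quadrilateral x_min is the smallest of the four x values rather than x1 specifically, and likewise for the other three bounds.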

atal-manuja commented 5 years ago

@eragonruan I think CPU users who don't have much processing power should reduce save_checkpoint_steps in train.py at line 24. A slow processor takes more time to execute one step, so reducing the checkpoint steps makes it easier to see training progress instead of waiting for minutes or hours. Hope people find this helpful!

thograce commented 5 years ago

"Thank you @eragonruan. I resolved the problem." @hcnhatnam, could you tell me how you solved this problem?

simplify23 commented 5 years ago

I ran train.py but got an error. The checkpoint_mlt/ folder was downloaded from the README. Can anyone help me?

The error is below:

Traceback (most recent call last):
  File "/home/simplify/OCR/work?_ctpn/main/train.py", line 118, in <module>
    tf.app.run()
  File "/home/simplify/OCR/work?_ctpn/venv/lib/python3.7/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/home/simplify/OCR/work?_ctpn/main/train.py", line 81, in main
    restore_step = int(ckpt.split('.')[0].split('_')[-1])
ValueError: invalid literal for int() with base 10: ''

qingqing625 commented 4 years ago

@simplify23 I have the same problem. Could you tell me how to solve it? Thank you.

sevany commented 4 years ago

Hi, please help: I ran train.py but got this error.

restore_step = int(ckpt.split('.')[0].split('_')[-1])
ValueError: invalid literal for int() with base 10: ''

prabhakar-sivanesan commented 4 years ago

@sevany If you're training from scratch, change tf.app.flags.DEFINE_boolean('restore', True, '') to tf.app.flags.DEFINE_boolean('restore', False, '') in train.py.
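
For those who do want to resume from a checkpoint, the ValueError comes from parsing the step number out of the checkpoint path. Assuming the expression quoted above splits on '.' first and then on an underscore, any path containing an earlier dot (for example one starting with "./") yields an empty string before int() is called. The sketch below shows a more tolerant way to extract the step; `parse_restore_step` is a hypothetical helper, and the `ctpn_50000.ckpt` file name is an assumed example, not taken from this thread.

```python
import os
import re


def parse_restore_step(ckpt_path):
    """Return the trailing step number of a checkpoint path, or None.

    Works on the file name only, so dots or underscores in parent
    directories do not affect the result.
    """
    if not ckpt_path:
        # e.g. no checkpoint was found and the path is None
        return None
    name = os.path.basename(ckpt_path)
    match = re.search(r"_(\d+)\.ckpt", name)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    path = "./checkpoints_mlt/ctpn_50000.ckpt"  # assumed example path
    # The leading "./" makes split('.')[0] an empty string, so the
    # split-based expression ends up calling int('').
    print(repr(path.split('.')[0]))
    print(parse_restore_step(path))
```

When the helper returns None, the training script could fall back to step 0 or stop with a clear message instead of failing inside int().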