zhang0jhon / AttentionOCR

Scene text recognition

Something Wrong When Running "python train.py" #68

Open Tian14267 opened 4 years ago

Tian14267 commented 4 years ago

I get an error when I run "python train.py":

Number of trainable variables: 241
Number of parameters (elements): 34191804
Storage space needed for all trainable variables: 130.43MB
[0427 14:42:35 @base.py:207] Setup callbacks graph ...
[0427 14:42:39 @prof.py:291] [HostMemoryTracker] Free RAM in setup_graph() is 55.66 GB.
[0427 14:42:39 @argtools.py:138] WRN Starting a process with 'fork' method is not safe and may consume unnecessary extra CPU memory. Use 'forkserver' or 'spawn' method (available after Py3.4) instead if you run into any issues. See https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods on how to set them.
[0427 14:42:39 @summary.py:47] [MovingAverageSummary] 0 operations in collection 'MOVING_SUMMARY_OPS' will be run with session hooks.
[0427 14:42:39 @summary.py:94] Summarizing collection 'summaries' of size 288.
[0427 14:42:39 @graph.py:99] Applying collection UPDATE_OPS of 226 ops.
[0427 14:42:39 @sessinit.py:134] Variable global_step:0 in the graph will not be loaded from the checkpoint!
[0427 14:42:39 @sessinit.py:87] WRN The following variables are in the graph, but not found in the checkpoint: InceptionV4/attention_lstm/word_embedding/W_wemb, InceptionV4/attention_lstm/feature_map_attention/init_mean/W_init_c, InceptionV4/attention_lstm/feature_map_attention/init_mean/W_init_h, InceptionV4/attention_lstm/feature_map_attention/attention_x/W, InceptionV4/attention_lstm/feature_map_attention/attention_h/W_h, InceptionV4/attention_lstm/feature_map_attention/att/W_att, InceptionV4/attention_lstm/feature_map_attention/att/b_att, InceptionV4/attention_lstm/softmax/softmax_w, InceptionV4/attention_lstm/softmax/softmax_b, InceptionV4/attention_lstm/attention_to_embedding/W_attention_wemb, InceptionV4/attention_lstm/attention_to_embedding/W_hidden_wemd, InceptionV4/attention_lstm/lstm_cell/lstm_W, InceptionV4/attention_lstm/lstm_cell/lstm_U, InceptionV4/attention_lstm/lstm_cell/lstm_Z, InceptionV4/attention_lstm/lstm_cell/lstm_b, learning_rate
[0427 14:42:39 @sessinit.py:87] WRN The following variables are in the checkpoint, but not found in the graph: InceptionV4/AuxLogits/Aux_logits/biases, InceptionV4/AuxLogits/Aux_logits/weights, InceptionV4/AuxLogits/Conv2d_1b_1x1/BatchNorm/beta, Ince ...
...
...
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'MaxBytesInUse' used by node GPUMemoryTracker/MaxBytesInUse (defined at /home/hj/.pyenv/versions/fffan-env/lib/python3.6/site-packages/tensorpack/callbacks/prof.py:261) with these attrs: []
Registered devices: [CPU, XLA_CPU, XLA_GPU]
Registered kernels: device='GPU'

 [[GPUMemoryTracker/MaxBytesInUse]]

terminate called without an active exception
terminate called recursively
Received signal 6
BEGIN MANGLED STACK TRACE
terminate called recursively
Aborted (core dumped)

Do you know how to solve this problem?

zhang0jhon commented 4 years ago

It seems to be a tensorpack problem. Maybe you can comment out the "GPUUtilizationTracker()" callback in train.py.
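For reference, a minimal sketch of what that might look like, assuming train.py registers a tensorpack callbacks list in the usual way (the other entries here, such as ModelSaver(), are only placeholders and may not match the repo's actual code):

```python
# Sketch only, not the repo's actual train.py: a typical tensorpack callbacks
# list with the GPU tracker callbacks commented out. ModelSaver() stands in
# for whatever other callbacks train.py registers.
from tensorpack.callbacks import ModelSaver, GPUUtilizationTracker

callbacks = [
    ModelSaver(),
    # GPUUtilizationTracker(),  # comment this out as suggested above
    # GPUMemoryTracker(),       # the 'MaxBytesInUse' op in the traceback is
    #                           # created by this tracker (tensorpack prof.py),
    #                           # so it may need commenting out too if present
]
```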