sahil210695 opened this issue 7 years ago
Is this an error or just a warning?
And is this when running using a GPU or without a GPU?
After this the training stopped, so I'll take that as an error. This is while running without a GPU on a MacBook Pro with macOS Sierra.
Maybe just need to update tflearn? This looks fixed upstream: https://github.com/tflearn/tflearn/issues/523
@jmontrose @andrewljohnson Can you explain a bit more? Because in the Docker image, the TensorFlow version is 0.8.
@sahil210695 Are you pointing out that the issue in tflearn#523 seems to come from TF 0.12.0 compatibility? That's a good point. I found that issue while trying to track down the source of the message itself (meaning to dump a stack trace). I'll keep looking, and if you have any ideas I can help test, send them over! Also, although I do get the same error reported by @andrewljohnson, I see that I have model pickles after the failure, so it might not be completely fatal:
root@d6578b22ecc5:/DeepOSM# ls -lha data/generated/
total 8.2M
drwxr-xr-x 11 root root 374 Feb 20 00:08 .
drwxr-xr-x 5 root root 170 Feb 19 22:37 ..
-rw-r--r-- 1 root root 129 Feb 20 00:08 checkpoint
-rw-r--r-- 1 root root 8.1M Feb 20 00:08 model.pickle
-rw-r--r-- 1 root root 178K Feb 20 00:08 model.pickle.meta
-rw-r--r-- 1 root root 105 Feb 20 00:08 model_metadata.pickle
-rw-r--r-- 1 root root 384 Feb 19 22:44 raster_data_paths.pickle
drwxr-xr-x 55252 root root 1.8M Feb 19 22:55 training_images
drwxr-xr-x 55252 root root 1.8M Feb 19 22:55 training_labels
-rw-r--r-- 1 root root 88 Feb 19 22:55 training_metadata.pickle
drwxr-xr-x 4 root root 136 Feb 19 22:37 way_bitmaps
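A quick way to confirm that a run left usable artifacts behind despite the warnings is to check for the expected files in `data/generated/`. This is a hypothetical helper, not part of DeepOSM; the file names are taken from the listing above:

```python
import os

# Artifact files a successful run is expected to leave in data/generated/
# (names taken from the directory listing above).
EXPECTED = [
    "checkpoint",
    "model.pickle",
    "model.pickle.meta",
    "model_metadata.pickle",
    "training_metadata.pickle",
]

def missing_artifacts(generated_dir):
    """Return the expected artifact files that are absent from generated_dir."""
    return [f for f in EXPECTED
            if not os.path.isfile(os.path.join(generated_dir, f))]
```

If `missing_artifacts("data/generated")` comes back empty, the model was saved and the serialization warnings were non-fatal for that run.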
But I'm not getting it. I'm still stuck there, and no results are generated after that either. Were you able to get results, @jmontrose? Docker previously came with TensorFlow 0.8.0; I've now upgraded it to 0.10, and TFLearn is 0.1. This is mine:
drwxr-xr-x    12 sahilkumar staff  408B Feb 20 21:34 .
drwxr-xr-x     6 sahilkumar staff  204B Feb 12 22:20 ..
-rw-r--r--@    1 sahilkumar staff  8.0K Feb 20 12:18 .DS_Store
-rw-r--r--     1 sahilkumar staff  129B Feb 20 21:34 checkpoint
-rw-r--r--     1 sahilkumar staff  8.0M Feb 20 21:34 model.pickle
-rw-r--r--     1 sahilkumar staff  177K Feb 20 21:34 model.pickle.meta
-rw-r--r--     1 sahilkumar staff  105B Feb 20 21:34 model_metadata.pickle
-rw-r--r--     1 sahilkumar staff  384B Feb 12 22:26 raster_data_paths.pickle
drwxr-xr-x 55252 sahilkumar staff  1.8M Feb 12 22:36 training_images
drwxr-xr-x 55252 sahilkumar staff  1.8M Feb 12 22:36 training_labels
-rw-r--r--     1 sahilkumar staff   88B Feb 12 22:36 training_metadata.pickle
drwxr-xr-x     5 sahilkumar staff  170B Feb 12 22:32 way_bitmaps
@andrewljohnson I'm not getting any output except these warnings. Could you please tell me how to check for output?
Here is my data/way_bitmaps dir. It looks like it's failing before it renders the jpegs?
bash-3.2$ find way_bitmaps/
way_bitmaps/
way_bitmaps//38075
way_bitmaps//38075/m_3807503_ne_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps//38075/m_3807503_nw_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps//38075/m_3807503_se_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps//38075/m_3807503_sw_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps//38075/m_3807504_ne_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps//38075/m_3807504_nw_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps//39075
@jmontrose Here is my data/way_bitmaps dir.
root@4881763ee755:/DeepOSM/data/generated# find way_bitmaps/
way_bitmaps/
way_bitmaps/.DS_Store
way_bitmaps/38075
way_bitmaps/38075/m_3807503_ne_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps/38075/m_3807503_nw_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps/38075/m_3807503_se_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps/38075/m_3807503_sw_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps/38075/m_3807504_ne_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps/38075/m_3807504_nw_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps/39075
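To see how far the pipeline got before stopping, one could count the rendered `.npy` way-bitmap files per quadrangle directory. This is a sketch, not part of DeepOSM; the directory layout is assumed from the `find` listings above:

```python
import os

def count_bitmaps(way_bitmaps_dir):
    """Count *.npy way-bitmap files under each quadrangle subdirectory."""
    counts = {}
    for quad in sorted(os.listdir(way_bitmaps_dir)):
        quad_path = os.path.join(way_bitmaps_dir, quad)
        if not os.path.isdir(quad_path):
            continue  # skip stray files such as .DS_Store
        counts[quad] = sum(1 for f in os.listdir(quad_path)
                           if f.endswith(".npy"))
    return counts
```

In the listings above, `count_bitmaps("way_bitmaps")` would show 38075 partially populated and 39075 empty, consistent with the run stopping partway through bitmap generation.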
I hit this issue too, running the container on CentOS 7 Linux with no GPU. Since I'm new to TFLearn and TensorFlow, I have no clue about this; any progress for you guys? I'm looking into the Python code for more information myself. It seems like some data-format and library-version compatibility issue to me, but I'm not sure if this is the right direction.
@andrewjavao No success so far. I'm also looking into the Python code; let's see where we can get. It's an issue between TFLearn and TensorFlow.
@andrewljohnson We've been stuck on this issue for weeks. Have you managed to figure it out?
I think this should be fixed, but it's just a warning.
The more relevant error is #79
Getting this while running train_neural_net.py:
WARNING:tensorflow:Error encountered when serializing data_augmentation. Type is unsupported, or the types of the items don't match field type in CollectionDef. 'NoneType' object has no attribute 'name'
WARNING:tensorflow:Error encountered when serializing summary_tags. Type is unsupported, or the types of the items don't match field type in CollectionDef. 'dict' object has no attribute 'name'
WARNING:tensorflow:Error encountered when serializing data_preprocessing. Type is unsupported, or the types of the items don't match field type in CollectionDef. 'NoneType' object has no attribute 'name'
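For context on what the message means: when TensorFlow exports the graph, it tries to serialize every item in each graph collection and expects each item to expose a `.name`. TFLearn stores plain Python objects (`None`, a `dict`) in its `data_augmentation`, `summary_tags`, and `data_preprocessing` collections, so serializing those entries fails with an `AttributeError`, which TensorFlow logs as a warning and skips. A much-simplified sketch of that behavior (not TensorFlow's actual code; `FakeOp` is a stand-in for a graph op):

```python
import logging

def serialize_collections(collections):
    """Simplified sketch of graph-collection serialization: entries whose
    items lack a .name attribute trigger a warning and are skipped, so
    the save as a whole still succeeds."""
    saved = {}
    for key, items in collections.items():
        try:
            saved[key] = [item.name for item in items]
        except AttributeError as err:
            # Mirrors the shape of the real warning seen above.
            logging.warning(
                "Error encountered when serializing %s. %s", key, err)
    return saved

class FakeOp:
    """Stand-in for a graph op that does have a name."""
    def __init__(self, name):
        self.name = name

collections = {
    "trainable_variables": [FakeOp("w:0"), FakeOp("b:0")],
    "data_augmentation": [None],  # 'NoneType' object has no attribute 'name'
    "summary_tags": [{}],         # 'dict' object has no attribute 'name'
}
print(serialize_collections(collections))
# → {'trainable_variables': ['w:0', 'b:0']}
```

This is consistent with @jmontrose's observation above: the model pickles are still written, because only the unserializable collection entries are dropped.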