trailbehind / DeepOSM

Train a deep learning net with OpenStreetMap features and satellite imagery.

Getting warning while Training neural net #75

Open sahil210695 opened 7 years ago

sahil210695 commented 7 years ago

Getting this while running train_neural_net.py:

WARNING:tensorflow:Error encountered when serializing data_augmentation. Type is unsupported, or the types of the items don't match field type in CollectionDef. 'NoneType' object has no attribute 'name'
WARNING:tensorflow:Error encountered when serializing summary_tags. Type is unsupported, or the types of the items don't match field type in CollectionDef. 'dict' object has no attribute 'name'
WARNING:tensorflow:Error encountered when serializing data_preprocessing. Type is unsupported, or the types of the items don't match field type in CollectionDef. 'NoneType' object has no attribute 'name'
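
For context, a minimal sketch of the kind of code that produces this warning, assuming the TF 1.x-style graph-collection APIs in use at the time (tflearn appears to store its data_preprocessing/data_augmentation objects and the summary_tags dict in graph collections, and values like None or a plain dict cannot be serialized into a CollectionDef):

import tensorflow as tf

# Values that the meta-graph exporter cannot serialize into a CollectionDef:
tf.add_to_collection('data_augmentation', None)            # -> 'NoneType' object has no attribute 'name'
tf.add_to_collection('summary_tags', {'Accuracy': None})   # -> 'dict' object has no attribute 'name'

# Exporting the default graph (which happens when the model is saved) logs the
# "Error encountered when serializing ..." warnings instead of raising.
tf.train.export_meta_graph(filename='/tmp/example.meta')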

andrewljohnson commented 7 years ago

Is this an error or just a warning?

And is this when running using a GPU or without a GPU?

sahil210695 commented 7 years ago

After this the training stopped, so I'll take that as an error. This is while running without a GPU on a MacBook Pro with macOS Sierra.

jmontrose commented 7 years ago

Maybe just need to update tflearn? This looks fixed upstream: https://github.com/tflearn/tflearn/issues/523

sahil210695 commented 7 years ago

@jmontrose @andrewljohnson can you explain a bit more? The TensorFlow version in the Docker image is 0.8.
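
A quick way to confirm which versions the container actually ends up with before deciding whether the tflearn fix applies (a small sketch; both packages expose a __version__ attribute):

import tensorflow as tf
import tflearn

# Print the installed versions so they can be compared against the
# tensorflow/tflearn combination the upstream fix targets.
print("tensorflow:", tf.__version__)
print("tflearn:   ", tflearn.__version__)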

jmontrose commented 7 years ago

@sahil210695 are you pointing out that the issue in tflearn#523 seems to come from TF 0.12.0 compatibility? That's a good point. I found that issue while trying to track down the source of the message itself (meaning to dump a stack trace). I'll keep looking, and if you have any ideas I can help test, send them over! Oh, and although I do get the same error reported by @andrewljohnson, I do see that I have model pickles after the failure, so it might not be completely fatal:

root@d6578b22ecc5:/DeepOSM# ls -lha data/generated/
total 8.2M
drwxr-xr-x    11 root root  374 Feb 20 00:08 .
drwxr-xr-x     5 root root  170 Feb 19 22:37 ..
-rw-r--r--     1 root root  129 Feb 20 00:08 checkpoint
-rw-r--r--     1 root root 8.1M Feb 20 00:08 model.pickle
-rw-r--r--     1 root root 178K Feb 20 00:08 model.pickle.meta
-rw-r--r--     1 root root  105 Feb 20 00:08 model_metadata.pickle
-rw-r--r--     1 root root  384 Feb 19 22:44 raster_data_paths.pickle
drwxr-xr-x 55252 root root 1.8M Feb 19 22:55 training_images
drwxr-xr-x 55252 root root 1.8M Feb 19 22:55 training_labels
-rw-r--r--     1 root root   88 Feb 19 22:55 training_metadata.pickle
drwxr-xr-x     4 root root  136 Feb 19 22:37 way_bitmaps

sahil210695 commented 7 years ago

But I'm still not getting it. I'm stuck there and no results are generated after that either. Were you able to get results, @jmontrose? Previously the Docker image came with TensorFlow 0.8.0; I've now upgraded it to 0.10, and TFLearn is 0.1. This is mine:

drwxr-xr-x    12 sahilkumar staff 408B Feb 20 21:34 .
drwxr-xr-x     6 sahilkumar staff 204B Feb 12 22:20 ..
-rw-r--r--@    1 sahilkumar staff 8.0K Feb 20 12:18 .DS_Store
-rw-r--r--     1 sahilkumar staff 129B Feb 20 21:34 checkpoint
-rw-r--r--     1 sahilkumar staff 8.0M Feb 20 21:34 model.pickle
-rw-r--r--     1 sahilkumar staff 177K Feb 20 21:34 model.pickle.meta
-rw-r--r--     1 sahilkumar staff 105B Feb 20 21:34 model_metadata.pickle
-rw-r--r--     1 sahilkumar staff 384B Feb 12 22:26 raster_data_paths.pickle
drwxr-xr-x 55252 sahilkumar staff 1.8M Feb 12 22:36 training_images
drwxr-xr-x 55252 sahilkumar staff 1.8M Feb 12 22:36 training_labels
-rw-r--r--     1 sahilkumar staff  88B Feb 12 22:36 training_metadata.pickle
drwxr-xr-x     5 sahilkumar staff 170B Feb 12 22:32 way_bitmaps

sahil210695 commented 7 years ago

@andrewljohnson I'm not getting any output except these warnings. Could you please tell me how to check for output?
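
One way to check is simply to see which of the expected artifacts exist on disk. This sketch only looks for the filenames that appear in the data/generated/ listings posted in this thread, so the names come from the thread rather than from the DeepOSM docs:

import os

GENERATED = "data/generated"
EXPECTED = [
    "checkpoint",
    "model.pickle",
    "model.pickle.meta",
    "model_metadata.pickle",
    "training_metadata.pickle",
]

# Report which of the expected training outputs were actually written.
for name in EXPECTED:
    path = os.path.join(GENERATED, name)
    print(path, "exists" if os.path.exists(path) else "MISSING")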

jmontrose commented 7 years ago

Here is my data/way_bitmaps dir. Looks like it's failing before it renders the jpegs?

bash-3.2$ find way_bitmaps/
way_bitmaps/
way_bitmaps//38075
way_bitmaps//38075/m_3807503_ne_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps//38075/m_3807503_nw_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps//38075/m_3807503_se_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps//38075/m_3807503_sw_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps//38075/m_3807504_ne_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps//38075/m_3807504_nw_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps//39075

sahil210695 commented 7 years ago

@jmontrose Here is my data/way_bitmaps dir.

root@4881763ee755:/DeepOSM/data/generated# find way_bitmaps/
way_bitmaps/
way_bitmaps/.DS_Store
way_bitmaps/38075
way_bitmaps/38075/m_3807503_ne_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps/38075/m_3807503_nw_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps/38075/m_3807503_se_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps/38075/m_3807503_sw_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps/38075/m_3807504_ne_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps/38075/m_3807504_nw_18_1_20130907.tif-ways.bitmap.npy
way_bitmaps/39075

andrewjavao commented 7 years ago

I hit this issue too, running the container on CentOS 7 Linux with no GPU. Since I'm new to TFLearn and TensorFlow, I have no clue about this; any progress for you guys? I'm looking into the Python code here for more information myself. It seems like some data format and library version compatibility issue to me, but I'm not quite sure if this is the right direction.

sahil210695 commented 7 years ago

@andrewjavao No success so far. I am also looking into the Python code; let's see where we can get. It's an issue between TFLearn and TensorFlow.

andrewjavao commented 7 years ago

@andrewljohnson We've been stuck on this issue for weeks. Have you somehow figured it out?

andrewljohnson commented 7 years ago

I think this should be fixed, but it's just a warning.
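
If it really is only a warning, it can also be silenced during training; a minimal sketch, assuming a TensorFlow version that still exposes the tf.logging module:

import tensorflow as tf

# Suppress WARNING-level TensorFlow log messages (errors are still shown).
tf.logging.set_verbosity(tf.logging.ERROR)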

The more relevant error is #79