yifita closed this issue 6 years ago
Try upgrading tensorpack with pip install --upgrade git+https://github.com/tensorpack/tensorpack.git
No, it still doesn't run through:
$ python train.py --lmdb_train ~/Downloads/train.lmdb --lmdb_valid ~/Downloads/valid.lmdb
[0809 10:27:50 @format.py:92] Found 231792 entries in /home/ywang/Downloads/train.lmdb
[0809 10:27:50 @develop.py:94] WRN [Deprecated] LMDBDataPoint will be deprecated after 31 Jan. Use LMDBSerializer.load() instead!
[0809 10:27:50 @parallel.py:290] [PrefetchDataZMQ] Will fork a dataflow more than one times. This assumes the datapoints are i.i.d.
[0809 10:27:50 @format.py:92] Found 800 entries in /home/ywang/Downloads/valid.lmdb
[0809 10:27:50 @develop.py:94] WRN [Deprecated] LMDBDataPoint will be deprecated after 31 Jan. Use LMDBSerializer.load() instead!
Process _Worker-2:
Process _Worker-3:
Process _Worker-4:
Process _Worker-5:
Traceback (most recent call last):
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
self.run()
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 266, in run
for dp in self.ds.get_data():
File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
for id, input, gt in self.ds.get_data():
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
ret = self.func(copy(dp)) # shallow copy the list
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
return loads(dp[1])
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
return pa.deserialize(buf)
File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
Traceback (most recent call last):
File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
self.run()
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 266, in run
for dp in self.ds.get_data():
pyarrow.lib.ArrowIOError: Expected to be able to read 196608 bytes for message body, got 196576
File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
for id, input, gt in self.ds.get_data():
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
ret = self.func(copy(dp)) # shallow copy the list
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
return loads(dp[1])
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
return pa.deserialize(buf)
File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Expected to be able to read 196608 bytes for message body, got 196576
Process _Worker-6:
Traceback (most recent call last):
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
self.run()
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 266, in run
for dp in self.ds.get_data():
File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
for id, input, gt in self.ds.get_data():
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
ret = self.func(copy(dp)) # shallow copy the list
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
return loads(dp[1])
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
return pa.deserialize(buf)
File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Invalid flatbuffers message.
Traceback (most recent call last):
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
self.run()
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 266, in run
for dp in self.ds.get_data():
File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
for id, input, gt in self.ds.get_data():
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
ret = self.func(copy(dp)) # shallow copy the list
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
return loads(dp[1])
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
return pa.deserialize(buf)
File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Expected to be able to read 196608 bytes for message body, got 196576
Process _Worker-7:
Process _Worker-8:
Traceback (most recent call last):
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
self.run()
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 266, in run
for dp in self.ds.get_data():
File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
for id, input, gt in self.ds.get_data():
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
ret = self.func(copy(dp)) # shallow copy the list
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
return loads(dp[1])
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
return pa.deserialize(buf)
File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Invalid flatbuffers message.
Process _Worker-9:
Traceback (most recent call last):
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
self.run()
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 266, in run
for dp in self.ds.get_data():
File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
for id, input, gt in self.ds.get_data():
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
ret = self.func(copy(dp)) # shallow copy the list
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
return loads(dp[1])
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
return pa.deserialize(buf)
File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Invalid flatbuffers message.
Traceback (most recent call last):
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
self.run()
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 266, in run
for dp in self.ds.get_data():
File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
for id, input, gt in self.ds.get_data():
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
ret = self.func(copy(dp)) # shallow copy the list
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
return loads(dp[1])
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
return pa.deserialize(buf)
File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Invalid flatbuffers message.
Traceback (most recent call last):
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
self.run()
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 266, in run
for dp in self.ds.get_data():
File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
for id, input, gt in self.ds.get_data():
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
ret = self.func(copy(dp)) # shallow copy the list
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
return loads(dp[1])
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
return pa.deserialize(buf)
File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Invalid flatbuffers message.
2018-08-09 10:27:51.425688: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-08-09 10:27:51.584502: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:0a:00.0
totalMemory: 10.92GiB freeMemory: 10.75GiB
2018-08-09 10:27:51.694632: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-08-09 10:27:51.695282: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 1 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:41:00.0
totalMemory: 10.91GiB freeMemory: 9.18GiB
2018-08-09 10:27:51.696216: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0, 1
2018-08-09 10:27:52.097552: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-09 10:27:52.097591: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0 1
2018-08-09 10:27:52.097600: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N Y
2018-08-09 10:27:52.097613: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 1: Y N
2018-08-09 10:27:52.098034: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10400 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:0a:00.0, compute capability: 6.1)
2018-08-09 10:27:52.098656: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 8878 MB memory) -> physical GPU (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:41:00.0, compute capability: 6.1)
log/pcn_cd exists. Delete? [y (or enter)/N]n
Total time 0:00:00.026979
Traceback (most recent call last):
File "train.py", line 159, in <module>
train(args)
File "train.py", line 133, in train
coord.join(threads)
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
six.reraise(*self._exc_info_to_raise)
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/six.py", line 693, in reraise
raise value
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorflow/python/estimator/inputs/queues/feeding_queue_runner.py", line 93, in _run
feed_dict = None if feed_fn is None else feed_fn()
File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 51, in <lambda>
feed_fn = lambda: {placeholder: value for placeholder, value in zip(placeholders, next(generator))}
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 339, in get_data
for dp in self.ds.get_data():
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 116, in get_data
for data in self.ds.get_data():
File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
for id, input, gt in self.ds.get_data():
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
ret = self.func(copy(dp)) # shallow copy the list
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
return loads(dp[1])
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
return pa.deserialize(buf)
File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Expected to be able to read 196608 bytes for message body, got 196576
[0809 10:30:01 @parallel.py:77] [PrefetchDataZMQ] Context terminated.
PrefetchDataZMQ successfully cleaned-up.
Exception in thread Thread-2:
Traceback (most recent call last):
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 75, in _zmq_catch_error
yield
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 308, in get_data
yield self._recv()
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 298, in _recv
return loads(self.socket.recv(copy=False))
File "zmq/backend/cython/socket.pyx", line 792, in zmq.backend.cython.socket.Socket.recv
File "zmq/backend/cython/socket.pyx", line 830, in zmq.backend.cython.socket.Socket.recv
File "zmq/backend/cython/socket.pyx", line 172, in zmq.backend.cython.socket._recv_frame
File "zmq/backend/cython/checkrc.pxd", line 22, in zmq.backend.cython.checkrc._check_rc
zmq.error.ContextTerminated: Context was terminated
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/threading.py", line 914, in _bootstrap_inner
self.run()
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/threading.py", line 862, in run
self._target(*self._args, **self._kwargs)
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorflow/python/estimator/inputs/queues/feeding_queue_runner.py", line 93, in _run
feed_dict = None if feed_fn is None else feed_fn()
File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 51, in <lambda>
feed_fn = lambda: {placeholder: value for placeholder, value in zip(placeholders, next(generator))}
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 339, in get_data
for dp in self.ds.get_data():
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 116, in get_data
for data in self.ds.get_data():
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 308, in get_data
yield self._recv()
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/contextlib.py", line 77, in __exit__
self.gen.throw(type, value, traceback)
File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 78, in _zmq_catch_error
raise DataFlowTerminated()
tensorpack.dataflow.base.DataFlowTerminated
PrefetchDataZMQ successfully cleaned-up.
What is your pyarrow version? I have 0.9.0.
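A quick way to report the versions in question is sketched below. It assumes Python 3.8 or newer for `importlib.metadata`; on the Python 3.5 environment shown in the log, printing `pyarrow.__version__` should give the same information. The helper name `installed_version` is made up for this example.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Packages that appear in the traceback above.
    for name in ("pyarrow", "tensorpack", "lmdb"):
        print(name, installed_version(name))
```

Posting this output next to the traceback makes it easier to compare the reader's environment with the one that wrote the LMDB files.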
See #3.
Seeing a similar error. Not using tensor though...
INFO:tensorflow:Error reported to Coordinator: <class 'pyarrow.lib.ArrowIOError'>, Expected to be able to read 196608 bytes for message body, got 196576
Hi, this error happens for both valid.lmdb and train.lmdb.
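To narrow down whether every entry fails or only some, the records can be decoded one at a time outside the training script. The following is only a sketch: `find_bad_records` and `iter_lmdb` are hypothetical helpers, and the decoder is passed in as a parameter. For the setup in this thread the decoder would presumably be the `pa.deserialize` call that appears in the traceback, but that depends on the installed pyarrow version. The demo at the bottom uses `pickle` so that it runs without pyarrow or lmdb installed.

```python
import pickle


def find_bad_records(items, decode):
    """Return (key, error message) for every record that `decode` rejects."""
    bad = []
    for key, value in items:
        try:
            decode(value)
        except Exception as exc:  # broad on purpose: any failure marks the record as bad
            bad.append((key, "{}: {}".format(type(exc).__name__, exc)))
    return bad


def iter_lmdb(path):
    """Yield (key, value) byte pairs from an LMDB file; needs the `lmdb` package."""
    import os
    import lmdb  # imported lazily so find_bad_records works without it

    env = lmdb.open(path, subdir=os.path.isdir(path), readonly=True, lock=False)
    try:
        with env.begin() as txn:
            for key, value in txn.cursor():
                yield key, bytes(value)
    finally:
        env.close()


if __name__ == "__main__":
    # Stand-in data: one intact record and one truncated record.
    good = pickle.dumps([1, 2, 3])
    records = [(b"0", good), (b"1", good[:-4])]
    print(find_bad_records(records, pickle.loads))
```

With the real files, the call would look like `find_bad_records(iter_lmdb("train.lmdb"), decoder)`, where the path and `decoder` are whatever matches the local setup. This only locates failing entries; it does not explain why they fail.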