wentaoyuan / pcn

Code for PCN: Point Completion Network in 3DV'18 (Oral)
https://wentaoyuan.github.io/pcn/
MIT License

shapenet lmdb loading error: INFO:tensorflow:Error reported to Coordinator: <class 'pyarrow.lib.ArrowIOError'>, Expected to be able to read 196608 bytes for message body, got 196576 #2

Closed (yifita closed 6 years ago)

yifita commented 6 years ago

INFO:tensorflow:Error reported to Coordinator: <class 'pyarrow.lib.ArrowIOError'>, Expected to be able to read 196608 bytes for message body, got 196576

Hi, this error happens for both valid.lmdb and train.lmdb.
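One way to narrow this down is to run the deserializer over the records outside the training pipeline and list which ones fail. The following is only a minimal sketch: `find_bad_records` is a hypothetical helper, not part of pcn or tensorpack, and `pickle` stands in for the real deserializer so the example runs on its own. In practice the `(key, payload)` pairs would presumably come from the LMDB file, and `loads` would be whatever deserializer the data was written with.

```python
import pickle
from typing import Callable, Iterable, List, Tuple


def find_bad_records(
    records: Iterable[Tuple[bytes, bytes]],
    loads: Callable[[bytes], object],
) -> List[Tuple[bytes, int, str]]:
    """Return (key, payload_size, error) for each record that fails to deserialize.

    Hypothetical helper for illustration only.
    """
    bad = []
    for key, payload in records:
        try:
            loads(payload)
        except Exception as exc:  # broad on purpose: every failure is reported
            bad.append((key, len(payload), "{}: {}".format(type(exc).__name__, exc)))
    return bad


if __name__ == "__main__":
    # Stand-in data: one intact payload and one truncated copy of it.
    good = pickle.dumps([1, 2, 3])
    truncated = good[:-4]
    sample = [(b"0", good), (b"1", truncated)]
    for key, size, error in find_bad_records(sample, pickle.loads):
        print(key, size, error)
```

If every record fails in the same way, a version mismatch between the writer and the reader seems more likely than a damaged download; if only a few fail, the file itself is the more likely suspect.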

wentaoyuan commented 6 years ago

Try upgrading tensorpack with pip install --upgrade git+https://github.com/tensorpack/tensorpack.git

yifita commented 6 years ago

No, it still doesn't run through:

$ python train.py --lmdb_train ~/Downloads/train.lmdb --lmdb_valid ~/Downloads/valid.lmdb
[0809 10:27:50 @format.py:92] Found 231792 entries in /home/ywang/Downloads/train.lmdb
[0809 10:27:50 @develop.py:94] WRN [Deprecated] LMDBDataPoint will be deprecated after 31 Jan. Use LMDBSerializer.load() instead!
[0809 10:27:50 @parallel.py:290] [PrefetchDataZMQ] Will fork a dataflow more than one times. This assumes the datapoints are i.i.d.
[0809 10:27:50 @format.py:92] Found 800 entries in /home/ywang/Downloads/valid.lmdb
[0809 10:27:50 @develop.py:94] WRN [Deprecated] LMDBDataPoint will be deprecated after 31 Jan. Use LMDBSerializer.load() instead!
Process _Worker-2:
Process _Worker-3:
Process _Worker-4:
Process _Worker-5:
Traceback (most recent call last):
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
    self.run()
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 266, in run
    for dp in self.ds.get_data():
  File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
    for id, input, gt in self.ds.get_data():
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
    ret = self.func(copy(dp))  # shallow copy the list
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
    return loads(dp[1])
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
    return pa.deserialize(buf)
  File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
Traceback (most recent call last):
  File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
  File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
  File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
    self.run()
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 266, in run
    for dp in self.ds.get_data():
pyarrow.lib.ArrowIOError: Expected to be able to read 196608 bytes for message body, got 196576
  File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
    for id, input, gt in self.ds.get_data():
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
    ret = self.func(copy(dp))  # shallow copy the list
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
    return loads(dp[1])
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
    return pa.deserialize(buf)
  File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
  File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
  File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
  File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Expected to be able to read 196608 bytes for message body, got 196576
Process _Worker-6:
Traceback (most recent call last):
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
    self.run()
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 266, in run
    for dp in self.ds.get_data():
  File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
    for id, input, gt in self.ds.get_data():
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
    ret = self.func(copy(dp))  # shallow copy the list
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
    return loads(dp[1])
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
    return pa.deserialize(buf)
  File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
  File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
  File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
  File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Invalid flatbuffers message.
Traceback (most recent call last):
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
    self.run()
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 266, in run
    for dp in self.ds.get_data():
  File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
    for id, input, gt in self.ds.get_data():
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
    ret = self.func(copy(dp))  # shallow copy the list
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
    return loads(dp[1])
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
    return pa.deserialize(buf)
  File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
  File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
  File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
  File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Expected to be able to read 196608 bytes for message body, got 196576
Process _Worker-7:
Process _Worker-8:
Traceback (most recent call last):
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
    self.run()
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 266, in run
    for dp in self.ds.get_data():
  File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
    for id, input, gt in self.ds.get_data():
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
    ret = self.func(copy(dp))  # shallow copy the list
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
    return loads(dp[1])
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
    return pa.deserialize(buf)
  File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
  File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
  File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
  File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Invalid flatbuffers message.
Process _Worker-9:
Traceback (most recent call last):
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
    self.run()
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 266, in run
    for dp in self.ds.get_data():
  File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
    for id, input, gt in self.ds.get_data():
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
    ret = self.func(copy(dp))  # shallow copy the list
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
    return loads(dp[1])
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
    return pa.deserialize(buf)
  File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
  File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
  File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
  File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Invalid flatbuffers message.
Traceback (most recent call last):
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
    self.run()
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 266, in run
    for dp in self.ds.get_data():
  File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
    for id, input, gt in self.ds.get_data():
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
    ret = self.func(copy(dp))  # shallow copy the list
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
    return loads(dp[1])
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
    return pa.deserialize(buf)
  File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
  File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
  File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
  File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Invalid flatbuffers message.
Traceback (most recent call last):
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
    self.run()
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 266, in run
    for dp in self.ds.get_data():
  File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
    for id, input, gt in self.ds.get_data():
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
    ret = self.func(copy(dp))  # shallow copy the list
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
    return loads(dp[1])
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
    return pa.deserialize(buf)
  File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
  File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
  File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
  File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Invalid flatbuffers message.
2018-08-09 10:27:51.425688: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-08-09 10:27:51.584502: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:0a:00.0
totalMemory: 10.92GiB freeMemory: 10.75GiB
2018-08-09 10:27:51.694632: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-08-09 10:27:51.695282: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 1 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:41:00.0
totalMemory: 10.91GiB freeMemory: 9.18GiB
2018-08-09 10:27:51.696216: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0, 1
2018-08-09 10:27:52.097552: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-09 10:27:52.097591: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0 1
2018-08-09 10:27:52.097600: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N Y
2018-08-09 10:27:52.097613: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 1:   Y N
2018-08-09 10:27:52.098034: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10400 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:0a:00.0, compute capability: 6.1)
2018-08-09 10:27:52.098656: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 8878 MB memory) -> physical GPU (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:41:00.0, compute capability: 6.1)
log/pcn_cd exists. Delete? [y (or enter)/N]n
Total time 0:00:00.026979

Traceback (most recent call last):
  File "train.py", line 159, in <module>
    train(args)
  File "train.py", line 133, in train
    coord.join(threads)
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/six.py", line 693, in reraise
    raise value
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorflow/python/estimator/inputs/queues/feeding_queue_runner.py", line 93, in _run
    feed_dict = None if feed_fn is None else feed_fn()
  File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 51, in <lambda>
    feed_fn = lambda: {placeholder: value for placeholder, value in zip(placeholders, next(generator))}
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 339, in get_data
    for dp in self.ds.get_data():
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 116, in get_data
    for data in self.ds.get_data():
  File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 21, in get_data
    for id, input, gt in self.ds.get_data():
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 275, in get_data
    ret = self.func(copy(dp))  # shallow copy the list
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/format.py", line 178, in f
    return loads(dp[1])
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/utils/serialize.py", line 42, in loads_pyarrow
    return pa.deserialize(buf)
  File "pyarrow/serialization.pxi", line 441, in pyarrow.lib.deserialize
  File "pyarrow/serialization.pxi", line 403, in pyarrow.lib.deserialize_from
  File "pyarrow/serialization.pxi", line 380, in pyarrow.lib.read_serialized
  File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Expected to be able to read 196608 bytes for message body, got 196576
[0809 10:30:01 @parallel.py:77] [PrefetchDataZMQ] Context terminated.
PrefetchDataZMQ successfully cleaned-up.
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 75, in _zmq_catch_error
    yield
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 308, in get_data
    yield self._recv()
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 298, in _recv
    return loads(self.socket.recv(copy=False))
  File "zmq/backend/cython/socket.pyx", line 792, in zmq.backend.cython.socket.Socket.recv
  File "zmq/backend/cython/socket.pyx", line 830, in zmq.backend.cython.socket.Socket.recv
  File "zmq/backend/cython/socket.pyx", line 172, in zmq.backend.cython.socket._recv_frame
  File "zmq/backend/cython/checkrc.pxd", line 22, in zmq.backend.cython.checkrc._check_rc
zmq.error.ContextTerminated: Context was terminated

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorflow/python/estimator/inputs/queues/feeding_queue_runner.py", line 93, in _run
    feed_dict = None if feed_fn is None else feed_fn()
  File "/home/ywang/Documents/points/igl-point-completion/pcn/data_util.py", line 51, in <lambda>
    feed_fn = lambda: {placeholder: value for placeholder, value in zip(placeholders, next(generator))}
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 339, in get_data
    for dp in self.ds.get_data():
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/common.py", line 116, in get_data
    for data in self.ds.get_data():
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 308, in get_data
    yield self._recv()
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/ywang/anaconda3/envs/tf/lib/python3.5/site-packages/tensorpack/dataflow/parallel.py", line 78, in _zmq_catch_error
    raise DataFlowTerminated()
tensorpack.dataflow.base.DataFlowTerminated

PrefetchDataZMQ successfully cleaned-up.

wentaoyuan commented 6 years ago

What is your pyarrow version? I have 0.9.0.
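A quick way to answer this is to print the installed versions of the packages that appear in the traceback. This is a small sketch, assuming Python 3.8 or newer for `importlib.metadata` (the environment in the log above is Python 3.5, where `pip show pyarrow` would be the simpler route). The helper names and the package list are illustrative assumptions; TensorFlow in particular may be installed under a different distribution name.

```python
from importlib import metadata
from typing import Dict, Optional, Sequence


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report_versions(
    packages: Sequence[str] = ("pyarrow", "tensorpack", "tensorflow"),
) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version (None when missing)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, version in report_versions().items():
        print("{}: {}".format(name, version or "not installed"))
```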

wentaoyuan commented 6 years ago

See #3.

dss010101 commented 1 year ago

Seeing a similar error. Not using tensor, though...