Open yuzy007 opened 5 years ago
Did you solve the problem?
It seems that you cannot pass an iterator into the "GeneratorEnqueuer()" class when using "multiprocessing". You can split the class into several functions to avoid the problem, but the code then becomes much harder to read. Maybe you have a better solution to this problem.
Sorry for my bad English.
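A minimal sketch of the "split it into functions" idea from the comment above, not the repository's actual code. It assumes the worker can build its own data source from plain arguments; `data_generator_task` here is a hypothetical module-level stand-in, and `n_items` and `collect` are made up for illustration. A module-level function can be pickled by name, which is what the Windows process start-up needs.

```python
import multiprocessing


def data_generator_task(out_queue, n_items):
    """Module-level worker (hypothetical): picklable because it is importable by name."""
    # Create the data inside the child process instead of passing a generator in.
    for i in range(n_items):
        out_queue.put(i * i)
    out_queue.put(None)  # sentinel: tells the consumer there is no more data


def collect(n_items):
    """Start one worker process and gather everything it produces."""
    out_queue = multiprocessing.Queue()
    worker = multiprocessing.Process(
        target=data_generator_task, args=(out_queue, n_items)
    )
    worker.start()
    results = []
    while True:
        item = out_queue.get()
        if item is None:
            break
        results.append(item)
    worker.join()
    return results


if __name__ == "__main__":  # guard required when child processes are spawned
    print(collect(5))
```

The trade-off is the one the comment mentions: state that used to live on the class has to be passed explicitly as arguments.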
Could you share your new functions for this issue? My QQ is 1020290041. Thank you.
I struggled with this on Win10 for a whole day and found that it is a Python multiprocessing problem.
Stepping through with breakpoints, I traced the error to /utils/dataset/data_util.py (line 53):
thread = multiprocessing.Process(target=data_generator_task)
Printing data_generator_task gives the following:
<function GeneratorEnqueuer.start.<locals>.data_generator_task at 0x000001C0BE358E18>
My guess at the cause:
When ./main/train.py runs, it reads the dataset's images and labels into memory and yields a memory address such as 0x000001C0BE358E18.
data = next(data_generator) then steps forward from that initial address to read the next batch of training data.
But Python cannot call next(data_generator) from a memory address, so train.py cannot run normally.
Suggestion: switch to Linux and it is sorted in 10 minutes.
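For comparison with the guess above, the traceback in the original post ends in `ForkingPickler(...).dump(obj)` and "Can't pickle local object", which points at pickling rather than memory addresses: on Windows, `multiprocessing` starts children by spawning a new interpreter and pickling the `Process` object, including its target. The sketch below reproduces only the pickling part with the standard `pickle` module; all function names in it are hypothetical.

```python
import pickle


def make_local_task():
    """Return a function defined inside another function, like the one in the traceback."""
    def data_generator_task():
        return "batch"
    return data_generator_task


def module_level_task():
    """A function defined at module level, reachable by its qualified name."""
    return "batch"


def can_pickle(obj):
    """Return True if pickle can serialise obj, False if it refuses."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    print(make_local_task().__qualname__)  # make_local_task.<locals>.data_generator_task
    print(can_pickle(make_local_task()))   # False: local functions cannot be pickled
    print(can_pickle(module_level_task))   # True: found again by name when unpickled
    print(can_pickle(x for x in range(3))) # False: generator objects cannot be pickled
```

On Linux the default start method has traditionally been fork, which copies the parent process instead of pickling the target, so the same code would not hit this error there.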
Bro, is there any way to do this without switching to Linux? How can it be solved?
Switching to single-threaded execution fixes it.
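One possible reading of this advice, sketched under assumptions: keep all data loading inside a single process so that nothing has to be pickled. The simplest form is calling `next()` on the generator directly in the training loop. If prefetching is still wanted, a background thread can fill a queue instead of a separate process. `prefetch` below is a hypothetical helper, not a function from the repository.

```python
import queue
import threading


def prefetch(generator, max_queue_size=4):
    """Yield items from `generator`, buffered by one background thread."""
    buffer = queue.Queue(maxsize=max_queue_size)
    sentinel = object()  # unique end-of-data marker

    def fill():
        # A nested function is fine here: threads share memory, so no pickling.
        try:
            for item in generator:
                buffer.put(item)
        finally:
            buffer.put(sentinel)  # always unblock the consumer

    threading.Thread(target=fill, daemon=True).start()
    while True:
        item = buffer.get()
        if item is sentinel:
            return
        yield item


if __name__ == "__main__":
    for batch in prefetch(iter(range(5))):
        print(batch)
```

A thread does not give true parallelism for CPU-bound preprocessing in CPython, so this trades some loading speed for portability to Windows.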
Newcomer asking for help!
Environment:
OS: Windows 10
Python: 3.6
tf: tf-gpu
The error output is as follows:
(tf-gpu) C:\Users\yuzy0\Downloads\text-detection-ctpn-banjin-dev>python ./main/train.py
C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\tensorflow\python\ops\gradients_impl.py:112: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
WARNING:tensorflow:Variable Conv/weights missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable Conv/biases missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable BiLSTM/bidirectional_rnn/fw/lstm_cell/kernel missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable BiLSTM/bidirectional_rnn/fw/lstm_cell/bias missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable BiLSTM/bidirectional_rnn/bw/lstm_cell/kernel missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable BiLSTM/bidirectional_rnn/bw/lstm_cell/bias missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable BiLSTM/weights missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable BiLSTM/biases missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable bbox_pred/weights missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable bbox_pred/biases missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable cls_pred/weights missing in checkpoint data/vgg_16.ckpt
WARNING:tensorflow:Variable cls_pred/biases missing in checkpoint data/vgg_16.ckpt
2019-03-14 00:27:56.149701: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2019-03-14 00:27:57.205599: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce GTX 1070 with Max-Q Design major: 6 minor: 1 memoryClockRate(GHz): 1.2655
pciBusID: 0000:01:00.0
totalMemory: 8.00GiB free
2019-03-14 00:27:57.218265: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-03-14 00:27:57.685618: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-14 00:27:57.689994: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2019-03-14 00:27:57.693261: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2019-03-14 00:27:57.696260: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6553 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 6.1)
continue training from previous checkpoint 50000
Traceback (most recent call last):
File "./main/train.py", line 117, in <module>
tf.app.run()
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
_sys.exit(main(argv))
File "./main/train.py", line 93, in main
data = next(data_generator)
File "C:\Users\yuzy0\Downloads\text-detection-ctpn-banjin-dev\utils\dataset\data_provider.py", line 83, in get_batch
enqueuer.start(max_queue_size=24, workers=num_workers)
File "C:\Users\yuzy0\Downloads\text-detection-ctpn-banjin-dev\utils\dataset\data_util.py", line 60, in start
thread.start()
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'GeneratorEnqueuer.start.<locals>.data_generator_task'
(tf-gpu) C:\Users\yuzy0\Downloads\text-detection-ctpn-banjin-dev>Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\ProgramData\Anaconda3\envs\tf-gpu\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)