I created a problem that supports multiprocess_generate, using ChoppedTextProblem as a template.
When I actually ran the generation, it eventually froze. I used Pyrasite to get stack dumps from the running process. The stacks that weren't waiting in multiprocessing ended like this:
  File "/usr/lib64/python3.6/multiprocessing/popen_fork.py", line 66, in _launch
    self.pid = os.fork()
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/c_api_util.py", line 50, in __del__
    c_api.TF_DeleteGraph(self.graph)
I found a TensorFlow issue, tensorflow/tensorflow#8220, which isn't exactly the same problem, but it advises against using TensorFlow with multiprocessing.
The freeze happened close to the end of generation, so I believe it occurs when the first process, or the first few processes, try to shut down and all try to delete the same (default) TF graph, which t2t-datagen doesn't even use.
The only subclass of ChoppedTextProblem is languagemodel_wiki_xml_v8k_l1k, so I'm trying data generation for that problem to see if it also hangs or errors.
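One way to test this hypothesis might be to start the workers with the "spawn" start method, so that each worker is a fresh interpreter rather than a fork() of a parent that already holds TF state. The sketch below is only an illustration: generate_shard and run_shards are hypothetical names, not part of tensor2tensor, and I have not confirmed that this avoids the hang.

```python
import multiprocessing as mp


def generate_shard(task_id):
    # Hypothetical stand-in for the per-process generation work;
    # it only builds the name of the shard it would write.
    return "shard-%05d" % task_id


def run_shards(num_shards, num_workers=2, start_method="spawn"):
    # "spawn" launches fresh interpreters instead of calling os.fork(),
    # so workers do not inherit the parent's objects (assumption: this
    # includes whatever default TF graph the parent has created).
    ctx = mp.get_context(start_method)
    with ctx.Pool(processes=num_workers) as pool:
        return pool.map(generate_shard, range(num_shards))
```

With "spawn", run_shards has to be called from under an `if __name__ == "__main__":` guard, and generate_shard has to be importable at module level so it can be pickled for the workers.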
TensorFlow and tensor2tensor versions