ShifuML / shifu

An end-to-end machine learning and data mining framework on Hadoop
https://github.com/ShifuML/shifu/wiki
Apache License 2.0

Distributed TensorFlow Failed on Epochs #684

Open zhangpengshan opened 4 years ago

zhangpengshan commented 4 years ago

https://github.com/tensorflow/models/issues/5633

INFO:tensorflow:An error was raised. This may be due to a preemption in a connected worker or parameter server. The current session will be closed and a new session will be created. Error: Stream removed

sunwenbo commented 2 years ago

Has this issue been resolved?