[Open] kpsc opened this issue 2 years ago
"End of sequence" means the data was finished, in general, estimator handle the exception naturally. If you use 'MonitoredTrainingSession' API, it may encounter this log. Which estimator you installed, we offered a version in github: https://github.com/AlibabaPAI/estimator/tree/deeprec
Thanks for your reply. I have another question: when I used grpc++ in distributed training, it was slower than grpc. Is there anything else I should configure for training? In the network, I only used normal TensorFlow embeddings.
There is a list of tips to help you tune grpc++; please follow https://deeprec.readthedocs.io/zh/latest/GRPC%2B%2B.html
System information
When I used grpc++ with Estimator, I got the following error, but training continued; I don't know whether this is OK:
```python
config = tf.estimator.RunConfig(
    save_checkpoints_secs=10 * 60,
    keep_checkpoint_max=2,
    protocol='grpc++'
)
model = tf.estimator.Estimator(
    model_fn=model_fn,
    params=model_params,
    model_dir=checkpoint,
    config=config
)
eval_spec = tf.estimator.EvalSpec(...)
train_spec = tf.estimator.TrainSpec(...)
tf.estimator.train_and_evaluate(model, train_spec, eval_spec)
```
In the DeepRec docs, I found that there seem to be some problems with the original Estimator, but my bazel build failed, and I don't know what the Estimator check looks like when using grpc++. In the latest DeepRec version, do we need to install Estimator separately?