Closed henryfriedlander closed 5 years ago
I was able to solve the problem. It comes down to how TensorFlow stores LOCAL_VARIABLES. When you save and restore a model, the tf.train.Saver class does not save local variables by default (here and here are relevant SO posts). A quick fix is to call model.session.run(tf.local_variables_initializer()) before the line model.evaluate(test_batch_generator, evaluator). I have submitted a pull request at #31 with the change.
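For anyone hitting the same error, here is a minimal sketch of the underlying behavior, independent of SMRCToolkit. It assumes the TF 1.x graph API (imported through tensorflow.compat.v1 so it also runs on TF 2.x); the placeholder and the variable names other than the eval_metrics scope are made up for illustration. The accumulators created by tf.metrics.mean live in the local variables collection, so they are not in the set a default Saver would checkpoint, and they have to be initialized explicitly before the update op can run.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # use TF 1.x-style graphs and sessions

graph = tf.Graph()
with graph.as_default():
    values = tf.placeholder(tf.float32, shape=[None])
    # tf.metrics.mean keeps its running `total` and `count` in LOCAL_VARIABLES.
    with tf.variable_scope("eval_metrics"):
        mean_value, update_op = tf.metrics.mean(values)
    local_names = sorted(v.name for v in tf.local_variables())
    # A Saver built with default arguments covers global variables, not these.
    global_names = [v.name for v in tf.global_variables()]
    init_locals = tf.local_variables_initializer()

with tf.Session(graph=graph) as sess:
    try:
        # The accumulators are uninitialized here, so the update fails.
        # TF 1.x reports FailedPreconditionError; the broader OpError is
        # caught in case another version reports a different subclass.
        sess.run(update_op, feed_dict={values: [1.0, 2.0, 3.0]})
        failed_before_init = False
    except tf.errors.OpError:
        failed_before_init = True

    sess.run(init_locals)  # the one-line fix
    sess.run(update_op, feed_dict={values: [1.0, 2.0, 3.0]})
    result = float(sess.run(mean_value))

print(local_names, global_names, failed_before_init, result)
```

The same thing happens after a restore: the checkpoint brings back the global variables, while the metric accumulators start out uninitialized in the new session.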
However, this change is less than ideal. I have noticed that the local variables regarding loss are exactly the same for each model. I would propose moving the loss code into the _build_graph function in base_model.py. The local variables could then be initialized directly inside base_model.py's load function, which would hide that perhaps unintuitive line from the user.
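A rough sketch of what that proposal could look like. This is not the actual base_model.py code: TinyModel, its constructor argument, and its evaluate method are hypothetical stand-ins, and only the idea of running the local-variable initializer inside load comes from the suggestion above. It again assumes the TF 1.x graph API via tensorflow.compat.v1.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # use TF 1.x-style graphs and sessions


class TinyModel(object):
    """Hypothetical stand-in for a base model class; not SMRCToolkit code."""

    def __init__(self, w_init):
        self.graph = tf.Graph()
        with self.graph.as_default():
            self._build_graph(w_init)
            self.saver = tf.train.Saver()  # default: global variables only
            self.init_globals = tf.global_variables_initializer()
            self.init_locals = tf.local_variables_initializer()
        self.session = tf.Session(graph=self.graph)

    def _build_graph(self, w_init):
        self.x = tf.placeholder(tf.float32, shape=[None])
        self.w = tf.get_variable("w", initializer=w_init)
        # Shared metric code lives here, so every subclass gets it for free.
        with tf.variable_scope("eval_metrics"):
            self.mean, self.update = tf.metrics.mean(self.w * self.x)

    def save(self, path):
        return self.saver.save(self.session, path)

    def load(self, path):
        self.saver.restore(self.session, path)
        # Metric accumulators are not in the checkpoint; reset them here so
        # callers never need to run the initializer themselves.
        self.session.run(self.init_locals)

    def evaluate(self, batch):
        self.session.run(self.update, feed_dict={self.x: batch})
        return float(self.session.run(self.mean))


checkpoint = os.path.join(tempfile.mkdtemp(), "model.ckpt")

trained = TinyModel(w_init=3.0)
trained.session.run(trained.init_globals)
trained.save(checkpoint)

restored = TinyModel(w_init=0.0)
restored.load(checkpoint)  # restores w and initializes the metric state
score = restored.evaluate([1.0, 2.0])
print(score)
```

With this layout, a user who loads a checkpoint and calls evaluate never sees the initializer call. The trade-off is that load always resets any metric state accumulated so far.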
Hi,
Thank you very much for your code. I have been able to replicate your results on many datasets using the model.train_and_evaluate() method. However, when I tried to save and load a model, I ran into an error. Initially I tried to save and evaluate using the BertCoQA model, but I also get errors when running the code from the model_save_load.md tutorial. Below is the error thrown (here is a pastebin with the full error if that would be helpful).
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value eval_metrics/mean/count [[node eval_metrics/mean/AssignAdd_1 (defined at /juicier/scr126/scr/hnf035/fresh/SMRCToolkit/sogou_mrc/model/bert_coqa.py:199) = AssignAdd[T=DT_FLOAT, use_locking=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](eval_metrics/mean/count, eval_metrics/mean/ToFloat, ^add_8)]]
Thank you very much for the help!