nyu-dl / dl4marco-bert

BSD 3-Clause "New" or "Revised" License

pre-trained MARCO model cannot be loaded properly? #3

Closed Punchwes closed 5 years ago

Punchwes commented 5 years ago

Hello,

I am trying to run some inference tests using your pre-trained model, but unfortunately it seems that I cannot load it on my own computer.

I am using a MacBook Pro, which is CPU-only. When I load the checkpoint, I keep getting this error message:

InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

No OpKernel was registered to support Op 'InfeedEnqueueTuple' with these attrs.  Registered devices: [CPU], Registered kernels:
  <no registered kernels>

I guess it might be that the model was trained on a TPU and some ops are not available on CPU.

So, is it true that the model cannot be loaded on a CPU-only or GPU machine and can only be used on a TPU?

Thanks in advance.

rodrigonogueira4 commented 5 years ago

It should work on both CPU and GPU without any extra code. I made this Colab that runs eval on CPU: https://colab.research.google.com/drive/1w2npvlSeiOvop8hR9LblbhxksSgpl4zF

Note that USE_TPU=False and INIT_CHECKPOINT points to a pretrained BERT.
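Roughly, the relevant part of that setup looks like the sketch below. This is not copied from the colab: model_fn_builder and bert_config are assumed to come from this repo's run script (in the style of BERT's run_classifier.py), and the remaining values are illustrative.

import tensorflow as tf

USE_TPU = False
INIT_CHECKPOINT = 'BERT_Large_trained_on_MSMARCO/model.ckpt-100000'  # checkpoint prefix, no .meta/.data suffix

run_config = tf.contrib.tpu.RunConfig(model_dir='output')

# model_fn_builder is the builder from the repo's run script (hypothetical arguments shown here).
model_fn = model_fn_builder(
    bert_config=bert_config,
    num_labels=2,
    init_checkpoint=INIT_CHECKPOINT,
    learning_rate=1e-6,
    num_train_steps=0,
    num_warmup_steps=0,
    use_tpu=USE_TPU,
    use_one_hot_embeddings=USE_TPU)

# TPUEstimator falls back to plain CPU/GPU execution when use_tpu=False,
# so the TPU-only infeed ops never end up in the graph.
estimator = tf.contrib.tpu.TPUEstimator(
    use_tpu=USE_TPU,
    model_fn=model_fn,
    config=run_config,
    predict_batch_size=32)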

Are you using a different input_fn, by any chance?

Punchwes commented 5 years ago

Thanks so much for your quick reply; it works very well with the code you provided.

I was previously loading the model a different way, rather than directly defining an estimator. Here is my previous loading code:

import tensorflow as tf

new_saver = tf.train.import_meta_graph('BERT_Large_trained_on_MSMARCO/model.ckpt-100000.meta')
with tf.Session() as sess:
    # fails: the imported meta graph contains TPU-only ops (e.g. InfeedEnqueueTuple)
    new_saver.restore(sess, 'BERT_Large_trained_on_MSMARCO/model.ckpt-100000.data-00000-of-00001')

Then it gave the InvalidArgumentError message I mentioned above.
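For reference, a sketch of a saver-based load that avoids the TPU-only ops: rebuild the graph in Python with modeling.py from the BERT repo and restore from the checkpoint prefix, instead of importing the TPU meta graph. The shapes and the assumption that bert_config.json ships alongside the checkpoint are illustrative, not taken from this thread.

import tensorflow as tf
import modeling  # modeling.py from the BERT repo

bert_config = modeling.BertConfig.from_json_file('BERT_Large_trained_on_MSMARCO/bert_config.json')
input_ids = tf.placeholder(tf.int32, [None, 512])
input_mask = tf.placeholder(tf.int32, [None, 512])
segment_ids = tf.placeholder(tf.int32, [None, 512])

# Build the BERT graph directly, so no TPU infeed ops are created.
model = modeling.BertModel(
    config=bert_config,
    is_training=False,
    input_ids=input_ids,
    input_mask=input_mask,
    token_type_ids=segment_ids,
    use_one_hot_embeddings=False)

saver = tf.train.Saver()
with tf.Session() as sess:
    # Pass the checkpoint prefix, not the .data-00000-of-00001 shard.
    saver.restore(sess, 'BERT_Large_trained_on_MSMARCO/model.ckpt-100000')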

rodrigonogueira4 commented 5 years ago

I'm glad to know the code in the colab worked for you!