I tried running run_classifer.py: it works well on GPU, and after a small fix it also runs well on TPU. However, when I ran run_squad.py, I hit this issue on both GPU and TPU:
Model was constructed with shape Tensor("unique_ids:0", shape=(None, 1), dtype=int32) for input (None, 1), but it was re-called on a Tensor with incompatible shape (None,).
WARNING:tensorflow:Gradients do not exist for variables ['albert_model/pooler_transform/kernel:0', 'albert_model/pooler_transform/bias:0'] when minimizing the loss.
W1211 11:49:55.828053 139666686506368 optimizer_v2.py:1043] Gradients do not exist for variables ['albert_model/pooler_transform/kernel:0', 'albert_model/pooler_transform/bias:0'] when minimizing the loss.
WARNING:tensorflow:Model was constructed with shape Tensor("unique_ids:0", shape=(None, 1), dtype=int32) for input (None, 1), but it was re-called on a Tensor with incompatible shape (None,).
W1211 11:49:59.275795 139666686506368 network.py:847] Model was constructed with shape Tensor("unique_ids:0", shape=(None, 1), dtype=int32) for input (None, 1), but it was re-called on a Tensor with incompatible shape (None,).
WARNING:tensorflow:Gradients do not exist for variables ['albert_model/pooler_transform/kernel:0', 'albert_model/pooler_transform/bias:0'] when minimizing the loss.
W1211 11:50:02.947960 139666686506368 optimizer_v2.py:1043] Gradients do not exist for variables ['albert_model/pooler_transform/kernel:0', 'albert_model/pooler_transform/bias:0'] when minimizing the loss.
@guozhiyu
You can ignore the above warnings. The ALBERT model has two outputs: the pooler_transform output and the full sequence output. The pooler_transform output is not used when training on SQuAD, so there will be no gradients for that layer.
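To illustrate why the gradient warning is harmless, here is a minimal sketch in TensorFlow 2. It is not the ALBERT code: `sequence_head` and `pooler_head` are hypothetical stand-ins for the two outputs, and the loss is an arbitrary one built only from the sequence output. Under those assumptions, the variables of the unused head get `None` gradients, which is the situation the optimizer is warning about.

```python
import tensorflow as tf

# Hypothetical stand-ins for the two model outputs (not the real ALBERT layers).
sequence_head = tf.keras.layers.Dense(2, name="sequence_head")
pooler_head = tf.keras.layers.Dense(4, name="pooler_transform")

x = tf.random.normal([3, 8])

with tf.GradientTape() as tape:
    seq_out = sequence_head(x)
    pooled = pooler_head(x)  # computed, but never used in the loss
    loss = tf.reduce_mean(tf.square(seq_out))

variables = sequence_head.trainable_variables + pooler_head.trainable_variables
grads = tape.gradient(loss, variables)

# Variables that do not influence the loss receive a None gradient.
unused = [v.name for v, g in zip(variables, grads) if g is None]
print("variables without gradients:", unused)
```

The two entries reported as unused are the kernel and bias of the stand-in pooler layer, mirroring the two pooler_transform variables named in the warning.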