ddkang opened this issue 6 years ago
This op is using DT_BFLOAT16. DT_BFLOAT16 is supported on TPUs, but is not supported on CPUs or GPUs.
Use the non-BFLOAT16 version of ResNet.
Yes, I understand that DT_BFLOAT16 is not supported on CPUs or GPUs. Are you telling me that once I train a model on the TPU it can't be used elsewhere?
You could do two things.
First, you could modify your model so that it performs its computations in BFLOAT16, but converts BFLOAT16 to FLOAT32 before storing the tensor in the checkpoint.
Second, you could write a custom app that reads the checkpoint, locates the BFLOAT16 tensors, converts them to FLOAT32, and exports a new checkpoint.
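The second option can be sketched as a short conversion script. This is a sketch under assumptions, not code from the thread: it assumes TF1-style checkpoints and the `tf.compat.v1` API, and the function name is illustrative.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

def convert_checkpoint_bf16_to_fp32(src_prefix, dst_prefix):
    """Read a checkpoint, cast any bfloat16 tensors to float32,
    and write a new checkpoint under dst_prefix."""
    reader = tf.train.load_checkpoint(src_prefix)
    dtype_map = reader.get_variable_to_dtype_map()
    with tf.Graph().as_default():
        var_list = {}
        for name, dtype in dtype_map.items():
            value = reader.get_tensor(name)  # numpy array
            if dtype == tf.bfloat16:
                # TF registers a bfloat16 numpy extension type, so a
                # plain astype to float32 works here.
                value = value.astype('float32')
            var_list[name] = tf.Variable(value, name=name)
        # Save under the original variable names so the model can
        # restore the converted checkpoint unchanged.
        saver = tf.train.Saver(var_list=var_list)
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            saver.save(sess, dst_prefix)
```

The resulting checkpoint should then load on CPU/GPU, since every stored tensor is float32.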
As far as I know this is impossible in TensorFlow. If this is incorrect or outdated, please let me know how.
You can create a variable scope with a custom_getter that casts any bfloat16 variables to fp32 when they are used.
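A minimal sketch of such a custom_getter, assuming the `tf.compat.v1` variable-scope API; the getter name is mine, not from any library:

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

def fp32_storage_getter(getter, name, shape=None, dtype=None, **kwargs):
    # Store every bfloat16-requested variable in float32, and hand the
    # graph a bfloat16 cast of it instead, so checkpoints stay fp32.
    if dtype == tf.bfloat16:
        var = getter(name, shape=shape, dtype=tf.float32, **kwargs)
        return tf.cast(var, tf.bfloat16)
    return getter(name, shape=shape, dtype=dtype, **kwargs)

# Any variable requested in bfloat16 inside this scope is created with
# float32 storage and read back through a bfloat16 cast.
with tf.variable_scope('net', custom_getter=fp32_storage_getter):
    w = tf.get_variable('w', shape=[3], dtype=tf.bfloat16)
```

Here `w` is a bfloat16 tensor for computation, but the variable that lands in the checkpoint (`net/w`) is float32.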
First, you need to build your network entirely inside the bfloat16_scope. Note that this bfloat16_scope just wraps the custom_getter logic mentioned above, and you can check this by visiting the module where it is declared. You also need to ensure that the output tensor from your network is cast back to fp32.
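Put together, that might look like the following sketch, assuming `tf.compat.v1` and the `tf.tpu.bfloat16_scope` wrapper; the layers are illustrative, not the actual ResNet:

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

def network(features):
    # Build the whole network inside bfloat16_scope, which installs a
    # custom_getter (fp32 variable storage, bfloat16 compute).
    with tf.tpu.bfloat16_scope():
        x = tf.cast(features, tf.bfloat16)
        x = tf.layers.dense(x, 128, activation=tf.nn.relu)
        logits = tf.layers.dense(x, 10)
    # Cast the network output back to fp32 before the loss / export.
    return tf.cast(logits, tf.float32)
```

Because the variables are stored in fp32, the checkpoint this produces contains no bfloat16 tensors.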
Then you need to have a place in your code that will call export_saved_model on your estimator. Notice this isn't enough by itself, because that method requires some sort of serving_input_receiver_fn as an argument.
This is where things get more custom to the model you are running.
You need to supply a function that takes no arguments and, when called, yields a ServingInputReceiver. In the ResNet-50 example, this is done here:

The two arguments of ServingInputReceiver in this example are:

In essence, this ServingInputReceiver is primarily about setting up a subgraph whose purpose is preprocessing: its inputs are features in a format that the server is going to accept from clients, and its outputs are features that the estimator model can use for prediction.
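A sketch of such a serving_input_receiver_fn, assuming `tf.compat.v1`, `tf.estimator`, and clients that send JPEG-encoded image bytes; the image size and preprocessing are illustrative, not the actual ResNet-50 code:

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

IMAGE_SIZE = 224  # assumed; match your model's real input size

def serving_input_receiver_fn():
    # The server accepts a batch of serialized JPEG bytes from clients;
    # this subgraph decodes and resizes them into the float32 tensor
    # the estimator model expects for prediction.
    image_bytes = tf.placeholder(tf.string, shape=[None],
                                 name='image_bytes')

    def _preprocess(one_image):
        image = tf.image.decode_jpeg(one_image, channels=3)
        image = tf.image.resize_images(image, [IMAGE_SIZE, IMAGE_SIZE])
        return tf.cast(image, tf.float32)

    features = tf.map_fn(_preprocess, image_bytes, dtype=tf.float32)
    return tf.estimator.export.ServingInputReceiver(
        features=features,
        receiver_tensors={'image_bytes': image_bytes})
```

You would then pass this function as the serving_input_receiver_fn argument when calling export_saved_model on the estimator.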
https://github.com/tensorflow/tpu/tree/master/models/experimental/resnet_bfloat16
The link above says "To run the same code on CPU/GPU, set the flag --use_tpu=False", but after training on the TPU, evaluation and prediction fail with the error