othakkar opened 3 days ago
Upon running the same model with the same steps on a CPU, I found that the program takes an unreasonable amount of time (> 1 day) to finish when running with XLA, whereas it finishes running in < 1 min without XLA.
You can try setting `XLA_FLAGS=--xla_gpu_shape_checks=none` and check the result, but the underlying issue here is in the TF2XLA lowering logic, not in XLA itself.

@cheshire thanks for your response. FYI, just a minor correction to the flag: setting `XLA_FLAGS='--xla_gpu_shape_checks="IGNORE"'` worked on GPU.
It could be hiding bugs - I'd double-check the numerics vs. the non-XLA case.
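One lightweight way to do that check, assuming you have dumped the output tensors from a non-XLA run and an XLA run of the same input to arrays (the values below are purely illustrative):

```python
import numpy as np

# Hypothetical outputs captured from a non-XLA run and an XLA run
# of the same model on the same input image.
out_ref = np.array([0.91, 0.07, 0.02], dtype=np.float32)  # non-XLA
out_xla = np.array([0.91, 0.07, 0.02], dtype=np.float32)  # XLA

# XLA may fuse and reorder float ops, so compare within a small
# tolerance rather than demanding bit-exact equality.
np.testing.assert_allclose(out_xla, out_ref, rtol=1e-4, atol=1e-6)
print("numerics match within tolerance")
```

If the assertion fails on the real model outputs, the shape-check workaround is likely masking a genuine miscompile rather than a spurious check.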
I'm running the `faster_rcnn_inception_resnet_v2_atrous_coco` model from the TF official object detection model zoo and I see the following error when running it with XLA enabled using the TF-XLA flags (`TF_XLA_FLAGS='--tf_xla_auto_jit=1 --tf_xla_cpu_global_jit'`). Note that the model runs fine without enabling XLA.
Steps to reproduce:

1. Download the `faster_rcnn_inception_resnet_v2_atrous_coco` model from TF official models - http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28.tar.gz
2. Extract the archive to get the `frozen_inference_graph.pb` file.
3. `export TF_XLA_FLAGS='--tf_xla_auto_jit=1 --tf_xla_cpu_global_jit'`
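If you drive the frozen graph from Python, the flags can also be set from a small wrapper instead of the shell. This is a minimal sketch; `run_inference.py` is a hypothetical driver script for `frozen_inference_graph.pb`, not part of the model zoo. The key detail is that `TF_XLA_FLAGS` must be in the environment before the TensorFlow process initializes, or auto-clustering is not enabled:

```python
import os
import subprocess

# Build a child environment carrying the XLA auto-jit flags from the issue.
env = dict(os.environ)
env["TF_XLA_FLAGS"] = "--tf_xla_auto_jit=1 --tf_xla_cpu_global_jit"

# Launch the (hypothetical) inference script in a child process so the
# flags are guaranteed to be set before TensorFlow starts up.
# subprocess.run(["python", "run_inference.py"], env=env, check=True)
print(env["TF_XLA_FLAGS"])
```

Setting the variable inside an already-running Python process after `import tensorflow` is too late, which is a common reason the flags appear to have no effect.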