Open mts42000 opened 4 years ago
Our current thought on how to do this is to expose the logic in HloCostAnalysis
via the XLA Python bindings.
This would be a very useful feature, especially for neural architecture search (NAS) applications that need to evaluate how many FLOPs a model uses; see https://arxiv.org/pdf/1807.11626.pdf and https://arxiv.org/pdf/1905.11946.pdf
Maybe something similar to: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/profiler/g3doc/python_api.md
I'd definitely find something like this very useful for a current project, where I'd like to compare the "compute + memory cost" of a few algorithms. I'm currently doing it by hand, but if I could simply write a single function and compare across their HLOs, that would make my life a lot easier :)
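For context, the by-hand comparison looks roughly like the sketch below. This is only an illustration, not anything XLA or HloCostAnalysis does: the `Cost` class and the helper names are hypothetical, FLOPs use the common 2*m*k*n convention for a matmul, and the memory model assumes each operand is read once and each output written once, with no cache or fusion effects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cost:
    """Hypothetical container for a hand-computed cost estimate."""
    flops: int
    bytes_accessed: int

    def __add__(self, other: "Cost") -> "Cost":
        return Cost(self.flops + other.flops,
                    self.bytes_accessed + other.bytes_accessed)


def matmul_cost(m: int, k: int, n: int, itemsize: int = 4) -> Cost:
    """Estimate the cost of an (m, k) @ (k, n) matmul.

    Assumptions: one multiply and one add per inner-product term
    (2*m*k*n FLOPs), each operand read once, the output written once,
    and float32 elements (itemsize=4) by default.
    """
    flops = 2 * m * k * n
    bytes_accessed = itemsize * (m * k + k * n + m * n)
    return Cost(flops, bytes_accessed)


def left_to_right(n: int) -> Cost:
    """Cost of (A @ B) @ v for n-by-n A, B and an n-by-1 vector v."""
    return matmul_cost(n, n, n) + matmul_cost(n, n, 1)


def right_to_left(n: int) -> Cost:
    """Cost of A @ (B @ v) for the same shapes."""
    return matmul_cost(n, n, 1) + matmul_cost(n, n, 1)


if __name__ == "__main__":
    n = 1024
    for name, fn in [("(A @ B) @ v", left_to_right),
                     ("A @ (B @ v)", right_to_left)]:
        cost = fn(n)
        print(f"{name}: {cost.flops} FLOPs, {cost.bytes_accessed} bytes")
```

Every new algorithm needs its own formula like this, which is exactly the part a cost analysis over the HLO could replace.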
It would be nice to have programmatic access to profiling info (latency, FLOPs, etc.) for annotated blocks of code, such as jitted functions.