AI-Hypercomputer / JetStream

JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in the future -- PRs welcome).
Apache License 2.0

Add metadata metrics #77

Closed yeandy closed 5 months ago

yeandy commented 5 months ago

Add the ability to pass in additional metadata that is written to the metrics JSON. For example, adding hardware-related metadata (ici_fsdp_parallelism, ici_autoregressive_parallelism, ici_tensor_parallelism) will let us break down performance by how the model is sharded.
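
As a rough illustration of the idea (not the code in this PR), the sketch below merges user-supplied metadata into a metrics dict before it is serialized. The function name `merge_metadata`, the JSON-string input format, the `request_throughput` key, and all values are assumptions made for the example; only the three `ici_*` names come from the description above.

```python
import json


def merge_metadata(metrics: dict, metadata_json: str) -> dict:
    """Return a copy of `metrics` with user-supplied metadata added.

    Hypothetical helper: `metadata_json` is assumed to be a JSON object
    passed as a string, e.g. from a command-line argument.
    """
    metadata = json.loads(metadata_json) if metadata_json else {}
    if not isinstance(metadata, dict):
        raise ValueError("metadata must be a JSON object")
    # Refuse to silently overwrite measured values with metadata.
    clashes = metrics.keys() & metadata.keys()
    if clashes:
        raise ValueError(f"metadata keys collide with metrics: {sorted(clashes)}")
    return {**metrics, **metadata}


# Placeholder values for illustration only, not real measurements.
metrics = {"request_throughput": 12.5}
metadata_arg = (
    '{"ici_fsdp_parallelism": 1, '
    '"ici_autoregressive_parallelism": 1, '
    '"ici_tensor_parallelism": 8}'
)

merged = merge_metadata(metrics, metadata_arg)
print(json.dumps(merged, indent=2))
```

Keeping the metadata as flat top-level keys next to the measured values makes it easy to group or filter results by sharding configuration later. Rejecting key collisions is one possible design choice; nesting the metadata under its own key would be another.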