This PR enables metrics by switching HF TGI to use port 3000. Currently, a Service and a ServiceMonitor are created for the IBM/TGIS runtime, which exposes metrics on port 3000. Using the same port in this custom runtime lets Prometheus scrape HF TGI metrics through those existing resources.
Also added a note about the safetensors format to the README, and added some ENV vars to silence some warning logs.
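
To check the port change by hand, something like the following could be run against a pod or a port-forward. This is only a rough sketch, not part of the PR: the `/metrics` path, the host name, and the helper names (`metrics_url`, `metric_names`, `fetch_metric_names`) are assumptions for illustration, and only port 3000 comes from the change itself.

```python
from urllib.request import urlopen

METRICS_PORT = 3000        # port this PR switches HF TGI to
METRICS_PATH = "/metrics"  # assumption: conventional Prometheus scrape path


def metrics_url(host, port=METRICS_PORT):
    """Build the URL that Prometheus would be expected to scrape."""
    return f"http://{host}:{port}{METRICS_PATH}"


def metric_names(exposition_text):
    """Extract metric names from Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like `name{labels} value` or `name value`.
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def fetch_metric_names(host, port=METRICS_PORT, timeout=5.0):
    """Fetch the endpoint and return the metric names it exposes."""
    with urlopen(metrics_url(host, port), timeout=timeout) as resp:
        return metric_names(resp.read().decode("utf-8"))
```

For example, with `kubectl port-forward` mapping local port 3000 to the runtime pod, `fetch_metric_names("localhost")` should return a non-empty set if metrics are being served on that port.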