SeldonIO / MLServer

An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more
https://mlserver.readthedocs.io/en/latest/
Apache License 2.0

Adding more metrics to MLServer Prometheus endpoint #805

Open saeid93 opened 1 year ago

saeid93 commented 1 year ago

The metrics currently implemented in MLServer are all simple request counts:

(screenshot: mlserver-metrics — the metrics currently exposed on the Prometheus endpoint)

Compared with similar platforms like Triton Server, many other metrics could be added, some of which are important for performance monitoring. For example, model inference latency, batch queuing time, and GPU usage are metrics that are specific to the model server and impossible to collect with external scrapers like cAdvisor. The hack I'm currently using to extract such metrics inside my containers is to measure them and report them as part of the request response logs, which is not ideal.
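For illustration, here is a minimal sketch of the kind of latency metric I have in mind, using the standard `prometheus_client` library. The metric name, labels, and bucket boundaries are hypothetical, not anything MLServer currently exposes; `timed_predict` is just a stand-in wrapper around a model's predict call.

```python
import time

from prometheus_client import Histogram, generate_latest

# Hypothetical metric: per-model inference latency as a Prometheus histogram.
# Bucket boundaries are illustrative, not MLServer defaults.
MODEL_INFER_LATENCY = Histogram(
    "model_infer_latency_seconds",
    "Wall-clock latency of a single inference call",
    ["model_name"],
    buckets=(0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5),
)


def timed_predict(model_name, predict_fn, payload):
    """Run predict_fn(payload) and record its latency under model_name."""
    start = time.perf_counter()
    try:
        return predict_fn(payload)
    finally:
        MODEL_INFER_LATENCY.labels(model_name=model_name).observe(
            time.perf_counter() - start
        )


# Once recorded, the histogram (count, sum, and per-bucket counters) shows up
# in the default registry and would be scraped from the /metrics endpoint.
result = timed_predict("my-model", lambda x: x * 2, 21)
```

With a histogram like this, Prometheus can compute quantiles (e.g. p99 latency per model) via `histogram_quantile`, which is exactly what's hard to reconstruct from count-only metrics or from log scraping.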

yaliqin commented 9 months ago

vote for this, especially latency histogram metrics