-
Hello,
I am seeking advice on the best practices for tracking all inputs and predictions made by a model when using Triton Inference Server. Specifically, I would like to track every interaction th…
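A framework-agnostic starting point is to wrap the client-side inference call so that every input/output pair is recorded with a request id and timestamp. This is a generic audit-logging pattern sketch, not a Triton API; `log_prediction` and `infer_fn` are hypothetical names:

```python
import json
import logging
import uuid
import datetime

logger = logging.getLogger("prediction_audit")
logger.setLevel(logging.INFO)

def log_prediction(infer_fn):
    """Wrap any inference callable so each request/response pair is
    emitted as one structured JSON log line (generic pattern, not a
    Triton-specific API)."""
    def wrapper(inputs):
        request_id = str(uuid.uuid4())
        outputs = infer_fn(inputs)
        logger.info(json.dumps({
            "request_id": request_id,
            "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            "inputs": inputs,
            "outputs": outputs,
        }))
        return outputs
    return wrapper
```

In a real deployment the same wrapper could write to a durable sink (Kafka, a database) instead of a logger, but the shape of the record is the same.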
-
Hello,
I am currently using **nnUNet v2** for inference and would like to know if there is a way to perform **inference on the fly** instead of relying on **files**. My goal is to directly input da…
-
### Willingness to contribute
No. I cannot contribute this feature at this time.
### Proposal Summary
For certain requirements, the latency of the model REST APIs has to be very low. We have obser…
-
Here I want to brainstorm a list of all the potential threats (i.e., where things can go wrong) to a machine learning project. Our checklist need not address all of them, but we should in our…
-
Is it possible to run the SED baseline in causal mode? I would like to use it on an audio stream to detect certain audio cues in a noisy environment.
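Even when a model is not causal by construction, streaming use is often approximated by chunked inference that sees only the current chunk plus a bounded window of past audio. A minimal sketch of that pattern, where `peak_scorer` is a toy stand-in for the real SED model:

```python
from collections import deque

def stream_detect(samples, chunk=1024, context=4096, scorer=None):
    """Causal chunked detection: each chunk is scored using only the
    current chunk plus up to `context` past samples, never the future."""
    history = deque(maxlen=context)   # ring buffer of past samples
    scores = []
    for start in range(0, len(samples), chunk):
        block = samples[start:start + chunk]
        window = list(history) + list(block)  # past + present only
        scores.append(scorer(window))
        history.extend(block)
    return scores

def peak_scorer(window, threshold=0.5):
    """Toy detector: flags a window whose peak amplitude exceeds a threshold."""
    return max(abs(x) for x in window) > threshold
```

The latency/accuracy trade-off then lives in `chunk` (decision delay) and `context` (how much past the detector may use).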
-
**Is your feature request related to a problem? Please describe.**
Rust isn't deterministic across platforms, and neither are most Rust game-dev libraries.
**Describe the solution you'd like**
Mak…
-
### 🚀 The feature
TorchServe supports streaming response for both HTTP and GRPC endpoint.
- [ ] #2186
- [ ] #2232
### Motivation, pitch
Usually the prediction latency is high (e.g. 5 sec…
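Client-side, the point of a streaming response is to act on each partial result as it arrives instead of waiting out the full prediction latency. A minimal sketch of that consumption loop, with the chunk source mocked (a real client would iterate over the HTTP or gRPC stream instead):

```python
def consume_stream(chunks, on_token=None):
    """Consume a streamed prediction chunk by chunk. `chunks` stands in
    for the HTTP/gRPC response stream; `on_token` is whatever should
    happen with each partial result (e.g. render it immediately)."""
    received = []
    for chunk in chunks:
        received.append(chunk)
        if on_token is not None:
            on_token(chunk)
    return "".join(received)

# Mocked stream: three partial results arriving over time.
full = consume_stream(iter(["The ", "quick ", "fox"]))
```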
-
Currently, the frontend-exporter just flattens the metric JSON and logs each sub-metric as a gauge. We need to use the Prometheus label system for our metrics.
For example,
```json
{
…
```
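The difference can be illustrated without the exporter itself. The metric names below are hypothetical stand-ins for the flattened JSON keys: instead of minting one gauge name per entry, a single metric name carries the varying dimension as a label, which is what lets Prometheus queries aggregate and filter across models:

```python
# Flattened form the exporter emits today (one gauge name per entry).
flat = {
    "queue_latency.model_a": 12.0,
    "queue_latency.model_b": 30.0,
}

def to_labeled(flat_metrics):
    """Rewrite flattened gauge names into Prometheus exposition-format
    lines where the model becomes a label on one shared metric name."""
    labeled = []
    for key, value in flat_metrics.items():
        name, model = key.split(".", 1)
        labeled.append(f'{name}{{model="{model}"}} {value}')
    return labeled
```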
-
**Is your feature request related to a problem?**
Two enhancements are proposed in this feature to improve the Ml-Commons Connector framework.
1. Currently in the connector framework, we only ha…
-
In Section 5.1 (Performance Prediction Model), what role does GFLOPS play? Is the model's input the configuration parameters of the various layers, and the output each layer's execution time?
5.1 Performance Prediction Model
Neurosurgeon models the per-layer latency and the energy consumption of arbitrary ne…
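As the quoted passage says, the prediction model is a regression from a layer's configuration to its latency; GFLOPS summarizes the layer's compute load and serves as one regression feature. A toy sketch of such a per-layer latency predictor using a linear fit; all numbers are synthetic, not from the paper:

```python
import numpy as np

# Synthetic (GFLOPs, measured latency in ms) pairs for one layer type.
gflops = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
latency = np.array([1.2, 2.1, 4.0, 7.9, 15.8])

# Fit latency ≈ a * gflops + b on the profiled samples.
a, b = np.polyfit(gflops, latency, 1)

def predict_latency(g):
    """Predict per-layer latency (ms) from the layer's GFLOPs."""
    return a * g + b
```

At partition time the real system sums such per-layer predictions to compare candidate device/cloud split points without executing the full network.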