-
### Feature request
This is a BERT-based model; however, when trying to run it, the message says the model is not supported: https://huggingface.co/meta-llama/Prompt-Guard-86M/tree/main
### Motivation
LLM-pow…
-
I have experimented with multiple models using ARM-NN on a Cortex-A53 (mostly int8-quantized models with latency < 200 ms), and I found that XNNPACK generally gives better latency results than ARM-NN. So I am tryin…
-
### OpenVINO Version
https://github.com/openvinotoolkit/openvino/tree/2d8ac08bf1f87f8ac455eae381213b52e781fe8c
### Operating System
Windows System
### Device used for inference
CPU
### Framework…
-
It looks like inference is not working for non-tree structures.
For example, consider the following simple factor graph with nodes x1, x2, x3 and factors fa, fb, fc.
```python
from fglib import grap…
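# A minimal illustration (plain Python, hypothetical edge list -- not fglib's
# API) of why such a graph can be non-tree, assuming fc connects x3 back to
# x1: a connected graph with as many edges as nodes must contain a cycle, and
# plain sum-product message passing has no leaf-to-root schedule on a cycle.
import itertools

edges = [("x1", "fa"), ("fa", "x2"), ("x2", "fb"),
         ("fb", "x3"), ("x3", "fc"), ("fc", "x1")]
node_names = set(itertools.chain.from_iterable(edges))
print(len(edges) >= len(node_names))  # True -> at least one cycle, not a tree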
-
It is expected that (at least) 3 openEO processes will become available which should be integrated into GTIF.
The integration concept would be similar to the one done for ship detections in the RACE …
-
### Describe the issue
Issue:
How to run inference for llava-next-72b/llava-next-110b?
There are many versions of your LLaVA, the code does not seem to be compatible across them, and there are mul…
-
# Bug Report
I am referring to [https://github.com/microsoft/onnxruntime-inference-examples/tree/main/quantization/language_model/llama/smooth_quant](https://github.com/microsoft/onnxruntime-inference…
-
### Motivation
The latest release of the Microsoft Phi-3 4.2B 128k-context vision model looks promising in both performance and resource savings, as it boasts just 4.2B parameters. So it would be a great f…
-
I'm not quite familiar with the Transformer model. There are more steps involved than in other models, with both an encoder and a decoder. For example, the last encoder block's output needs to serve as the input for the nex…
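A minimal sketch (plain NumPy, hypothetical shapes and a simplified single-head attention) of the data flow being described: the final encoder block's output is reused by the decoder as keys and values for cross-attention.

```python
import numpy as np

def attention(q, k, v):
    # Scaled dot-product attention with a numerically stable softmax.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

rng = np.random.default_rng(0)
d = 8
enc_out = rng.standard_normal((5, d))  # final encoder block output (5 source tokens)
dec_in = rng.standard_normal((3, d))   # decoder hidden states (3 target tokens)

# In each decoder block, cross-attention takes the decoder states as queries
# and the same final encoder output as both keys and values:
cross = attention(dec_in, enc_out, enc_out)
print(cross.shape)  # (3, 8)
```

Real implementations add learned projection matrices, multiple heads, masking, and residual connections, but the routing of the encoder output into every decoder block is the same.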
-
Hello mlcommons team,
I want to run the "Automated command to run the benchmark via MLCommons CM" (from the example: https://github.com/mlcommons/inference/tree/master/language/llama2-70b), but I d…