-
### Question
Does this model support the Chinese language?
-
### Describe the issue
I am trying to run the [phi-3-v-128k-instruct-vision.onnx](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct-onnx-directml/blob/main/directml-int4-rtn-block-32/phi-…
-
How about building Extensions for GenAI Stack?
-
Comparing 0.3.0rc2 with the current release 0.3.0, there seems to be a 7-10% drop in token generation performance when using the benchmark script (benchmark.py).
Profiling indic…
-
The RPM value set in the command is not honoured, and we get inconsistent RPM from the benchmarking tool when using the --rate parameter. E.g., if we set --rate 10, we can get an RPM of over 20.
python -m ben…
-
Awesome concept!
However, since many questions are based on mnemonic information, I would like to be able **to declare** and participate in the challenge with access to Google and my books, i.e., non-…
-
**Output of 'strings libarm_compute.so | grep arm_compute_version':**
arm_compute_version=v23.11 Build options: {'Werror': '0', 'debug': '0', 'neon': '1', 'opencl': '0', 'embed_kernels': '0', 'os…
-
**Describe the package you'd like added**
`llama.cpp` has become a popular inference server for LLMs. Additionally, `llama-cpp-python` is commonly used to connect from Python to `llama.cpp`.
- `l…
-
### Describe the issue
In my project, I am using the autogen framework.
By the design of the project, I must use tools. Unfortunately, they are not being called when used with gemini model…
-