-
Is there a specific version of the `openai` package that is aligned with the OpenAI-compatible interfaces offered by NeuralChat? I am currently testing with the current **1.12.0** but encountering a **422 Unprocessable Entit…
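A 422 from an OpenAI-compatible endpoint usually means the request body failed the server's schema validation (for example, a field added or renamed in a newer `openai` client). As a sanity check, one can bypass the client and post a minimal payload with only the stdlib; the base URL and model name below are placeholders, not confirmed NeuralChat values:

```python
import json
from urllib.request import Request, urlopen

# Minimal chat-completions payload; if this succeeds where the openai 1.12.0
# client gets a 422, the mismatch is in extra fields the client sends.
payload = {
    "model": "Intel/neural-chat-7b-v3-1",  # placeholder model name
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 32,
}

def post_chat(base_url, body):
    """POST the payload directly (server URL is an assumption)."""
    req = Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        return json.loads(resp.read())

# post_chat("http://localhost:8000", payload)  # uncomment against a live server
```

Comparing this hand-built body against what the client actually sends (e.g. via a logging proxy) usually pinpoints the rejected field.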
-
```python
from transformers import AutoTokenizer, TextStreamer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM, WeightOnlyQuantConfig
model_name = "Intel/neural-chat-…
```
-
Hi all,
I'm attempting to follow the SmoothQuant tutorial for the LLAMA2-7b model: [https://github.com/intel/neural-compressor/tree/master/examples/onnxrt/nlp/huggingface_model/text_generation/llam…
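For context, the core idea the tutorial applies is SmoothQuant's scale migration: per-channel activation outliers are shifted into the weights via a smoothing factor s_j = max|X_j|**alpha / max|W_j|**(1 - alpha), so that (X / s) · diag(s) · W is mathematically unchanged but easier to quantize. A toy pure-Python sketch (the real tutorial uses neural-compressor; the numbers below are hypothetical):

```python
def smooth(x_absmax, w_absmax, alpha=0.5):
    """Per-channel SmoothQuant smoothing factors from activation maxima
    x_absmax and weight maxima w_absmax (lists of equal length)."""
    return [(xa ** alpha) / (wa ** (1 - alpha))
            for xa, wa in zip(x_absmax, w_absmax)]

x_absmax = [8.0, 0.5, 2.0]   # hypothetical per-channel activation outliers
w_absmax = [0.5, 2.0, 2.0]   # hypothetical per-channel weight maxima
s = smooth(x_absmax, w_absmax)

# Migrate the difficulty: divide activations by s, multiply weights by s.
scaled_x = [xa / sj for xa, sj in zip(x_absmax, s)]
scaled_w = [wa * sj for wa, sj in zip(w_absmax, s)]
# The per-channel product x*w is preserved, but the activation range shrinks.
```

With alpha=0.5 the outlier channel's activation max drops from 8.0 to 2.0 while its weight max rises correspondingly, which is what makes 8-bit activation quantization feasible.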
-
We have seen a significant performance drop with the environment created from the latest repo for vLLM serving of the neural-chat model, compared to the old environment built from the previous repo. With …
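To make such a regression concrete, a minimal throughput harness helps compare the two environments on identical inputs; `generate` below is a hypothetical stand-in for the actual vLLM / neural-chat call, and the token count per reply is assumed fixed for simplicity:

```python
import time

def tokens_per_second(generate, prompts, tokens_per_reply):
    """Time N requests against a backend and report tokens/second,
    so old-env vs new-env throughput can be compared numerically."""
    start = time.perf_counter()
    total_tokens = 0
    for p in prompts:
        generate(p)                      # stand-in for the serving call
        total_tokens += tokens_per_reply
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed

# Example with a dummy backend that just sleeps briefly:
rate = tokens_per_second(lambda p: time.sleep(0.001), ["hi"] * 10,
                         tokens_per_reply=32)
```

Running the same harness (same prompts, same sampling settings) in both environments turns "significant difference" into a reproducible number.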
-
Congrats on Flash Attention in the latest version, or more precisely, on having your storage limit increased on PyPI.org so you could upload the release that was ready weeks ago. Here are some benchmarks fo…
-
Hello everyone, we are seeing slower-than-expected inference times on one of our CPU nodes with an Intel(R) Xeon(R) Platinum 8362 CPU @ 2.80GHz and the following instruction sets:
```
fpu vme de pse tsc…
```
-
### Priority
P3-Medium
### OS type
Ubuntu
### Hardware type
AI-PC (Please let us know in description)
### Running nodes
Single Node
### Description
As AI PC or OPEA developer, I want to deplo…
-
### Describe the bug
When attempting to run "interpreter --local" and choosing jan.ai as the LLM provider, the model-choice function crashes the interpreter.
LM Studio runs as expected. (I'm assumi…
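Before blaming the model-choice code, it can help to confirm the local server actually answers. The sketch below assumes Jan exposes an OpenAI-compatible API at `http://localhost:1337/v1` (its commonly documented default); adjust the port if your install differs. It only builds the request; uncomment the last line to probe a running server:

```python
from urllib.request import Request, urlopen

def list_models_request(base_url="http://localhost:1337/v1"):
    """Build (but do not send) a GET /models request for the local server,
    the same endpoint an OpenAI-compatible client queries for model choices."""
    return Request(f"{base_url}/models",
                   headers={"Accept": "application/json"})

req = list_models_request()
# urlopen(req)  # uncomment to actually probe the running Jan server
```

If this request fails or returns an empty model list, the crash likely originates in the server's response rather than in the interpreter itself.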