-
### Description
I’m experiencing issues when trying to connect CrewAI with Azure OpenAI. After some investigation, I found that only version 0.11 works with Azure OpenAI, but unfortunately, this v…
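For anyone hitting this, a minimal sketch of how Azure OpenAI is typically wired into CrewAI (0.11-era) through LangChain's `AzureChatOpenAI`; the API version, deployment name, and agent fields below are placeholders:
```
import os
from langchain_openai import AzureChatOpenAI
from crewai import Agent

# Point the LLM at an Azure OpenAI deployment (placeholder values).
llm = AzureChatOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-15-preview",      # placeholder API version
    azure_deployment="gpt-4o-deployment",  # placeholder deployment name
)

# Pass the LLM explicitly so the agent does not fall back to the default OpenAI client.
researcher = Agent(
    role="Researcher",
    goal="Answer questions",
    backstory="Test agent for the Azure OpenAI connection.",
    llm=llm,
)
```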
-
### Description
The Cohere rerank implementation allows configuring fields that probably don't apply. The implementation leverages the common settings here: https://github.com/elastic/elasticsearch/b…
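For context, a rerank endpoint for this service is created roughly like this (a sketch using `requests` against a local cluster; host, credentials, and model name are placeholders, and the exact set of accepted `service_settings` fields is the point under discussion):
```
import requests

# Create a Cohere rerank inference endpoint (placeholder host and credentials).
resp = requests.put(
    "http://localhost:9200/_inference/rerank/cohere-rerank-demo",
    json={
        "service": "cohere",
        "service_settings": {
            "api_key": "<cohere-api-key>",
            "model_id": "rerank-english-v3.0",
        },
    },
    auth=("elastic", "<password>"),
    timeout=30,
)
print(resp.status_code, resp.json())
```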
-
Hi,
When I run the exo command on macOS and start inference using the completion REST API endpoint, the Python process seems to use more and more memory.
I have put a delay…
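A sketch of the kind of loop that shows the growth: repeatedly call the (ChatGPT-compatible) completion endpoint with a delay between requests and log the resident memory of the exo Python processes. The port, model id, and process matching below are assumptions/placeholders:
```
import time
import psutil
import requests

EXO_URL = "http://localhost:52415/v1/chat/completions"  # placeholder port

def exo_rss_mib():
    # Sum resident memory over Python processes whose command line mentions "exo".
    total = 0
    for p in psutil.process_iter(["name", "cmdline", "memory_info"]):
        cmdline = " ".join(p.info["cmdline"] or [])
        if "exo" in cmdline and "python" in (p.info["name"] or "").lower():
            total += p.info["memory_info"].rss
    return total / (1024 * 1024)

for i in range(100):
    requests.post(
        EXO_URL,
        json={
            "model": "llama-3.2-3b",  # placeholder model id
            "messages": [{"role": "user", "content": f"Request {i}: say hi"}],
        },
        timeout=120,
    )
    print(f"after request {i}: {exo_rss_mib():.1f} MiB")
    time.sleep(5)  # delay between requests, as described above
```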
-
Test out various APIs, figure out which one is accurate for English with minimal latency, and integrate it.
A separate issue will be created for in-house English ASR models hosted with in…
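A rough harness for that comparison could look like the sketch below: measure wall-clock latency per request and word error rate against references, with `transcribe_with_api` standing in for whichever provider SDK or endpoint is being evaluated (all names and clips are placeholders):
```
import time
from statistics import mean

import jiwer  # word error rate

def transcribe_with_api(api_name: str, audio_path: str) -> str:
    # Placeholder adapter: call the provider's SDK or REST endpoint here
    # and return the transcript. Returns an empty string until wired up.
    return ""

# (audio clip, reference transcript) pairs -- placeholders.
clips = [("sample1.wav", "hello world"), ("sample2.wav", "good morning")]

for api in ["provider_a", "provider_b"]:  # placeholder API names
    latencies, wers = [], []
    for path, reference in clips:
        start = time.perf_counter()
        hypothesis = transcribe_with_api(api, path)
        latencies.append(time.perf_counter() - start)
        wers.append(jiwer.wer(reference, hypothesis))
    print(f"{api}: mean latency {mean(latencies):.2f}s, mean WER {mean(wers):.3f}")
```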
-
### Elasticsearch Version
serverless
### Installed Plugins
_No response_
### Java Version
_bundled_
### OS Version
N/A
### Problem Description
When trying to create an inference endpoint usin…
-
So far I've ported the following models to Java:
Llama 3 & 3.1, Mistral/Codestral/Mathstral/Nemostral (+ Tekken tokenizer), Qwen2, Phi3 and Gemma 1 & 2 ...
All models are bundled as a single ~2K li…
-
Hi guys,
If I simply install the library with `pip install timesfm` and try the example code described at https://huggingface.co/google/timesfm-1.0-200m:
```
import timesfm
tfm = timesfm.TimesFm…
```
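For context, the model-card example continues roughly as below (reconstructed from memory of the timesfm-1.0 API, so constructor arguments and the forecast signature may differ from the actual card):
```
import numpy as np
import timesfm

# Reconstruction of the model-card example (parameter values may differ).
tfm = timesfm.TimesFm(
    context_len=128,
    horizon_len=96,
    input_patch_len=32,
    output_patch_len=128,
    num_layers=20,
    model_dims=1280,
    backend="cpu",
)
tfm.load_from_checkpoint(repo_id="google/timesfm-1.0-200m")

forecast_input = [np.sin(np.linspace(0, 20, 128))]  # one toy series
point_forecast, quantile_forecast = tfm.forecast(forecast_input, freq=[0])
print(point_forecast.shape)
```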
-
### Describe the issue
According to [TensorRT EP docs](https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html) one should do symbolic shape inference before executing the mod…
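For reference, that offline shape inference step can be run either via the bundled symbolic_shape_infer.py script mentioned in the docs or from Python; a minimal sketch (paths are placeholders):
```
import onnx
from onnxruntime.tools.symbolic_shape_infer import SymbolicShapeInference

# Run symbolic shape inference before handing the model to the TensorRT EP.
model = onnx.load("model.onnx")
inferred = SymbolicShapeInference.infer_shapes(model, auto_merge=True)
onnx.save(inferred, "model.shape_inferred.onnx")
```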
-
**Describe the bug**
I set up a proxy in the container where Flowise is deployed, but Flowise still times out when accessing the Hugging Face Inference API. How can this be solved?
**To Reproduce**
…
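One quick check (run from the container's network, or from the host with the same proxy settings) is whether the proxy reaches Hugging Face at all; the env var names below are the conventional HTTP(S)_PROXY ones, and whether Flowise itself picks them up is a separate question:
```
import os
import requests

# Verify the proxy can reach Hugging Face (the Hub API is used here only as a reachability probe).
proxies = {
    "http": os.environ.get("HTTP_PROXY", ""),
    "https": os.environ.get("HTTPS_PROXY", ""),
}
resp = requests.get(
    "https://huggingface.co/api/models?limit=1",
    proxies=proxies,
    timeout=30,
)
print(resp.status_code)
```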
-
Running the Jupyter notebook for the LLaVA model:
https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llava-multimodal-chatbot/llava-multimodal-chatbot-genai.ipynb
- Device: Arc 770 d…