-
## Introduction
Chrome's local AI API with Gemini Nano support opens the door to doing Retrieval-Augmented Generation completely client-side in the browser. However, to achieve that goal completely…
-
### I tried this:
We have roughly:
- 30 domains
- 1,000 services
- 10,000 events
to be generated and linked (events are added to services, services are added to domains).
EventCatalog version 2.8.12
SDK ver…
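At that scale, generating the catalog by hand is impractical, so the files would need to be produced programmatically. EventCatalog stores each resource as a markdown file with YAML frontmatter; below is a minimal Python sketch of that layout (the real tooling is the TypeScript SDK — the directory layout and frontmatter fields here are simplified assumptions for illustration):

```python
import pathlib

def write_resource(root, kind, resource_id, name, version="0.0.1"):
    """Write a minimal EventCatalog-style markdown file with frontmatter.

    The layout (<kind>/<id>/index.md) and the frontmatter fields mirror
    EventCatalog's conventions, but are simplified assumptions here.
    """
    path = pathlib.Path(root) / kind / resource_id
    path.mkdir(parents=True, exist_ok=True)
    frontmatter = (
        "---\n"
        f"id: {resource_id}\n"
        f"name: {name}\n"
        f"version: {version}\n"
        "---\n"
    )
    (path / "index.md").write_text(frontmatter + f"\n# {name}\n")

# Hypothetical resource names, just to show the bulk-generation loop.
root = "catalog"
write_resource(root, "domains", "orders", "Orders Domain")
write_resource(root, "services", "order-service", "Order Service")
write_resource(root, "events", "order-placed", "OrderPlaced")
```

Linking (adding events to services, services to domains) would then be a matter of emitting the appropriate frontmatter references in the same loop.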
-
It would be nice to be able to get the Mermaid diagram string for any SPARQL query from client-side apps (e.g. the sparql-editor), or to have a VS Code extension for that, so that we can easily visualize…
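As a sketch of what such a diagram string could look like, here is a small Python function that renders a set of SPARQL triple patterns as a Mermaid flowchart (the pattern-extraction step is assumed to happen elsewhere, and the node-id sanitisation rules are illustrative, not a spec):

```python
def triples_to_mermaid(triples):
    """Render SPARQL triple patterns as a Mermaid flowchart string.

    Each (subject, predicate, object) pattern becomes an edge labelled
    with the predicate. Node ids are sanitised because Mermaid ids
    cannot contain '?' or ':'; the original term is kept as the label.
    """
    def node_id(term):
        return term.replace("?", "var_").replace(":", "_")

    lines = ["graph TD"]
    for s, p, o in triples:
        lines.append(f'    {node_id(s)}["{s}"] -->|{p}| {node_id(o)}["{o}"]')
    return "\n".join(lines)

# Hypothetical patterns from a query over the UniProt vocabulary.
patterns = [
    ("?protein", "a", "up:Protein"),
    ("?protein", "up:organism", "?organism"),
]
print(triples_to_mermaid(patterns))
```

The same function could back both a sparql-editor button and a VS Code extension, since it only needs the query's triple patterns as input.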
-
Repair/streaming writes one sstable for each vnode × shard. This can lead to rapid accumulation of small sstable files. To mitigate this, we have an off-strategy compaction at the end of repair/stream…
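To see how quickly this multiplies, a back-of-the-envelope count (the vnode and shard numbers below are illustrative assumptions, not taken from any particular cluster):

```python
# One sstable per (vnode, shard) pair; both counts are assumptions.
vnodes_per_node = 256   # a common num_tokens setting
shards_per_node = 8     # e.g. one shard per core on an 8-core node

sstables_per_repair = vnodes_per_node * shards_per_node
print(sstables_per_repair)  # → 2048 small sstables from a single repair
```

With numbers like these, a single repair can leave thousands of small files behind, which is what the off-strategy compaction cleans up.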
-
### Clear and concise description of the problem
## What is this about
We are currently trying to normalize all language client libraries for the Azure OpenAI services to use the same TypeSpec definitio…
-
### Your current environment
I am sending requests at the same time, such as with benchmark_serving, to two services:
Number of requests: 400
### 🐛 Describe the bug
I am sending requests at the same ti…
-
### Your current environment
I am testing the completion and chat completion APIs; both return empty output `''` from the model, and on the service side it prints zeros for the number of gen…
-
We have some HelmRelease objects which are stuck in the state "reconciliation in progress".
The Helm release itself looks fine; we just need to reset or refresh its state... deleting this object wo…
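One way to reset that state without deleting the object is to suspend and resume the release, or to force a reconciliation, via the flux CLI (a sketch; the release name and namespace are placeholders):

```shell
# Suspend and resume to clear the stuck "reconciliation in progress" state.
# "my-release" and "my-namespace" are placeholders for your objects.
flux suspend helmrelease my-release -n my-namespace
flux resume helmrelease my-release -n my-namespace

# Alternatively, request an immediate reconciliation:
flux reconcile helmrelease my-release -n my-namespace
```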
-
I rewrote parts of the connector to use some open-source LLM hosting services. The inference speed is often low, and the time required for generating responses plus TTS exceeds 30 seconds.
When …
-
I don't know why, but I'm encountering this problem with the library. Here is my simple script:
```python
import ollama
client = ollama.Client(host=llm_config["base_url"], timeout=600)
clie…