ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Server completion endpoint receive embeddings #3811

Closed — Johnz86 closed this issue 6 months ago

Johnz86 commented 12 months ago

Feature Description

I propose an enhancement that would allow the llama.cpp server to accept document embeddings alongside prompts, providing more contextual information when generating responses.

Motivation

This feature would allow clients to POST embeddings along with the prompt content, supplying contextual information for text generation. Enriching the context in this way should lead to more accurate and relevant completions.

Possible Implementation

A new endpoint, or an extension to the existing /completion endpoint, could be created to accept embeddings. The embeddings could be sent as a separate property in the JSON request and used during generation to provide context.
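As a rough sketch, the proposed request body might look like the following. Python is used here only to build and serialize the JSON; the `embeddings` field name, its structure, and the example values are assumptions for illustration, not part of llama.cpp's current server API:

```python
import json

# Hypothetical payload for an extended /completion endpoint.
# NOTE: the "embeddings" property is the proposed addition and does
# not exist in llama.cpp's actual server API; "prompt" and "n_predict"
# follow the existing /completion request format.
payload = {
    "prompt": "Summarize the attached document.",
    "n_predict": 128,
    # One embedding vector per context document. The real dimension is
    # model-dependent (e.g. 4096 for many LLaMA variants); truncated
    # placeholder values are shown here for brevity.
    "embeddings": [
        [0.0123, -0.0456, 0.0789],
    ],
}

body = json.dumps(payload)
print(body)
```

The server could then project or attend over the supplied vectors when building the generation context, though the exact mechanism would be an implementation decision.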

github-actions[bot] commented 6 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.