ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Server completion endpoint receive embeddings #3811

Closed — Johnz86 closed this issue 6 months ago

Johnz86 commented 12 months ago

Feature Description

I propose an enhancement that would allow the llama.cpp server to accept document embeddings alongside prompts, providing more contextual information when generating responses.

Motivation

This feature would allow clients to POST embeddings along with the prompt content, supplying contextual information for text generation. Enriching the context in this way should lead to more accurate and relevant completions.

Possible Implementation

A new endpoint, or an extension to the existing /completion endpoint, could be created to accept embeddings. The embeddings could be sent as a separate property in the JSON request and used during generation to provide context.
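As a rough sketch, the proposed request body might look like the following. Python is used here only to build and serialize the JSON; the `embeddings` field name, its structure, and the example values are assumptions for illustration, not part of llama.cpp's current server API:

```python
import json

# Hypothetical payload for an extended /completion endpoint.
# NOTE: the "embeddings" property is the proposed addition and does
# not exist in llama.cpp's actual server API; "prompt" and "n_predict"
# follow the existing /completion request format.
payload = {
    "prompt": "Summarize the attached document.",
    "n_predict": 128,
    # One embedding vector per context document. The real dimension is
    # model-dependent (e.g. 4096 for many LLaMA variants); truncated
    # placeholder values are shown here for brevity.
    "embeddings": [
        [0.0123, -0.0456, 0.0789],
    ],
}

body = json.dumps(payload)
print(body)
```

The server could then project or attend over the supplied vectors when building the generation context, though the exact mechanism would be an implementation decision.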

github-actions[bot] commented 6 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.