explainers-by-googlers / prompt-api

A proposal for a web API for prompting browser-provided language models

Feature request: Stop sequence support #44

Open jacoblee93 opened 1 month ago

jacoblee93 commented 1 month ago

It would be nice in certain situations to be able to pass a list of stop sequences as input:

For example, see: https://platform.openai.com/docs/api-reference/chat/create#chat-create-stop

When the model would otherwise generate one of these sequences, it would instead stop and return the generation so far to the user. This makes advanced prompting techniques easier, such as LangChain's original text-based ReAct agent loop, where we want the model to stop before generating an Observation: line so that it can instead be populated by an external tool call.
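For illustration, here is a rough sketch of what such an option could look like on the current API surface. The stopSequences option name is hypothetical; it is what this issue is requesting, not something the API supports today:

// Hypothetical: the stopSequences option below does not exist yet.
const assistant = await self.ai.assistant.create();
const result = await assistant.prompt(
  'Thought: I should look up the weather.\nAction: search("weather")\n',
  { stopSequences: ['Observation:'] }
);
// Generation would halt before an "Observation:" line is produced, so the
// caller can run the tool and supply the real observation in a follow-up prompt.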

We can work around this by streaming a generation and cancelling the request when we detect a stop sequence in the accumulated output, but this is less convenient.

tomayac commented 1 month ago

For people discovering this issue via search, the client-side workaround is as follows:

const abortController = new AbortController();
const signal = abortController.signal;

const STOP_SEQUENCES = ['world'];

const assistant = await self.ai.assistant.create();
const stream = await assistant.promptStreaming('Say "Hello, world!"', {
  signal,
});

// Each chunk contains the full response so far, so track the previous length
// to extract only the newly generated content.
let previousLength = 0;
streamingLoop: for await (const chunk of stream) {
  const newContent = chunk.slice(previousLength);
  console.log(`Chunk: "${newContent}"`);
  for (const stopSequence of STOP_SEQUENCES) {
    if (newContent.toLowerCase().includes(stopSequence.toLowerCase())) {
      console.log(
        `Stop sequence "${stopSequence}" found in chunk "${newContent}". Aborting.`
      );
      // Cancel the in-flight generation and exit the streaming loop.
      abortController.abort();
      break streamingLoop;
    }
  }
  document.body.insertAdjacentText('beforeend', newContent);
  previousLength = chunk.length;
}

You can see this pattern in action in this demo. I have also added this tip to the documentation.
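Until native support lands, the pattern above can also be wrapped into a small helper. This is just a sketch, assuming the same assistant API shape used above (cumulative chunks, cancellation via AbortController), and it additionally trims the output at the stop sequence:

async function promptWithStopSequences(assistant, prompt, stopSequences) {
  const abortController = new AbortController();
  const stream = await assistant.promptStreaming(prompt, {
    signal: abortController.signal,
  });
  let output = '';
  for await (const chunk of stream) {
    // Chunks are cumulative, so the latest chunk is the full output so far.
    output = chunk;
    for (const stopSequence of stopSequences) {
      const index = output.indexOf(stopSequence);
      if (index !== -1) {
        // Stop generating and return everything before the stop sequence.
        abortController.abort();
        return output.slice(0, index);
      }
    }
  }
  return output;
}

// Usage:
// const assistant = await self.ai.assistant.create();
// const text = await promptWithStopSequences(assistant, 'Count to ten.', ['five']);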