jacoblee93 opened this issue 1 month ago
For people discovering this issue via search, the client-side workaround is as follows:
```js
const abortController = new AbortController();
const signal = abortController.signal;
const STOP_SEQUENCES = ['world'];

const assistant = await self.ai.assistant.create();
const stream = await assistant.promptStreaming('Say "Hello, world!"', {
  signal,
});

// Each chunk contains the full accumulated output so far, so track how much
// has already been processed and only inspect the newly generated text.
let previousLength = 0;
streamingLoop: for await (const chunk of stream) {
  const newContent = chunk.slice(previousLength);
  console.log(`Chunk: "${newContent}"`);
  // Check the new text against every stop sequence (case-insensitively) and
  // abort the underlying request as soon as one is found.
  for (const stopSequence of STOP_SEQUENCES) {
    if (newContent.toLowerCase().includes(stopSequence.toLowerCase())) {
      console.log(
        `Stop sequence "${stopSequence}" found in chunk "${newContent}". Aborting.`
      );
      abortController.abort();
      break streamingLoop;
    }
  }
  document.body.insertAdjacentText('beforeEnd', newContent);
  previousLength = chunk.length;
}
```
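Note that this per-chunk check can miss a stop sequence split across two chunks, and any partially rendered match stays on the page. A stricter variant (a sketch, assuming the same behavior as above, where each chunk carries the full accumulated output) scans the whole output and holds back a short tail so a half-arrived stop sequence is never rendered:

```js
const controller = new AbortController();
const STOP_SEQUENCES = ['Observation:'];
// Longest possible partial match that could still be completed by a later
// chunk; never render that many trailing characters mid-stream.
const maxHold = Math.max(...STOP_SEQUENCES.map((s) => s.length)) - 1;

const assistant = await self.ai.assistant.create();
const stream = await assistant.promptStreaming('Say "Hello, world!"', {
  signal: controller.signal,
});

let rendered = 0;
for await (const chunk of stream) {
  let text = chunk; // Full accumulated output so far.
  // Earliest occurrence of any stop sequence in the accumulated output.
  const indices = STOP_SEQUENCES.map((s) => text.indexOf(s)).filter(
    (i) => i !== -1
  );
  const hit = indices.length > 0 ? Math.min(...indices) : -1;
  if (hit !== -1) {
    text = text.slice(0, hit); // Trim the stop sequence and everything after.
    controller.abort();
  }
  // Render only the new, safe-to-show portion.
  const safeEnd =
    hit !== -1 ? text.length : Math.max(rendered, text.length - maxHold);
  document.body.insertAdjacentText('beforeend', text.slice(rendered, safeEnd));
  rendered = safeEnd;
  if (hit !== -1) break;
}
```

The held-back tail makes the rendered text lag by at most the length of the longest stop sequence, in exchange for never showing a match that later has to be removed.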
You can see this pattern in action in this demo. I have also added this tip to the documentation.
It would be nice in certain situations to be able to pass a list of stop sequences as input:
For example, see: https://platform.openai.com/docs/api-reference/chat/create#chat-create-stop
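For reference, OpenAI's chat completions API lets the caller pass up to four stop sequences, and the server halts generation before emitting any of them. A minimal example using the official `openai` Node package (the model name here is just an illustration):

```js
import OpenAI from 'openai';

const client = new OpenAI(); // Reads OPENAI_API_KEY from the environment.

const completion = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Say "Hello, world!"' }],
  // Generation stops server-side before any of these strings are emitted.
  stop: ['world'],
});

console.log(completion.choices[0].message.content); // e.g. 'Hello, '
```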
When the model would otherwise generate one of these sequences, it would instead stop and return the generation so far to the user. This makes more advanced prompting techniques easier, such as LangChain's original text-based ReAct agent loop, where we want the model to stop before generating an

`Observation:`

line and instead have that value populated by an external tool call. We can get around this by streaming a generation and cancelling the request when we detect a stop sequence in the accumulated output, but this is a bit less nice.
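A hypothetical shape for this in the Prompt API might look like the following (the `stopSequences` option below is an illustration of the request, not an existing parameter):

```js
// Hypothetical: `stopSequences` is the requested feature, not part of the
// current Prompt API surface.
const assistant = await self.ai.assistant.create();
const result = await assistant.prompt('Answer, then write "Observation:"', {
  stopSequences: ['Observation:'],
});
// `result` would contain everything generated before the first stop
// sequence, with the sequence itself omitted.
```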