arshad-yaseen / monacopilot

⚡️AI auto-completion plugin for Monaco Editor, inspired by GitHub Copilot.

Add abort controller option to pass in along with streaming requests #83

Open · leftmove opened 5 days ago

leftmove commented 5 days ago

Right now, there is no way to abort LLM requests while they are streaming. I would like to create a system that allows you to do that.

This would have a couple of benefits: a stale completion could be cancelled as soon as a newer one is requested, and you would stop paying for tokens you no longer need.

If implemented properly, this is what I imagine route.ts in the minimal Next.js example would look like.

import {NextRequest, NextResponse} from 'next/server';

import cache from "memory-cache"; // In-memory cache

import {Copilot, type CompletionRequestBody} from 'monacopilot';

const copilot = new Copilot(process.env.ANTHROPIC_API_KEY!);

export async function POST(req: NextRequest) {

  // Tag this request with a unique ID and record it as the latest completion
  const completionId = crypto.randomUUID();
  cache.put('completion', completionId);

  // Abort once a newer completion request has replaced this one
  const abort = () => cache.get('completion') !== completionId;

  const body: CompletionRequestBody = await req.json();
  const {completion, error} = await copilot.complete({
    body,
    options: {
      abort, // Proposed option: polled while streaming to cancel generation
    },
  });

  if (error) {
    // Handle error if needed
    // ...
    return NextResponse.json({completion: null, error}, {status: 500});
  }

  return NextResponse.json({completion}, {status: 200});
}

Alternatively, you could pass in your own AbortController. There is already some code for this in the source; it's just not possible to pass a signal in yet.

const request = async <
  ResponseType,
  BodyType = undefined,
  MethodType extends Method = Method,
>(
  url: string,
  method: MethodType,
  options: RequestOptions<BodyType, MethodType> = {},
): Promise<ResponseType> => {
  const headers = {
    'Content-Type': 'application/json',
    ...options.headers,
  };

  const body =
    method === 'POST' && options.body
      ? JSON.stringify(options.body)
      : undefined;

  const response = await fetch(url, {
    method: method,
    headers,
    body,
    signal: options.signal, // Abort signal
  });

  if (!response.ok) {
    const data = '\n' + (JSON.stringify(await response.json(), null, 2) || '');
    throw new Error(
      `${response.statusText || options.fallbackError || 'Network error'}${data}`,
    );
  }

  return response.json() as Promise<ResponseType>;
};
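
For reference, here is a rough caller-side sketch of how this could look if a signal were exposed. The `signal` option on `copilot.complete` is hypothetical; it is what this issue proposes, mirroring the `options` shape used in the route.ts example above.

import {Copilot, type CompletionRequestBody} from 'monacopilot';

const copilot = new Copilot(process.env.ANTHROPIC_API_KEY!);

let controller: AbortController | null = null;

async function completeWithAbort(body: CompletionRequestBody) {
  // Cancel the previous in-flight completion before starting a new one
  controller?.abort();
  controller = new AbortController();

  try {
    return await copilot.complete({
      body,
      options: {signal: controller.signal}, // Hypothetical option proposed here
    });
  } catch (err) {
    // fetch rejects with an AbortError when the signal fires mid-request
    if ((err as Error).name === 'AbortError') {
      return {completion: null, error: null};
    }
    throw err;
  }
}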

What makes this a major refactor rather than a fairly trivial change is that you must stream responses in order to cancel them. Currently, all LLM responses are returned from a single, non-streaming API call. To stop LLM generation partway and not incur costs for unused tokens, you have to stream responses.

Therefore, you would have to implement a way to stream each completion and join its chunks, instead of the current approach of making a single request and storing the output. Here's some code to outline what I mean.


// Instead of this

const response = await request('https://api.openai.com/v1/chat/completions');

// To avoid token costs, you would have to do something like this (keep in mind
// this is pseudo-code; the real API streams from the same /chat/completions
// endpoint when `stream: true` is set in the request body)

let response = '';

await fetch('https://api.openai.com/v1/chat/completions/stream')
  .then((res) => {
    if (!res.ok) {
      throw new Error(`HTTP error! status: ${res.status}`);
    }
    return res.body.getReader();
  })
  .then((reader) => {
    const decoder = new TextDecoder('utf-8');

    const readChunk = (): Promise<void> =>
      reader.read().then(({done, value}) => {
        if (done) return;

        // Each chunk carries a partial completion; append it to the response.
        // (Real chunks are server-sent events that need line-by-line parsing,
        // and chat completions put the text in choices[0].delta.content.)
        const chunk = decoder.decode(value);
        const completion = JSON.parse(chunk);
        const content = completion.choices[0].text;

        response += content;

        // Abort condition from all the way above: cancel the stream so no
        // further tokens are generated or billed
        if (abort()) {
          reader.cancel();
          return;
        }

        return readChunk();
      });

    return readChunk();
  });

I want to undertake changing the code to make this feature possible, but since it would require a major refactor, I wanted to ask before I start.

Thanks!

arshad-yaseen commented 4 days ago

I understand your point. I will review the implementation to assess how it would work and whether there are any consequences, given that the onTyping real-time completions rely on the completion cache built from previous requests. I will let you know.
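
For illustration, a minimal client-side sketch of how aborts could coexist with that cache, assuming aborts only cancel in-flight requests and only finished completions are ever written to the cache. `completionCache` and `fetchCompletion` below are placeholders, not monacopilot internals.

// Placeholder for whatever performs the actual completion request
declare function fetchCompletion(prompt: string, signal: AbortSignal): Promise<string>;

const completionCache = new Map<string, string>();
let controller: AbortController | null = null;

async function onTypingCompletion(prompt: string): Promise<string | null> {
  const cached = completionCache.get(prompt);
  if (cached) return cached; // Cache hit: no request is made, nothing to abort

  controller?.abort(); // Drop the now-stale in-flight request, if any
  controller = new AbortController();

  try {
    const completion = await fetchCompletion(prompt, controller.signal);
    completionCache.set(prompt, completion); // Only completed responses are cached
    return completion;
  } catch (err) {
    if ((err as Error).name === 'AbortError') return null; // Superseded, not an error
    throw err;
  }
}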