Fermain / mollify

Update Molly backend with correct endpoint #123

Closed Fermain closed 10 months ago

Fermain commented 12 months ago

The current Molly backend implementation was written very quickly as an MVP and does not tolerate long contexts. This breaks it for long lessons.

Solution:

Stream the completion over a `fetch` POST request and a `ReadableStream` instead of EventSource, so the full lesson context can travel in the request body. This sounds hard, but I have working code to reference.

```typescript
// src/lib/stores/readableStream

import { writable } from "svelte/store";

export function readableStreamStore() {
    const { subscribe, set, update } = writable({ loading: false, text: "" });

    async function request(request: Request) {
        set({ loading: true, text: "" });

        try {
            const response = await fetch(request);

            if (!response.ok) throw new Error(response.statusText);
            if (!response.body) return;

            // Decode the byte stream to text and read it token by token
            const reader = response.body.pipeThrough(new TextDecoderStream()).getReader();

            let result = "";
            while (true) {
                const { value: token, done } = await reader.read();

                if (token !== undefined) {
                    update((val) => {
                        result = val.text + token;
                        return { loading: true, text: result };
                    });
                }
                if (done) break;
            }

            // Reset the store; the accumulated text is returned to the caller
            set({ loading: false, text: "" });

            return result;
        } catch (err: any) {
            set({ loading: false, text: err.toString() });
            throw err;
        }
    }

    return { subscribe, request };
}
```
```typescript
// api/chat/+server.ts

import { OPENAI_API_KEY } from "$env/static/private";

import { CallbackManager } from "langchain/callbacks";
import { ChatOpenAI } from "langchain/chat_models/openai";
import { HumanChatMessage, SystemChatMessage } from "langchain/schema";
import type { RequestHandler } from '@sveltejs/kit';

export const POST: RequestHandler = async ({ request }) => {
    const body = await request.json();

    // TODO: Request Validation

    // Stream each token from the model into a ReadableStream as it arrives
    const readableStream = new ReadableStream({
        async start(controller) {
            const chat = new ChatOpenAI({
                openAIApiKey: OPENAI_API_KEY,
                modelName: "gpt-4",
                streaming: true,
                callbackManager: CallbackManager.fromHandlers({
                    handleLLMNewToken: async (token: string) => controller.enqueue(token),
                }),
            });

            await chat.call([
                new SystemChatMessage("System prompt"),
                // ...history,
                new HumanChatMessage(body.message)
            ]);

            controller.close();
        },
    });

    // Return the stream as a plain-text response the client can read incrementally
    return new Response(readableStream, {
        headers: { 'Content-Type': 'text/plain' },
    });
}
```
```typescript
// Component script: submits the form and consumes the streaming store
import { readableStreamStore } from '$lib/stores/readableStream';

const response = readableStreamStore();
let sent = false;
let reply = '';

async function onSubmit(event: SubmitEvent) {
    const form = event.target as HTMLFormElement;
    const formData = new FormData(form);
    const body = Object.fromEntries(formData.entries());

    try {
        const promiseReply = response.request(
            new Request('/api/chat', {
                method: 'POST',
                headers: { 'Content-Type': 'application/json' },
                // Serialize the form data; the endpoint reads it with request.json()
                body: JSON.stringify(body)
            })
        );

        sent = true;

        reply = (await promiseReply) || '';
    } catch (err) {
        alert(err);
    }
}
```
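For completeness, the markup side might consume the store along these lines. This is hypothetical: only `response`, `sent`, `reply`, and `onSubmit` come from the snippet above, and the form needs a `message` field to match `body.message` on the server.

```svelte
<form on:submit|preventDefault={onSubmit}>
    <textarea name="message"></textarea>
    <button type="submit">Send</button>
</form>

{#if $response.loading}
    <!-- Tokens accumulate in the store while the stream is open -->
    <p>{$response.text}</p>
{:else if sent}
    <!-- After the stream closes, the full reply is in `reply` -->
    <p>{reply}</p>
{/if}
```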
StianSto commented 11 months ago

Hey @Fermain or @ShaindalDev, you can add me as one of the assignees on this issue :) Also, will I need an API key to use Molly? It is probably useful to have for testing.

StianSto commented 11 months ago

I'm somewhat new to readable streams, so correct me if I'm missing the target here. Do we need a readableStream store with the current setup? Right now streaming is enabled and the front end is using EventSource to receive data from the server.

openai/index.js:94, molly.svelte:32-56

Fermain commented 11 months ago

EventSource is limited to GET requests, and GET request URLs have a maximum length. Using this approach with the readableStream store allows us to use POST requests with a longer body, i.e. more lesson context.
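A minimal sketch of the size difference. The 20,000-character lesson is a stand-in, and the 8 KB figure is a common server default for the request line plus headers rather than a universal limit; it matches the 431 error reported later in this thread.

```typescript
// Sketch: why EventSource (GET-only) breaks for long lessons.
// With EventSource, the whole lesson context must travel in the URL.
const lessonContext = "x".repeat(20_000); // stand-in for a long lesson

// EventSource approach: context rides in the query string.
const getUrl = `/api/chat?message=${encodeURIComponent(lessonContext)}`;

// fetch + ReadableStream approach: context rides in the POST body,
// which has no comparable size limit.
const postInit = {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message: lessonContext }),
};

console.log(getUrl.length > 8 * 1024); // true: the GET URL alone blows a typical 8 KB header budget
```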

StianSto commented 11 months ago

The backend is implemented and working, but I still need to test larger documents. How many tokens did it break at before? I also need to add chat history; it shouldn't be too hard, but if anyone has ideas I'm all ears.

Fermain commented 11 months ago

For chat history, something like a localStorage store would be the most Svelte-y way to do it.

Could use a package or construct a store file yourself: https://www.npmjs.com/package/svelte-local-storage-store
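If you construct the store file yourself, a hand-rolled version of the Svelte store contract might look like this. This is a sketch, not the package above: `persistedStore` and the in-memory fallback are illustrative names, and it assumes the history is JSON-serializable.

```typescript
// Minimal localStorage-backed store implementing the Svelte store
// contract (subscribe/set). Outside the browser it falls back to a
// shared in-memory map so the code stays testable.
type Subscriber<T> = (value: T) => void;

const memory = new Map<string, string>();

export function persistedStore<T>(key: string, initial: T) {
    const storage = typeof localStorage !== "undefined"
        ? localStorage
        : {
              getItem: (k: string) => memory.get(k) ?? null,
              setItem: (k: string, v: string) => void memory.set(k, v),
          };

    const raw = storage.getItem(key);
    let value: T = raw !== null ? JSON.parse(raw) : initial;
    const subscribers = new Set<Subscriber<T>>();

    return {
        subscribe(fn: Subscriber<T>) {
            fn(value); // Svelte stores notify immediately on subscribe
            subscribers.add(fn);
            return () => subscribers.delete(fn);
        },
        set(next: T) {
            value = next;
            storage.setItem(key, JSON.stringify(next)); // persist on every write
            subscribers.forEach((fn) => fn(value));
        },
    };
}
```

Because anything with a `subscribe` method is a store in Svelte, a component can read this with the usual `$history` auto-subscription.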

I don't have exact figures for when it used to break, but I have a test lesson that fails. You can see it live here: https://mollify.noroff.dev/content/feu1/design/module-1/visual-hierarchy

The console reports: `Failed to load resource: the server responded with a status of 431 ()` (431 is "Request Header Fields Too Large", which fits the GET URL length limit).

StianSto commented 11 months ago

I see, but I fear that it ends up being a lot of tokens very quickly after a session of conversation, costing a lot extra. I'm going to research LangChain's memory options first, but if I come up empty I'll add a store for now :)
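If LangChain's memory options come up empty, the store approach could still bound the token cost by only sending the most recent turns with each request. A sketch; `trimHistory` and the message shape are illustrative, not from the codebase.

```typescript
// Illustrative message shape; the real app may differ.
interface ChatMessage {
    role: "system" | "user" | "assistant";
    content: string;
}

// Keep the system prompt plus only the last `maxMessages` turns,
// bounding the tokens (and cost) sent with every request.
export function trimHistory(history: ChatMessage[], maxMessages = 10): ChatMessage[] {
    const system = history.filter((m) => m.role === "system");
    const rest = history.filter((m) => m.role !== "system");
    return [...system, ...rest.slice(-maxMessages)];
}
```

A fixed message count is a crude proxy for token count, but it keeps the dependency surface small until a proper memory abstraction is chosen.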