OvidijusParsiunas / deep-chat

Fully customizable AI chatbot component for your website
https://deepchat.dev
MIT License
1.26k stars, 170 forks

Request: Image Data (b64) in requestInterceptor #135

Closed jackitaliano closed 3 months ago

jackitaliano commented 4 months ago

Scenario

My thought for a solution

  1. Intercept request with requestInterceptor where the base64 image data is available
  2. If image is contained in request, skip function calling and directly send image data to vision model
  3. Return requestDetails from requestInterceptor with a text description of the image

Possible alternative

My request

Reason

Use Cases

Example Pseudocode

<deep-chat
  requestInterceptor = async (requestDetails) => {

    if (requestDetails?.files && requestDetails.files.length > 0) {  // Check files are present
      // map() with an async callback returns promises, so wait for all of them to resolve
      const imageDescriptions = await Promise.all(
        requestDetails.files
          .filter((file) => file.src) // Check file is an image ( could be done differently, this is how I've checked in `onNewMessage` )
          .map((file) => getImageDescription(file.src)) // My request to vision model
      );

      const stringDescriptions = JSON.stringify(imageDescriptions);

      // Update content if exists, add content if it doesn't
      if (requestDetails?.content) {
        requestDetails.content += " Image descriptions: " + stringDescriptions;
      } else {
        requestDetails.content = "Image descriptions: " + stringDescriptions;
      }

      // Remove files from request
      requestDetails.files = null;
    }

    return requestDetails;
  }
/>
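
The `getImageDescription` call in the pseudocode is left undefined. A minimal sketch of what it could look like, assuming an OpenAI-style chat completions endpoint that accepts a base64 data URL as an `image_url` content part. The model name, the prompt text, and the extra `apiKey` and `fetchFn` parameters are placeholders for illustration and are not part of Deep Chat:

```javascript
// Hypothetical helper: builds the request for a vision-capable chat model.
// Kept separate from the network call so it can be inspected without sending anything.
function buildVisionRequest(imageDataUrl, apiKey, model = "gpt-4o") {
  return {
    url: "https://api.openai.com/v1/chat/completions",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model, // assumption: any vision-capable model name
        messages: [
          {
            role: "user",
            content: [
              { type: "text", text: "Describe this image in one or two sentences." },
              { type: "image_url", image_url: { url: imageDataUrl } },
            ],
          },
        ],
      }),
    },
  };
}

// Hypothetical implementation of getImageDescription from the pseudocode.
// fetchFn is injectable so the function can be exercised without a network call.
async function getImageDescription(imageDataUrl, apiKey, fetchFn = fetch) {
  const { url, init } = buildVisionRequest(imageDataUrl, apiKey);
  const response = await fetchFn(url, init);
  if (!response.ok) {
    throw new Error(`Vision request failed with status ${response.status}`);
  }
  const result = await response.json();
  return result.choices[0].message.content;
}
```

In a real page the API key should not be shipped to the browser, so this call would more likely sit behind a small proxy endpoint.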

Again, open to other solutions that you may have or may already exist. And again, thank you for your fast communication and fixes, it is very much appreciated.

OvidijusParsiunas commented 4 months ago

Hi @jackitaliano.

Direct Connection services abstract away the request logic so that devs don't need to worry about its internals. Based on the details you provided, I can see that you are looking to override some of the assistant logic. Unfortunately, there is no easy way to provide a partial override, especially for the OpenAI assistant API as it is the most complex one. Hence, the only solutions I have for you here are to either use the handler function to fully implement your own logic, or to fork/clone the project and augment the OpenAIAssistantIO class yourself. It is actually quite simple to do and the setup instructions are listed here. You can also look at the code in that class and add it to the handler instead.
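
As a rough illustration of the handler route, here is a minimal sketch. It assumes the handler is called with the outgoing request body and a `signals` object exposing `onResponse`, which should be checked against the Deep Chat version in use. `extractImages`, `extractText`, `describeImage` and `askAssistant` are hypothetical placeholders supplied by the caller, since how image data appears in the body and how the models are called are not specified in this thread:

```javascript
// Sketch of a custom handler factory. All four dependencies are hypothetical:
//   extractImages(body) -> array of image data URLs found in the request
//   extractText(body)   -> the user's text message (may be an empty string)
//   describeImage(src)  -> Promise<string> from a vision model
//   askAssistant(text)  -> Promise<string> from the assistant model
function createHandler({ extractImages, extractText, describeImage, askAssistant }) {
  return async function handler(body, signals) {
    try {
      const images = extractImages(body);
      let text = extractText(body);
      if (images.length > 0) {
        // Describe every image first, then pass the descriptions on as plain text
        const descriptions = await Promise.all(images.map((src) => describeImage(src)));
        text = `${text} Image descriptions: ${JSON.stringify(descriptions)}`.trim();
      }
      const reply = await askAssistant(text);
      signals.onResponse({ text: reply });
    } catch (error) {
      // Assumption: the chat displays responses that carry an `error` property
      signals.onResponse({ error: String(error.message || error) });
    }
  };
}
```

The returned function would then be assigned as the `handler` in the component's connection settings, which keeps the image step and the assistant step fully under your control.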

Apologies for not being able to do much more, as this is unfortunately too far outside the bounds of our existing functionality.

Let me know if you have any questions. Thank you!

jackitaliano commented 4 months ago

@OvidijusParsiunas, I will look into both of those solutions. Thank you for the advice!