jackitaliano commented 4 months ago

Scenario

directConnection to OpenAI Assistants
attempting to upload image to vision model

My thought for a solution

Intercept request with requestInterceptor where the base64 image data is available
If image is contained in request, skip function calling and directly send image data to vision model
Return requestDetails from requestInterceptor with a text description of the image

Possible alternative

allow for returning message in onNewMessage to edit it, similar to requestInterceptor or responseInterceptor.
- this may be more difficult because given this exact way of implementing, it would break existing onNewMessage functions due to no return

My request

provide base64 image data in requestInterceptor, similar to how it is provided in onNewMessage

Reason

Only image file_id is available in requestInterceptor
Unable to pass file_id to OpenAI Vision model
Cannot retrieve user uploaded files from OpenAI by file_id for a possible workaround
base64 image data found in onNewMessage isn't very useful in this scenario because the message contents cannot be changed there

Use Cases

Uploading images to vision model prior to sending request to Assistant
Opens other options for deep-chat consumers to handle image uploads themselves for other uses

Example Pseudocode

<deep-chat
  requestInterceptor = async (requestDetails) => {

    if (requestDetails?.files && requestDetails.files.length > 0) {  // Check files are present
      // Only files with a `src` are treated as images ( could be done differently, this is how I've checked in `onNewMessage` )
      const imageFiles = requestDetails.files.filter((file) => file.src);

      // An async map callback returns promises, so wait for all of them before stringifying
      const imageDescriptions = await Promise.all(
        imageFiles.map((file) => getImageDescription(file.src)) // My request to vision model
      );

      const stringDescriptions = JSON.stringify(imageDescriptions);

      // Update content if exists, add content if it doesn't
      if (requestDetails?.content) {
        requestDetails.content += "Image descriptions: " + stringDescriptions;
      } else {
        requestDetails.content = "Image descriptions: " + stringDescriptions;
      }

      // Remove files from request
      requestDetails.files = null;
    }

    return requestDetails;
  }
/>

Again, open to other solutions that you may have or may already exist. And again, thank you for your fast communication and fixes, it is very much appreciated.

OvidijusParsiunas commented 4 months ago

Hi @jackitaliano.

Direct Connection services abstract away the request logic so that devs don't need to worry about its internals. Based on the details you provided, I can see that you are looking to override some of the assistant logic. Unfortunately there is no easy way to provide a partial override, especially for the OpenAI Assistants API, as it is the most complex one. Hence, the only options I have for you here are to either use the handler function and fully implement your own logic, or to fork/clone the project and augment the OpenAIAssistantIO class yourself. The latter is actually quite simple to do and the set up instructions are listed here. You can also look at the code in that class and add it to the handler instead.
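To make the handler option a bit more concrete, here is a rough sketch of what such a function could look like. It is only an illustration under assumptions: `describeImage` and `askAssistant` are hypothetical placeholders for your own calls to the vision model and the Assistants API, and `extractInput` assumes the outgoing body carries the latest message as `{ text, files: [{ src }] }`, which may not match what your Deep Chat version actually passes when files are attached.

```javascript
// Assumption: the body exposes the latest message with optional text and files.
// Adapt this to the real shape your handler receives.
function extractInput(body) {
  const messages = (body && body.messages) || [];
  const last = messages[messages.length - 1] || {};
  // Only files carrying a `src` (base64 data) are treated as images.
  const images = (last.files || []).filter((file) => file && file.src);
  return { text: last.text || '', images };
}

// describeImage(src) and askAssistant(prompt) are hypothetical async helpers
// that you would implement yourself.
function createVisionHandler(describeImage, askAssistant) {
  return async function handler(body, signals) {
    try {
      const { text, images } = extractInput(body);
      // Describe every image before anything is sent to the assistant.
      const descriptions = await Promise.all(
        images.map((file) => describeImage(file.src))
      );
      let prompt = text;
      if (descriptions.length > 0) {
        prompt += '\nImage descriptions: ' + JSON.stringify(descriptions);
      }
      const reply = await askAssistant(prompt.trim());
      signals.onResponse({ text: reply });
    } catch (err) {
      // Surface failures in the chat instead of leaving the request hanging.
      signals.onResponse({ error: String((err && err.message) || err) });
    }
  };
}
```

The returned function would then be assigned wherever your Deep Chat version accepts a handler. Injecting the two helpers keeps the vision and assistant calls swappable and lets the flow be exercised without network access.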

Apologies for not being able to do much more, as this is unfortunately too far outside the bounds of our existing functionality.

Let me know if you have any questions. Thank you!