theodo-group / LLPhant

LLPhant - A comprehensive PHP Generative AI Framework using OpenAI GPT 4. Inspired by Langchain
MIT License

For Image-To-Text Message::$content property must allow array data type #225

Open prykris opened 2 months ago

prykris commented 2 months ago

According to OpenAI, a message's content property can also be an array. Are there any plans to support other data types for this property? As a workaround, I currently have to instantiate the client directly, without the wrapper, to access this functionality.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
  model="gpt-4o-mini",
  messages=[
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What’s in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
          },
        },
      ],
    }
  ],
  max_tokens=300,
)
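
For comparison, the same multi-part message can be written as a plain PHP array, which is roughly what an array-typed Message::$content would need to carry. This is only a sketch of the payload shape: the helper name buildVisionMessage is hypothetical and not part of LLPhant.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of LLPhant): builds a user message whose
 * content is a list of typed parts, mirroring the Python example above.
 */
function buildVisionMessage(string $text, string $imageUrl): array
{
    return [
        'role' => 'user',
        'content' => [
            ['type' => 'text', 'text' => $text],
            ['type' => 'image_url', 'image_url' => ['url' => $imageUrl]],
        ],
    ];
}

$message = buildVisionMessage(
    'What is in this image?',
    'https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg'
);

// Print the JSON body fragment that would be sent under "messages".
echo json_encode($message, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

The point of the sketch is that content is a list of parts rather than a single string, so a string-only property cannot represent it.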
MaximeThoonsen commented 2 months ago

Hey @prykris, we should add it yes. Do you want to work on it?

prykris commented 2 months ago

> Hey @prykris, we should add it yes. Do you want to work on it?

I will conjure up something if no one beats me to it.

prykris commented 1 month ago

This is what I currently use. It's not perfect when it comes to multiple images, but it does what I need.

<?php

declare(strict_types=1);

namespace App\LLM\Chat\Message;

use InvalidArgumentException;
use JsonSerializable;
use LLPhant\Chat\Enums\ChatRole;

class Vision extends Message implements JsonSerializable
{
    public array $images = [];

    public static function describe(string|array $images, ?string $message = null): self
    {
        $instance = new self;
        $instance->role = ChatRole::User;
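        // Pluralize only for a list of image arrays, not for a single associative array.
        $plural = is_array($images) && !isset($images['image_url']) && count($images) > 1;
        $instance->content = $message ?? ('Describe the image' . ($plural ? 's' : ''));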

        if (is_string($images)) {
            // If a single image URL/base64 string is provided
            $instance->describeSingle($images);
        } elseif (isset($images['image_url'])) {
            // If an associative array with details is provided
            $instance->describeWithDetails($images);
        } elseif (is_array($images) && isset($images[0]) && is_array($images[0])) {
            // If an array of arrays is provided
            $instance->describeMultiple($images);
        } else {
            throw new InvalidArgumentException("Invalid input format for describe method.");
        }

        $instance->content .= "; Output must contain no other URL than the input image url";

        return $instance;
    }

    protected function describeSingle(string $image): void
    {
        // OpenAI has no 'image_base64' part type: base64 images are sent as a
        // data URL under the same 'image_url' type as regular URLs.
        $this->images[] = [
            'type' => 'image_url',
            'image_url' => [
                'url' => $this->isUrl($image) ? $image : 'data:image/jpeg;base64,' . $image
            ]
        ];
    }

    protected function describeWithDetails(array $imageDetails): void
    {
        $url = $imageDetails['image_url'];
        $detail = $imageDetails['detail'] ?? 'auto';

        if (!$this->isUrl($url) && !$this->isBase64($url)) {
            throw new InvalidArgumentException("Invalid image URL or base64 format.");
        }

        // TODO: Fix base64 encoded image
        // Both plain URLs and base64 data URLs use the 'image_url' part type.
        $this->images[] = [
            'type' => 'image_url',
            'image_url' => [
                'url' => $this->isUrl($url) ? $url : 'data:image/jpeg;base64,' . $url,
                'detail' => $detail
            ]
        ];
    }

    protected function describeMultiple(array $images): void
    {
        foreach ($images as $imageDetails) {
            if (!isset($imageDetails['image_url'])) {
                throw new InvalidArgumentException("Each image array must contain an 'image_url' key.");
            }
            $this->describeWithDetails($imageDetails);
        }
    }

    protected function isUrl(string $image): bool
    {
        return filter_var($image, FILTER_VALIDATE_URL) !== false;
    }

    protected function isBase64(string $image): bool
    {
        return preg_match('/^[a-zA-Z0-9\/\r\n+]*={0,2}$/', $image) === 1;
    }

    public function jsonSerialize(): array
    {
        return [
            'role' => $this->role,
            'content' => array_merge(
                [['type' => 'text', 'text' => $this->content]],
                $this->images
            ),
        ];
    }
}
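
The class above hardcodes image/jpeg in the data URL, and its isBase64 check accepts any alphanumeric string, including an empty one. One possible way to tighten both is sketched below. It assumes the fileinfo extension is enabled, and toDataUrl is a hypothetical helper name, not something LLPhant provides.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: turns a base64-encoded image into a data URL,
 * detecting the MIME type from the decoded bytes instead of assuming JPEG.
 * Assumes the fileinfo extension is available.
 */
function toDataUrl(string $base64): string
{
    // Drop line breaks and other whitespace so the data URL stays on one line.
    $clean = (string) preg_replace('/\s+/', '', $base64);

    // Strict mode returns false for characters outside the base64 alphabet.
    $bytes = base64_decode($clean, true);
    if ($bytes === false || $bytes === '') {
        throw new InvalidArgumentException('Input is not valid base64.');
    }

    // Sniff the MIME type from the decoded bytes.
    $mime = (new finfo(FILEINFO_MIME_TYPE))->buffer($bytes);
    if ($mime === false || !str_starts_with($mime, 'image/')) {
        throw new InvalidArgumentException('Decoded data is not an image.');
    }

    return 'data:' . $mime . ';base64,' . $clean;
}
```

A helper like this could replace the 'data:image/jpeg;base64,' concatenation in describeSingle and describeWithDetails, at the cost of decoding each image once before sending it.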

I have no time to actually make a PR, and to be fair, this implementation is hacky at best.

f-lombardo commented 2 weeks ago

@MaximeThoonsen I think this issue can be closed