yzwijsen / chatgpt-powershell

example on how to use the chatGPT api with powershell
MIT License

OneShot: allow passing a file #6

Open dewi-ny-je opened 1 week ago

dewi-ny-je commented 1 week ago

I'd like to query ChatGPT and upload a PDF file, after my query, as part of the same request. Does the API allow this? Could you maybe add another script for this purpose?

For example: "In the PDF I'm uploading, please find all occurrences of the following concept and point me to the pages and lines where they appear. Concept: xxxx." [upload the file]

Thanks

dewi-ny-je commented 1 week ago

I asked ChatGPT.

It first provided a modified script that improves yours by unifying the variable casing (userInput vs UserInput; PowerShell variable names are case-insensitive, so this is cosmetic) and by adding try/catch.

# Function to send a message to ChatGPT
function Invoke-ChatGPT ($message) {
    try {
        ...
    } catch {
        Write-Error "An error occurred: $_"
        return "An error occurred while trying to communicate with the AI. Please check your API key and connection."
    }
}
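
The try/catch pattern above can be tried without any network call. The following is a minimal sketch, not the repository's code: `Invoke-Safely` is a hypothetical helper, and the thrown string stands in for a failed API request. It also shows that a member access such as `.Exception.Message` needs `$( )` when used inside a double-quoted string.

```powershell
# Hypothetical helper (not part of the repository): runs a script block and
# turns any terminating error into a readable message.
function Invoke-Safely {
    param(
        [Parameter(Mandatory = $true)][scriptblock]$Action
    )
    try {
        & $Action
    } catch {
        # $_ is the ErrorRecord; "$_.Exception.Message" without $( ) would
        # expand only $_ and append the literal text ".Exception.Message".
        Write-Warning "Request failed: $($_.Exception.Message)"
        return "An error occurred: $($_.Exception.Message)"
    }
}

# Simulate a failing request with a terminating error.
$result = Invoke-Safely { throw "simulated network failure" }
Write-Output $result
```

For real cmdlets such as `Invoke-RestMethod`, adding `-ErrorAction Stop` makes sure non-terminating errors also reach the `catch` block.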

However, it states that gpt-3.5-turbo does not allow file upload, while gpt-4o does.

Its suggestions (Python, using the OpenAI client library):

file = client.files.create(
  file=open("myfile.pdf", "rb"),
  purpose="assistants"
)

and then

response = client.beta.threads.messages.create(
  thread_id=thread.id,
  role="user",
  content="Can you summarize the content of this file?",
  attachments=[{"file_id": file.id, "tools": [{"type": "file_search"}]}]
)

Honestly, I haven't tried it yet; I just asked ChatGPT out of curiosity. I'll try another day, but maybe this speeds up your task if you are interested in providing a gpt-4o script with file upload.
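
For a PowerShell version of the second call, the request body could be built as below. This is only a sketch under assumptions: `New-ThreadMessageBody` is a hypothetical helper, `file-abc123` is a placeholder ID, and the `attachments` / `file_search` field names follow my reading of the Assistants API v2, so they should be checked against the current API reference. Nothing is sent over the network here.

```powershell
# Hypothetical helper: builds the JSON body for adding a user message with one
# attached file to a thread. Field names are assumptions (Assistants API v2).
function New-ThreadMessageBody {
    param(
        [Parameter(Mandatory = $true)][string]$Content,
        [Parameter(Mandatory = $true)][string]$FileId
    )
    $body = @{
        role        = "user"
        content     = $Content
        attachments = @(
            @{
                file_id = $FileId
                tools   = @(@{ type = "file_search" })
            }
        )
    }
    # The default -Depth of 2 would stringify the nested "tools" entries.
    return ConvertTo-Json -InputObject $body -Depth 5
}

# "file-abc123" is a placeholder, not a real file ID.
$json = New-ThreadMessageBody -Content "Can you summarize the content of this file?" -FileId "file-abc123"
$parsed = $json | ConvertFrom-Json
Write-Output $json
```

Round-tripping through `ConvertFrom-Json` is a cheap way to confirm that the nested arrays survived serialization before the body is passed to `Invoke-RestMethod -Body`.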

dewi-ny-je commented 1 week ago

I've now realised that API use requires separate billing, so I won't be able to test it. Here is the version adapted to the Assistants API.

<#
This script will operate on text and PDF files.
#>

# Parse command-line arguments
param (
    [Parameter(Mandatory=$true)][string]$pdfPath
)

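# Define API key and endpoints
$ApiKey = "api_key"
$ApiUploadEndpoint = "https://api.openai.com/v1/files"
# NOTE: untested placeholder URL. The documented Assistants API adds messages via
# /v1/threads/{thread_id}/messages and starts a run via /v1/threads/{thread_id}/runs.
$ApiAssistantEndpoint = "https://api.openai.com/v1/assistants/{assistant_id}/messages"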

<#
System message.
You can use this to give the AI instructions on what to do, how to act, or how to respond to future prompts.
#>
$AiSystemMessage = "This is my request. Do something with the attached PDF file."

# Function to upload the PDF to OpenAI
function Upload-PDF ($pdfPath) {
    try {
        # Ensure the PDF exists
        if (-not (Test-Path $pdfPath)) {
            throw "The specified PDF file does not exist."
        }

        # Upload the PDF file
        $headers = @{
            "Authorization" = "Bearer $ApiKey"
        }
        # -Form (PowerShell 6.1+) sends a FileInfo value as a multipart file part
        $formData = @{
            file    = Get-Item -LiteralPath $pdfPath
            purpose = "assistants"
        }

        $response = Invoke-RestMethod -Method POST -Uri $ApiUploadEndpoint -Headers $headers -Form $formData

        # Return the file_id of the uploaded PDF
        return $response.id
    } catch {
        Write-Error "Failed to upload PDF: $($_.Exception.Message)"
        return $null
    }
}

# Function to send a message to GPT-4o Assistant API
function Invoke-AssistantGPT ($message, $fileId) {
    try {
        # Set the request headers
        $headers = @{
            "Content-Type"  = "application/json"
            "Authorization" = "Bearer $ApiKey"
        }

        # Set the request body
        $requestBody = @{
            "messages" = @(
                @{
                    "role"    = "system"
                    "content" = $AiSystemMessage
                },
                @{
                    "role"    = "user"
                    "content" = $message
                    "file_ids" = @($fileId)
                }
            )
            "max_tokens" = 2000
            "temperature" = 0.5
        }

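        # Send the request to the Assistant API (-Depth keeps nested arrays such as file_ids intact)
        $response = Invoke-RestMethod -Method POST -Uri $ApiAssistantEndpoint -Headers $headers -Body (ConvertTo-Json $requestBody -Depth 5)

        # Return the Assistant's response
        # NOTE: choices[0].message.content is the Chat Completions response shape; untested here
        return $response.choices[0].message.content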
    } catch {
        Write-Error "Error in communication with GPT-4o: $_"
        return "An error occurred while querying the Assistant."
    }
}

# Ensure a PDF path was provided
if (-not $pdfPath) {
    Write-Host "No PDF file path provided. Please provide a valid PDF file path with the -pdfPath parameter."
    exit
}

# Upload the PDF to OpenAI
$fileId = Upload-PDF $pdfPath

# Ensure the PDF was uploaded successfully
if (-not $fileId) {
    Write-Host "Failed to upload the PDF file."
    exit
}

# Get the user's question or request about the PDF
$userInput = Read-Host "Question or request about the PDF"

# Ensure user input isn't empty
if ([string]::IsNullOrWhiteSpace($userInput)) {
    Write-Host "Input cannot be empty. Please provide a message."
    exit
}

# Query GPT-4o Assistant with the user's message and the uploaded PDF
# (arguments are separated by spaces; a comma would pass both as one array)
$AiResponse = Invoke-AssistantGPT $userInput $fileId

# Show response
Write-Output "`nResponse from GPT-4o:`n"
Write-Output $AiResponse

Read-Host "Press enter to Exit..."

Good luck to anyone interested in testing or developing it further.
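
As a starting point for anyone developing it further: the thread-based flow, as far as I understand the Assistants API, takes several requests, not one. The sketch below only lists them and sends nothing. `Get-AssistantRequestPlan` is a hypothetical helper, the IDs are placeholders, and the routes are assumptions to verify against the current API reference. It also assumes that an assistant with the file search tool already exists, that the "Start run" body carries its `assistant_id`, and that each Assistants request needs an `OpenAI-Beta: assistants=v2` header.

```powershell
# Assumed base URL; the routes below reflect the thread-based flow as I understand it.
$BaseUrl = "https://api.openai.com/v1"

# Hypothetical helper (not in the repository): returns the request sequence
# for one question about one uploaded file. IDs default to placeholders.
function Get-AssistantRequestPlan {
    param(
        [string]$ThreadId = "{thread_id}",
        [string]$RunId = "{run_id}"
    )
    $thread = "$BaseUrl/threads/$ThreadId"
    return @(
        [pscustomobject]@{ Step = "Upload file";   Method = "POST"; Uri = "$BaseUrl/files" }
        [pscustomobject]@{ Step = "Create thread"; Method = "POST"; Uri = "$BaseUrl/threads" }
        [pscustomobject]@{ Step = "Add message";   Method = "POST"; Uri = "$thread/messages" }
        [pscustomobject]@{ Step = "Start run";     Method = "POST"; Uri = "$thread/runs" }
        [pscustomobject]@{ Step = "Poll run";      Method = "GET";  Uri = "$thread/runs/$RunId" }
        [pscustomobject]@{ Step = "Read messages"; Method = "GET";  Uri = "$thread/messages" }
    )
}

# "thread_123" and "run_456" are placeholder IDs.
$plan = Get-AssistantRequestPlan -ThreadId "thread_123" -RunId "run_456"
$plan | Format-Table -AutoSize
```

In the script above, this would mean replacing the single `Invoke-AssistantGPT` request with these steps, polling the run until it completes and then reading the assistant's reply from the thread's messages.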