microsoft / genaiscript

Automatable GenAI Scripting
https://microsoft.github.io/genaiscript/
MIT License

AICI returns empty string when asked to extract quotes from a pdf #381

Open bzorn opened 6 months ago

bzorn commented 6 months ago

I have the following script (also in samples as extract-quotes.genai.js). Am I misunderstanding how to make the rest of the prompt (including the FILE def) interact with the AICI.gen call?

script({
    title: "extract-quotes-AICI",
    model: "aici:mixtral",
    system: [],
})

// use def to emit LLM variables

const { file, pages } = await parsers.PDF(env.files[0])

def("FILE", file.content)

// use $ to output formatted text to the prompt
$`You are a helpful assistant.
Given the FILE, extract three exact quotes from the document
that could be used in a 1 paragraph press release that
would both give the audience an understanding of the
idea and make them want to learn more.

Quote 1: ${AICI.gen({ substring: file.content, maxTokens: 1000, storeVar: "quote1" })}
Quote 2: ${AICI.gen({ substring: file.content, maxTokens: 1000, storeVar: "quote2" })}
Quote 3: ${AICI.gen({ substring: file.content, maxTokens: 1000, storeVar: "quote3" })}
`

I ran this on the file in samples/src/bztest/test1.pdf and the output was:

You are a helpful assistant. Given the FILE, extract three exact quotes from the document that could be used in a 1 paragraph press release that would both give the audience an understanding of the idea and make them want to learn more.

Quote 1: " Quote 2: " Quote 3: "

The AICI args were:

FIXED "You‧ are‧ a‧ helpful‧ assistant‧.‧\n‧G‧iven‧ the‧ FILE‧,‧ extract‧ three‧ exact‧ quotes‧ from‧ the‧ document‧\n‧that‧ could‧ be‧ used‧ in‧ a‧ ‧1‧ paragraph‧ press‧ release‧ that‧\n‧w‧ould‧ both‧ give‧ the‧ audience‧ an‧ understanding‧ of‧ the‧\n‧idea‧ and‧ make‧ them‧ want‧ to‧ learn‧ more‧.‧\n‧\n‧Qu‧ote‧ ‧1‧:‧ "

GEN " \""
FIXED "\n‧Qu‧ote‧ ‧2‧:‧ "

GEN " \""
FIXED "\n‧Qu‧ote‧ ‧3‧:‧ "

GEN " \""
FIXED "\n"
JsCtrl: done

bzorn commented 6 months ago

More info: when I apply the same prompt to a .md file, I get actual quotes from the document (rather than the empty ones shown above).
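
Since the .md input works and the PDF input does not, one way to narrow this down might be to compare the text that each input hands to the substring constraint. The following is only a minimal sketch in plain JavaScript; `describeContent` is a hypothetical helper introduced here for illustration, not part of the GenAIScript or AICI API, and it does not explain the empty output by itself.

```javascript
// Hypothetical diagnostic helper (not part of GenAIScript or AICI).
// Summarizes a text blob so the PDF-derived and .md-derived content
// can be compared side by side.
function describeContent(content) {
    // Treat a missing value the same as an empty string.
    const text = content ?? ""
    return {
        // Total number of characters in the text.
        length: text.length,
        // Characters left after removing all whitespace.
        nonWhitespace: text.replace(/\s/g, "").length,
        // Number of newline-separated lines (0 for empty text).
        lines: text === "" ? 0 : text.split("\n").length,
        // True when there is nothing usable to quote from.
        isBlank: text.trim() === "",
    }
}
```

Assuming the script environment allows ordinary logging, calling `console.log(describeContent(file.content))` just before the `$` template for both the PDF and the .md run would show whether the PDF path is passing empty or whitespace-only text to `substring`.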