pazevedo-hyland opened this issue 1 month ago
@AstraBert , you have any clue what might be causing this?
This usually happens if there's a mismatch between the detected file type and the one you specify in DocumentBlock. For example, if the system detects your file as PLAIN_TEXT but you set the mimetype as PDF, BedrockConverse will throw a validation error. Make sure you explicitly set document_mimetype="application/pdf" when creating the DocumentBlock, like this:
DocumentBlock(path="./code_block.pdf", document_mimetype="application/pdf")
Also, DocumentBlock only works with text-based PDFs: it doesn't extract text from scanned/image-based PDFs. If your PDF is a scan, it may return empty content or fail to process the document entirely. Only use text-based PDFs for this workflow. More details here: source.
Hey there @pazevedo-hyland
I'll take a look!
Heya @pazevedo-hyland
From the way our code currently works, there is no easy fix for this. My workaround would be:
```python
import json

from pydantic import BaseModel, Field

from llama_index.core.llms import ChatMessage, DocumentBlock, MessageRole, TextBlock
from llama_index.llms.bedrock_converse import BedrockConverse


class Summary(BaseModel):
    summary: str = Field(description="Summary of a given text/document")


llm = BedrockConverse(
    model="anthropic.claude-3-haiku-20240307-v1:0",
)
structured_llm = llm.as_structured_llm(Summary)


def summarize_document(document_path: str) -> str:
    # First pass: the plain llm handles DocumentBlock fine, so use it to extract the text
    response = llm.chat(
        [
            ChatMessage(
                role=MessageRole.SYSTEM,
                content="You are a document text extraction assistant, you extract the whole text of the documents you are provided with.",
            ),
            ChatMessage(
                role=MessageRole.USER,
                blocks=[
                    TextBlock(text="Can you extract the text from the attached document?"),
                    DocumentBlock(path=document_path),
                ],
            ),
        ]
    )
    full_text = response.message.blocks[0].text
    print(full_text)

    # Second pass: feed the extracted text to the structured llm as plain text
    response = structured_llm.chat(
        [
            ChatMessage(
                role="user",
                content=f"Please summarize this text:\n\n'''\n{full_text}\n'''",
            )
        ],
        tool_required=True,
    )
    summary = json.loads(response.message.blocks[0].text)["summary"]
    return summary


if __name__ == "__main__":
    summary = summarize_document(document_path="ai_turns_nuclear.pdf")
    print(summary)
```
I know this is not optimal because it might be resource-intensive, but for now it seems like a good option. From the document attached to this issue, I got:
FULL TEXT:
Here is the full text extracted from the document:
Will nuclear power satiate AI energy hunger?
AI, data and energy: an introduction
November 2022 changed the life of humans forever: the world of Artificial Intelligence, that had been operating for years out of the spotlight, finally came to the limelights with OpenAI's ChatGPT, a chat interface that leveraged a Large Language Model (GPT-3) to generate responses to the humans it interacted with. The excitement around AI exited then for the first time the scientific community, reaching also the business world: in almost two years, investments and revenues in the field rocketed, with big and small companies pushing the revolution further, testing the limits of our technologies.
In less than two years, from GPT-3 to Llama-3, the data volumes for AI went up from 10^11 to 10^13 training tokens, and this data hunger, combined with the need for computational power, will drive data centers' energy demand to almost double its current size by 2030. The environmental costs of Artificial Intelligence are pretty much obscure, due to non-disclosure policies of the companies that build most of it, but the path is clear: its power needs will be huge, and the consequences on the electrical consumption will be very relevant.
The question now is: how will we be able to power this revolution without worsening the already dramatic climate crisis we're going through?
Understanding the problem: some key facts
1. AI companies are investing in more powerful hardwares
Following Beth Kindig's steps on Forbes, we can see that hardware-producing companies, such as NVIDIA, AMD and Intel, are putting money into more and more powerful chips, able to manage larger data volumes in a fast and efficient way, but with increased power requirements:
• Up to now, the two most powerful NVIDIA GPU hardwares, A100 and H100, consume respectively 250W/chip and 300 to 700W/chip when brought to the maximum power. The next generation GPUs, Blackwell's series B200 and GB200, will be able to run at 1200 and 2700W/chip, with a 4-fold increase in their power consumption
• AMD's most powerful GPU hardware, MI300x, consumes 750W/chip, up
================
SUMMARY
The text discusses the growing energy demands of artificial intelligence (AI) as the technology advances. It highlights the increasing power consumption of the latest AI hardware, such as NVIDIA's A100 and H100 GPUs, which can consume up to 700W and 2700W per chip, respectively. The article suggests that the environmental costs of AI are not well understood due to non-disclosure policies of companies, but the path is clear - the power needs of AI will be huge, and the consequences on electrical consumption will be very relevant. The key question raised is how we can power this AI revolution without worsening the climate crisis.
================
Bug Description
DocumentBlock doesn't seem to work when we try to use the llm as a structured llm (via as_structured_llm)
Version
llama-index 0.12.41 | llama-index-llms-bedrock-converse 0.7.1
Steps to Reproduce
Relevant Logs/Tracebacks