GoogleCloudPlatform / generative-ai

Sample code and notebooks for Generative AI on Google Cloud, with Gemini on Vertex AI
https://cloud.google.com/vertex-ai/docs/generative-ai/learn/overview
Apache License 2.0

How to run multiple text model requests to summarize files in parallel #62

Closed Julian-Cao closed 1 year ago

Julian-Cao commented 1 year ago

https://github.com/GoogleCloudPlatform/generative-ai/blob/main/language/examples/document-summarization/summarization_large_documents.ipynb

```python
# Read the PDF file and create a list of pages
reader = PyPDF2.PdfReader(pdf_file)
pages = reader.pages

# Create an empty list to store the summaries
initial_summary = []

# Iterate over the pages and generate a summary for each page
for page in tqdm(pages):
    # Extract the text from the page and remove any leading or trailing whitespace
    text = page.extract_text().strip()

    # Create a prompt for the model using the extracted text and a prompt template
    prompt = initial_prompt_template.format(text=text)

    # Generate a summary using the model and the prompt
    summary = model_with_limit_and_backoff(prompt=prompt, max_output_tokens=1024).text

    # Append the summary to the list of summaries
    initial_summary.append(summary)
```

This code runs synchronously, summarizing one page after another. How can I make it run in parallel?

iamthuya commented 1 year ago

I tried using multi-threading for this. It didn't help much because there is currently a rate limit on API calls.
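For reference, a minimal sketch of the multi-threading approach with `concurrent.futures.ThreadPoolExecutor`. Here `summarize_page` is a hypothetical stand-in for the notebook's `model_with_limit_and_backoff(...)` call; swap in the real Vertex AI request:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for model_with_limit_and_backoff(prompt=...).text;
# replace the body with the actual Vertex AI text model request.
def summarize_page(text: str) -> str:
    return f"summary of: {text[:40]}"

page_texts = ["page one text", "page two text", "page three text"]

# executor.map() runs the calls concurrently but yields results
# in input order, so summaries still line up with their pages.
with ThreadPoolExecutor(max_workers=4) as executor:
    initial_summary = list(executor.map(summarize_page, page_texts))
```

Note that the threads will still hit the same per-minute quota, which is presumably why this didn't speed things up much.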

Julian-Cao commented 1 year ago

The limit should be 60 requests per minute for the text-bison model. Why is it reached so easily? Could you please share your code snippet?
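One way to stay under a 60-requests-per-minute quota while still using multiple threads is to space requests out with a thread-safe limiter. This is only a sketch, not an official API; the `RateLimiter` class below is hypothetical:

```python
import threading
import time

class RateLimiter:
    """Space out calls so that at most `rpm` happen per minute across threads."""

    def __init__(self, rpm: int):
        self.interval = 60.0 / rpm
        self.lock = threading.Lock()
        self.next_slot = 0.0  # earliest monotonic time the next call may start

    def wait(self):
        # Reserve the next available slot under the lock, then sleep
        # outside the lock so other threads can reserve their own slots.
        with self.lock:
            now = time.monotonic()
            delay = max(0.0, self.next_slot - now)
            self.next_slot = max(now, self.next_slot) + self.interval
        if delay:
            time.sleep(delay)

# Usage sketch: call limiter.wait() just before each model request,
# e.g. inside the function each worker thread runs.
limiter = RateLimiter(rpm=60)
```

Each worker would then do `limiter.wait()` followed by the model call, so the pool as a whole never exceeds the quota regardless of how many threads it has.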

iamthuya commented 1 year ago

I just checked with the team internally. They will release an official way to do this soon. We should just wait for it.