EleutherAI / gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
https://www.eleuther.ai/
Apache License 2.0
6.94k stars 1.01k forks source link

[Question] about summarization tasks #943

Closed phamkhactu closed 1 year ago

phamkhactu commented 1 year ago

Hi, thanks for the great repo!

I'm looking for a tutorial on the summarization task, but I couldn't find any information. I'd be very grateful for a guide or doc on summarization.

Many thanks.

Quentin-Anthony commented 1 year ago

I'm very confused on what you're asking. Can you give specifics on the summary tutorial you're using and what you need?

phamkhactu commented 1 year ago

Hi @Quentin-Anthony,

I'm so sorry for the confusion. Yes, I want to summarize documents: given a document, the model should summarize its main ideas.

Here is what I want. For example, given this content text:

Give me a summary of the content below:

Many businesses (OpenAI, AI21, CoHere, etc.) are providing LLMs as a service, given their attractive potential in commercial, scientific, and financial contexts. While GPT-4 and other LLMs have demonstrated record-breaking performance on tasks like question answering, their use in high-throughput applications can be prohibitively expensive. For instance, using GPT-4 to assist with customer service can cost a small business over $21,000 monthly, and ChatGPT is predicted to cost over $700,000 daily. The use of the largest LLMs has a high monetary price tag and has serious negative effects on the environment and society.

Studies show that many LLMs are accessible via APIs at a wide range of prices. The cost of using an LLM API normally has three parts:

The prompt cost (which scales with the duration of the prompt)
The generation cost (which scales with the length of the generation)
A fixed cost per question.
Given the wide range in price and quality, it can be difficult for practitioners to decide how best to use the available LLM tools. Furthermore, relying on a single API provider is not dependable if service is interrupted, as could happen in the event of unexpectedly high demand.
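The three-part cost structure above can be sketched as a tiny cost model. The rate values below are made-up illustrative numbers, not any real provider's pricing:

```python
# Hypothetical per-query cost model for an LLM API, following the three
# components listed above: prompt cost, generation cost, and a fixed fee.
# All rates are assumed illustrative values, not real pricing.

def query_cost(prompt_tokens, generated_tokens,
               prompt_rate=0.00003,   # $ per prompt token (assumed)
               gen_rate=0.00006,      # $ per generated token (assumed)
               fixed_cost=0.0):       # $ flat fee per request (assumed)
    """Total cost = prompt cost + generation cost + fixed cost per question."""
    return prompt_tokens * prompt_rate + generated_tokens * gen_rate + fixed_cost

# A 1,000-token prompt producing a 200-token answer:
print(round(query_cost(1000, 200), 5))
```

Because the prompt and generation terms scale linearly with length, shortening prompts (one of FrugalGPT's strategies, discussed below as prompt adaptation) directly reduces per-query spend.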

Current model ensemble paradigms, such as model cascades and FrugalML, were developed for prediction tasks with a fixed set of labels and do not account for the limitations of LLMs.

Recent research by Stanford University proposes FrugalGPT, a budget-friendly framework that leverages LLM APIs to handle natural language queries.

Prompt adaptation, LLM approximation, and LLM cascade are the three primary approaches to cost reduction. Prompt adaptation investigates which prompts are most cost-efficient. LLM approximation develops simpler, more cost-effective alternatives that perform as well as a complex, high-priced LLM. The key idea of the LLM cascade is to dynamically select the appropriate LLM APIs for different queries.
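The cascade idea can be sketched in a few lines: try a cheap model first and escalate to an expensive one only when a learned confidence score is too low. Everything here is a stand-in — the "models" and the scorer are toy functions, not real APIs; FrugalGPT learns its scorer and thresholds from labeled data:

```python
# Toy sketch of an LLM cascade. The two "LLMs" and the confidence scorer are
# hypothetical stand-ins for illustration only; a real cascade would call
# actual APIs and use a scorer trained on labeled examples.

def cheap_llm(query):
    return f"cheap-answer({query})"      # stand-in for an inexpensive API

def expensive_llm(query):
    return f"expensive-answer({query})"  # stand-in for a costly API

def confidence(query, answer):
    # Stand-in heuristic: pretend short queries are easy. FrugalGPT instead
    # learns a scorer from labeled training examples.
    return 0.9 if len(query) < 40 else 0.2

def cascade(query, threshold=0.5):
    """Return (answer, which_tier): escalate only on low confidence."""
    answer = cheap_llm(query)
    if confidence(query, answer) >= threshold:
        return answer, "cheap"
    return expensive_llm(query), "expensive"

print(cascade("short question"))
print(cascade("a much longer and presumably harder question about LLM costs"))
```

The savings come from the fact that, if most queries clear the threshold, only a small fraction ever reach the expensive tier.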

A basic version of FrugalGPT built on the LLM cascade is implemented and evaluated to show the potential of these ideas. For each dataset and task, FrugalGPT learns how to adaptively triage questions to various combinations of LLMs, such as ChatGPT, GPT-3, and GPT-4. Compared to the best individual LLM API, FrugalGPT saves up to 98% of the inference cost while maintaining the same performance on the downstream task; alternatively, at the same cost, it can yield a performance boost of up to 4%.

FrugalGPT’s LLM cascade requires labeled examples for training, and the training and test examples should come from the same or a similar distribution for the cascade to be effective. Mastering the LLM cascade also takes time and energy.

FrugalGPT seeks a balance between performance and cost, but other factors, including latency, fairness, privacy, and environmental impact, are more important in practice. The team believes that future studies should focus on including these features in optimization approaches without sacrificing performance or cost-effectiveness. The uncertainty of LLM-generated results also needs to be carefully quantified for use in risk-critical applications. 

My expected output:

The content highlights the growing popularity of large language models (LLMs) offered as a service by companies like OpenAI, AI21, and CoHere. While LLMs such as GPT-4 have shown remarkable performance in tasks like question answering, their use in high-throughput applications can be excessively expensive. For example, employing GPT-4 for customer service could cost a small business over $21,000 per month, and the predicted cost for ChatGPT is over $700,000 per day. These high costs have negative implications for the environment and society.

Quentin-Anthony commented 1 year ago

This is a general DL question that's unrelated to the development of this library. I recommend that you read the HuggingFace docs, likely: https://huggingface.co/docs/transformers/tasks/summarization