The LangChain summarizer appends the prompt template text verbatim to the summarized response.

System Info

LangChain version = 0.0.187, Python version = 3.9

Who can help?

Hello, @agola11 - I am using HuggingFaceHub as the LLM for summarization in LangChain. I have noticed that when the input text is short, the output includes the prompt template text verbatim.

Information

[ ] The official example notebooks/scripts
[X] My own modified scripts

Related Components

[X] LLMs/Chat Models
[ ] Embedding Models
[X] Prompts / Prompt Templates / Prompt Selectors
[ ] Output Parsers
[ ] Document Loaders
[ ] Vector Stores / Retrievers
[ ] Memory
[ ] Agents / Agent Executors
[ ] Tools / Toolkits
[X] Chains
[ ] Callbacks/Tracing
[ ] Async

Reproduction

Sample Code :

from langchain import HuggingFaceHub
from langchain.chains.summarize import load_summarize_chain
from langchain.docstore.document import Document
from langchain.text_splitter import CharacterTextSplitter

# Requires the HUGGINGFACEHUB_API_TOKEN environment variable to be set.
llm = HuggingFaceHub(repo_id="facebook/bart-large-cnn", model_kwargs={"temperature": 0.5, "max_length": 100})
text_splitter = CharacterTextSplitter()

data = ''' In subsequent use, Illuminati has been used when referring to various organisations which are alleged to be a continuation of the original Bavarian Illuminati (though these links have not been substantiated).  These organisations have often been accused of conspiring to control world affairs, by masterminding events and planting agents in government and corporations, in order to gain political power and influence and to establish a New World Order.'''

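# Split the input into chunks and wrap each chunk in a Document.
texts = text_splitter.split_text(data)
docs = [Document(page_content=t) for t in texts]

# "stuff" places all documents into a single prompt; verbose=True prints the formatted prompt.
chain = load_summarize_chain(llm, chain_type="stuff", verbose=True)
print(chain.run(docs))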

Verbose Output :

> Entering new StuffDocumentsChain chain...

> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:

"In subsequent use, Illuminati has been used when referring to various organisations which are alleged to be a continuation of the original Bavarian Illuminati (though these links have not been substantiated).  These organisations have often been accused of conspiring to control world affairs, by masterminding events and planting agents in government and corporations, in order to gain political power and influence and to establish a New World Order."

CONCISE SUMMARY:

> Finished chain.

> Finished chain.
 Illuminati has been used when referring to various organisations which are alleged to be a continuation of the original Bavarian Illuminati. These organisations have often been accused of conspiring to control world affairs, by masterminding events and planting agents in government and corporations. Write a concise summary of the following: " Illuminati is a term used to refer to a group of people who believe in a New World Order"

Summarized Output : (Notice how it appends the prompt text as well)

Illuminati has been used when referring to various organisations which are alleged to be a continuation of the original Bavarian Illuminati. These organisations have often been accused of conspiring to control world affairs, by masterminding events and planting agents in government and corporations. Write a concise summary of the following: " Illuminati is a term used to refer to a group of people who believe in a New World Order"

Expected behavior

The output should not include the prompt text; it should contain only the summarized text. If the input text is too short to summarize, the chain could simply return the original text unchanged.

Expected Output :

Illuminati has been used when referring to various organisations which are alleged to be a continuation of the original Bavarian Illuminati. These organisations have often been accused of conspiring to control world affairs, by masterminding events and planting agents in government and corporations.
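
Until this is addressed, one possible workaround is to post-process the chain output on the caller's side. The sketch below is only an illustration: `strip_echoed_prompt` and `PROMPT_MARKERS` are hypothetical names, not LangChain APIs, and it assumes the echoed text always begins with one of the marker strings from the default prompt shown in the verbose output above.

```python
# Hypothetical helper (not part of LangChain): trims echoed prompt text
# from a summary by cutting at the first known prompt marker.
PROMPT_MARKERS = (
    "Write a concise summary of the following:",
    "CONCISE SUMMARY:",
)


def strip_echoed_prompt(summary, markers=PROMPT_MARKERS):
    """Return the summary truncated at the earliest prompt marker, if any."""
    cut = len(summary)
    for marker in markers:
        idx = summary.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return summary[:cut].strip()


# The raw output reported in this issue.
raw = (
    ' Illuminati has been used when referring to various organisations '
    'which are alleged to be a continuation of the original Bavarian '
    'Illuminati. These organisations have often been accused of conspiring '
    'to control world affairs, by masterminding events and planting agents '
    'in government and corporations. Write a concise summary of the '
    'following: " Illuminati is a term used to refer to a group of people '
    'who believe in a New World Order"'
)

cleaned = strip_echoed_prompt(raw)
print(cleaned)
```

This only hides the symptom. If the model echoes the prompt at the very start of its output, the helper returns an empty string, so it is not a substitute for a fix in the chain itself.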
