polyrabbit / hacker-news-digest

:newspaper: Let ChatGPT Summarize Hacker News for You
http://hackernews.betacat.io/
GNU Lesser General Public License v3.0

Incomplete news summary #19

Closed MstMoonshine closed 1 year ago

MstMoonshine commented 1 year ago

For some news items, the summary is cut off, and there is no link to the full summary. Does the summary itself end with an ellipsis, or is it not being rendered correctly?

Example:

1. Example 1

2. Example 2

polyrabbit commented 1 year ago

@MstMoonshine It's done on purpose - an ellipsis means the summary is truncated to 400 characters.

Sometimes OpenAI returns a long text that is barely a summary, so I added a soft limit in the prompt and a hard limit in the rendered HTML to keep the page clean.
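
For readers following along, the hard limit described above could look roughly like the sketch below. It is not the project's actual code: the function name `truncate_summary`, the constant `HARD_LIMIT`, and the three-dot ellipsis are assumptions for illustration; only the 400-character figure comes from this thread.

```python
HARD_LIMIT = 400  # characters; the figure mentioned in this thread


def truncate_summary(summary: str, limit: int = HARD_LIMIT) -> str:
    """Return the summary as-is if it fits, else cut it and mark the cut.

    Hypothetical helper: a plain character cut, so a word may be split.
    The ellipsis is appended on top of the `limit` characters that are kept.
    """
    summary = summary.strip()
    if len(summary) <= limit:
        return summary
    return summary[:limit].rstrip() + "..."


if __name__ == "__main__":
    print(truncate_summary("A short summary."))
    print(len(truncate_summary("x" * 1000)))
```

The prompt-side "soft" limit only nudges the model, so a render-side cut like this is what guarantees the page layout.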

MstMoonshine commented 1 year ago

I see that in news.py you ask GPT to summarize the content in 2 sentences. Is that the soft limit you mentioned? What about changing the prompt to something like “summarize the content within 200 words”?

polyrabbit commented 1 year ago

Right, it seems the LLM does not follow length instructions precisely. I have tried prompts like "xx words / xx characters", and found that "2 sentences" comes closest to my length limit.

Or do you have any hands-on experience with hard-limiting the length in the prompt?

MstMoonshine commented 1 year ago

I don't really think there is a way to hard limit GPT's output. I was using "The following contents are (a part of) a webpage. Please summarize it within 150 words.", which works in most cases. Perhaps you could consider an AI chain/pipeline/composition: if the response exceeds the hard limit you want, pipe it to another GPT thread to digest the content further.
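
A minimal sketch of the chaining idea above, under stated assumptions: `summarize` is a hypothetical callable standing in for whatever model call is used (no real API is shown), and `limit` and `max_rounds` are illustrative parameters, not values from the project.

```python
from typing import Callable


def summarize_within(
    text: str,
    summarize: Callable[[str], str],  # hypothetical wrapper around a model call
    limit: int = 400,
    max_rounds: int = 3,
) -> str:
    """Summarize `text`, re-summarizing the result while it is too long.

    Stops as soon as the summary fits in `limit` characters, or after
    `max_rounds` calls in total. The result may still exceed `limit`.
    """
    summary = summarize(text)
    for _ in range(max_rounds - 1):
        if len(summary) <= limit:
            break
        summary = summarize(summary)
    return summary


if __name__ == "__main__":
    # Stand-in "model" that simply keeps the first half of its input.
    def halve(t: str) -> str:
        return t[: len(t) // 2]

    print(len(summarize_within("x" * 2000, halve)))
```

Since a model can keep overshooting, the chain is capped at `max_rounds`, and a hard cut at render time would still be needed as a last resort.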

polyrabbit commented 1 year ago

@MstMoonshine Hi, I just added a quick and simple solution: the full summary now appears in a tooltip when it is truncated on the page. Hope it helps.
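
One common way to get such a tooltip is the HTML `title` attribute. The sketch below assumes that approach; the thread does not say how the site actually implements it, and `render_summary` is a hypothetical name.

```python
from html import escape


def render_summary(summary: str, limit: int = 400) -> str:
    """Render a summary paragraph; if truncated, keep the full text in `title`.

    Hypothetical helper. Browsers show the `title` attribute as a tooltip
    when the pointer hovers over the element.
    """
    if len(summary) <= limit:
        return f"<p>{escape(summary)}</p>"
    shown = summary[:limit].rstrip() + "..."
    # escape() also escapes quotes by default, so the attribute stays well-formed.
    return f'<p title="{escape(summary)}">{escape(shown)}</p>'


if __name__ == "__main__":
    print(render_summary("a < b"))
    print(render_summary("word " * 100, limit=20))
```

Note that tooltips of this kind need a hover, so they are not reachable on touch devices or inside feed readers.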

MstMoonshine commented 1 year ago

That certainly helps! But that leads to another problem.

I use an RSS reader for your website. The RSS reader preview shows the truncated version, so if I want to read the full version, I have to go to the webpage. That is fine. The problem is that there is currently no link to your page in the RSS feed! (Currently, there is one link to the news website and another to the HN comments.) As a result, I have to find your website manually, search for the article of interest, and then hover the mouse over the paragraph.

Consider adding a link in the RSS feed that points to the corresponding summary paragraph. That would definitely help!

polyrabbit commented 1 year ago

OK, for RSS readers, I suppose it's OK to output the full summary.

But I also find that some RSS readers do not update content even when the source has been updated. This happens a lot: when the upvotes of a news item rise to a certain threshold, I switch to ChatGPT to get a better summary.

For this problem, I added a [summary] link in the RSS content to take the user back to the webpage.
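
As an illustration only, an RSS item carrying such a back-link could be built as below. The `build_item` helper, the `item_id` parameter, and the `#<id>` anchor scheme are assumptions; the thread does not show the real feed template or how summaries are addressed on the page.

```python
import xml.etree.ElementTree as ET
from html import escape

SITE_URL = "http://hackernews.betacat.io/"  # site URL from the project header


def build_item(title: str, article_url: str, summary: str, item_id: str) -> ET.Element:
    """Build an RSS <item> whose description ends with a [summary] back-link.

    Hypothetical helper; the fragment `#<item_id>` is an assumed anchor scheme.
    """
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = article_url
    back_link = f"{SITE_URL}#{item_id}"
    # The description holds HTML; ElementTree escapes it when serializing.
    ET.SubElement(item, "description").text = (
        f'{escape(summary)} <a href="{escape(back_link)}">[summary]</a>'
    )
    return item


if __name__ == "__main__":
    item = build_item("Example story", "https://example.com/story", "Full summary text.", "42")
    print(ET.tostring(item, encoding="unicode"))
```

Keeping the full summary in the description and adding the link covers both cases raised here: readers that cache old content and readers that only show a preview.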

MstMoonshine commented 1 year ago

Great. The project's been a useful daily tool already.